docs(sdks): Add spec for dataCollection option to supersede sendDefaultPii#16796
This page does not exist anymore - we moved the PII related content to
https://develop.sentry.dev/sdk/foundations/data-scrubbing/
We should include some more general information, such as how we expect the default snippet to look now, which is very minimal and in line with the behavior of and also mention the reasoning behind this change. In particular, we should highlight that we now include more context by default, without changing our privacy-first position.
The following terms **MUST** be included in the default denylist for headers, and **SHOULD** be applied to cookies and query params where applicable:

`["auth", "token", "secret", "password", "passwd", "pwd", "key", "jwt", "bearer", "sso", "saml", "csrf", "xsrf", "credentials", "session", "sid", "identity"]`
m: We have some additional filtered headers on cocoa that may be relevant here (https://github.com/getsentry/sentry-cocoa/blob/main/Sources/Swift/Core/Tools/HTTPHeaderSanitizer.swift#L8): X-REAL-IP and REMOTE-ADDR
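To make the header denylist quoted above concrete, here is a minimal sketch of how an SDK might apply it. The matching rule (case-insensitive substring match on the header name) and the reuse of `"[Filtered]"` as the replacement for header values are assumptions for illustration; the helper names `is_sensitive` and `scrub_headers` are hypothetical and not part of any SDK.

```python
# Terms copied verbatim from the denylist quoted in the diff.
SENSITIVE_TERMS = [
    "auth", "token", "secret", "password", "passwd", "pwd", "key", "jwt",
    "bearer", "sso", "saml", "csrf", "xsrf", "credentials", "session",
    "sid", "identity",
]

# Assumption: headers use the same placeholder the spec names for cookies.
FILTERED = "[Filtered]"


def is_sensitive(name: str) -> bool:
    """Assumed rule: case-insensitive substring match against the denylist."""
    lowered = name.lower()
    return any(term in lowered for term in SENSITIVE_TERMS)


def scrub_headers(headers: dict[str, str]) -> dict[str, str]:
    """Return a copy of `headers` with values of sensitive names replaced."""
    return {
        name: (FILTERED if is_sensitive(name) else value)
        for name, value in headers.items()
    }
```

With this sketch, `Authorization` and `X-Api-Key` are filtered while `Accept` passes through. A substring rule is deliberately broad, so it can also filter harmless headers whose names happen to contain a term.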
- SDKs **SHOULD** maintain a default denylist of cookie names using the same matching rule (e.g. `session`, `auth`, `identity`). Values for matching cookie names **MUST** be replaced with `"[Filtered]"`.
- **When individual cookie key-value pairs cannot be extracted** (e.g. malformed or opaque cookie string), the entire `Cookie` or `Set-Cookie` header value **MUST** be replaced with `"[Filtered]"`. Unfiltered raw cookie header values **MUST NOT** be sent. When in doubt, treat the whole cookie header as sensitive.
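The two cookie rules above can be sketched as follows. This is an illustration under stated assumptions, not the SDK implementation: `COOKIE_DENYLIST` contains only the three example names from the spec text, the matching rule is assumed to be a case-insensitive substring match, and `scrub_cookie_header` is a hypothetical helper name.

```python
FILTERED = "[Filtered]"

# Hypothetical default denylist: only the example names given in the spec text.
COOKIE_DENYLIST = ("session", "auth", "identity")


def scrub_cookie_header(raw: str) -> str:
    """Filter sensitive cookie values; fall back to filtering everything."""
    pairs = []
    for part in raw.split(";"):
        part = part.strip()
        if not part:
            continue  # tolerate stray or trailing semicolons
        name, sep, value = part.partition("=")
        name = name.strip()
        if not sep or not name:
            # A segment is not a name=value pair: treat the whole header
            # as sensitive rather than sending any of it unfiltered.
            return FILTERED
        if any(term in name.lower() for term in COOKIE_DENYLIST):
            value = FILTERED
        pairs.append(f"{name}={value}")
    # Nothing could be extracted (e.g. empty string): filter everything.
    return "; ".join(pairs) if pairs else FILTERED
```

For example, `theme=dark; session_id=abc123` becomes `theme=dark; session_id=[Filtered]`, while an opaque string such as `opaque-blob` is replaced entirely. Because the sketch treats any segment without `=` as unparseable, a `Set-Cookie` value with bare attributes like `HttpOnly` is filtered as a whole, which errs on the conservative side.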
#### Request Bodies
m: Should the same apply to response bodies? These are (or will be, depending on the SDK) recorded now for Session Replay.
This is set in the Session Replay configuration; it may be worth aligning there.
### User-Set Data and Scrubbing

When the user **explicitly** sets data on the scope (user, request, response, tags, contexts, etc.) or on a span, log, or other telemetry, that data is **not** gated by `dataCollection`. It **MUST** always be attached to outgoing telemetry. The same applies to data the user provides via `beforeSend` or event processors.
Thanks for the clarification 👍
- User identifiers (user ID, username, email)
- IP address
- Cookies and headers that identify the user or session
- HTTP request data (TBD)
h: What about request paths?
Some request paths may be identifiable, like /user/USER_ID.
Should we have a denylist/allowlist for URL paths?
Good question 🤔
What we currently do in JS: when we know it's a parametrized route, we use the appropriate parametrized route name (e.g. user/:id) as the transaction name, but the full URL (e.g. user/123) is still added in the attributes. @cleptric Any opinions on that?
This PR extends the Data Collection spec so it is the single place for what SDKs collect and how they scrub it. It adds concrete denylist behavior, request-body and cookie rules, and pulls in the relevant scrubbing behavior from the Data Handling doc.
/sdk/foundations/client/data-collection/)