From 33a6fa6b885af5755e9e43ead28445c8bbe042f0 Mon Sep 17 00:00:00 2001
From: s1gr1d <32902192+s1gr1d@users.noreply.github.com>
Date: Wed, 4 Mar 2026 17:13:41 +0100
Subject: [PATCH 1/7] docs(sdks): Add docs on `dataCollection`
---
.../client/data-collection/index.mdx | 335 ++++++++++++++++++
.../sdk/foundations/data-scrubbing.mdx | 56 +--
2 files changed, 342 insertions(+), 49 deletions(-)
create mode 100644 develop-docs/sdk/foundations/client/data-collection/index.mdx
diff --git a/develop-docs/sdk/foundations/client/data-collection/index.mdx b/develop-docs/sdk/foundations/client/data-collection/index.mdx
new file mode 100644
index 0000000000000..76026ac4cf77f
--- /dev/null
+++ b/develop-docs/sdk/foundations/client/data-collection/index.mdx
@@ -0,0 +1,335 @@
+---
+title: Data Collection
+description: Configuration for what data SDKs collect by default — technical context, PII, and sensitive data.
+spec_id: sdk/foundations/client/data-collection
+spec_version: 1.0.0
+spec_status: candidate
+spec_depends_on:
+ - id: sdk/foundations/client
+ version: ">=1.0.0"
+spec_changelog:
+ - version: 1.0.0
+ date: 2025-03-05
+ summary: Initial spec; dataCollection config, three data tiers, cookies/headers denylist, replace sendDefaultPii.
+sidebar_order: 1
+---
+
+
+
+
+
+## Overview
+
+This spec defines how SDKs control **what data is collected automatically** from the runtime (device, requests, responses, user context). It replaces the single `sendDefaultPii` (or platform-equivalent) flag with a structured `dataCollection` configuration so users can enable or restrict collection by category and by field.
+
+Related specs:
+
+- [Data Handling](/sdk/expected-features/data-handling/) — structuring data for scrubbing (spans, breadcrumbs), variable size limits
+- [Client](/sdk/foundations/client/) — client lifecycle and event pipeline
+- [Configuration](/sdk/foundations/client/configuration/) — top-level init options including `send_default_pii` (deprecated in favor of this spec)
+
+---
+
+## Data Tiers
+
+Collected data is grouped into three tiers. SDKs **MUST** treat these tiers consistently when applying defaults and user configuration.
+
+
+
+### 1. Technical Context Data
+
+Non-identifying context used for debugging and performance:
+
+- Device and environment context (OS, runtime, non-PII identifiers)
+- Performance and error context (stack traces, breadcrumbs, span metadata)
+- Framework/routing context where it does not contain PII or secrets
+
+This tier is **not** gated by the data collection configuration. SDKs **MAY** collect it by default.
+
+### 2. PII Data
+
+Personally identifiable or user-linked data:
+
+- User identifiers (user ID, username, email)
+- IP address
+- Cookies and headers that identify the user or session
+- AI Agent input and output messages
+
+This tier **MUST** be off by default unless the user opts in via `includeUserInfo` and/or explicit `collect` allowlists. See [Include User Info](#include-user-info), [collect options](#collect-options), and [Default Denylist](#default-denylist).
+
+### 3. Sensitive Data
+
+Credentials and secrets that must never be sent by default:
+
+- Passwords, tokens, API keys, bearer tokens
+- Header or cookie values that match known sensitive names (auth, token, secret, password, key, jwt, etc.)
+
+SDKs **MUST** never send sensitive **values** through automatic instrumentation; keys are included by the SDK while values are replaced with `"[Filtered]"` (see [Default Denylist](#default-denylist)). Users can use `beforeSend` (or equivalent) to remove or redact keys if needed.
+
+
+
+---
+
+## Configuration Surface
+
+All data-collection options live under a single key: `dataCollection`.
+
+
+
+### Top-Level Shape
+
+At the top level, users **MAY** specify **`includeUserInfo`** and a **`collect`** record.
+
+- **`includeUserInfo`** is the primary toggle for Personally Identifiable Information (PII). It controls whether user-identity fields are included in or excluded from automatic collection, and sets the default for other PII-heavy options (such as `aiAgentMessages`).
+- **`collect`** controls which categories of request/response and runtime data are gathered (cookies, headers, body, query params, etc.); see the sections below.
+
+Users configure data collection via the init options. An example with the default options:
+
+```typescript
+init({
+ dsn: "...",
+ dataCollection: {
+ includeUserInfo: false,
+ collect: {
+ cookies: true,
+ httpHeaders: true,
+ queryParams: true,
+ stackTraceVariables: true,
+ incomingRequestBody: false,
+ outgoingRequestBody: false,
+ aiAgentMessages: false
+ },
+ },
+});
+```
+
+- SDKs **MUST** support at least `includeUserInfo` and the `collect` object. SDKs **MAY** omit options that do not apply to the platform (e.g. no `incomingRequestBody` on a browser-only SDK, which never receives incoming requests).
+
+For how `includeUserInfo` affects the defaults of collection options, see [How Defaults Cascade](#how-defaults-cascade).
+
+
+
+---
+
+## Option Reference
+
+
+
+ ### `includeUserInfo`
+
+ **Type:** Boolean option
+
+ Controls whether the SDK automatically attaches user identity fields to events (e.g. `user.id`, `user.email`, `user.username`, `user.ip_address`). This is the primary PII gate: its value sets the default for all other PII-heavy options, most notably `aiAgentMessages`.
+
+ | Value | Behavior |
+ |-------|----------|
+ | `true` | Attach all user identity fields captured by automatic instrumentation. Equivalent to the legacy `sendDefaultPii` flag scoped to user data. |
+ | `false` | Do not attach user identity fields from automatic instrumentation. |
+
+ - **Default:** `false`.
+ - When user data is set **explicitly** on the scope (or equivalent), it is **always** attached regardless of this setting. See [User-Set Data and Scrubbing](#user-set-data-scrubbing).
+
+
+
+---
+
+
+
+ ### `collect` Options
+
+ Each key under `collect` maps to a category of automatically collected data. Refer to the [Option Types](#option-types) section below to understand which value type each key accepts.
+
+ | Key | Option Type | Default | Description |
+ |-----|-------------|---------|-------------|
+ | `cookies` | Collection | `true` | Include cookie values; keys are filtered by the default denylist or by allow/deny lists. |
+ | `httpHeaders` | Collection | `true` | Include HTTP header values; keys are filtered by the default denylist or by allow/deny lists. |
+ | `queryParams` | Collection | `true` | Include URL query parameter values; keys are filtered by the default denylist or by allow/deny lists. |
+ | `stackTraceVariables` | Boolean | Inherits `includeUserInfo` | Include local variable values captured within stack traces. |
+ | `incomingRequestBody` | Boolean | Inherits `includeUserInfo` | Include full body of the incoming HTTP request. |
+ | `outgoingRequestBody` | Boolean | Inherits `includeUserInfo` | Include full body of outgoing HTTP requests. |
+ | `aiAgentMessages` | Boolean | Inherits `includeUserInfo` | Include AI agent input and output messages. |
+
+
+ Unlike cookies or headers, some data (e.g. request bodies) has no predictable key structure for the SDK to filter. Data can still be redacted in `beforeSend` or event processors if needed.
+
+
+
+
+
+ ### Option Types
+
+ Each option in `dataCollection.collect` uses one of two distinct value types. Which type an option accepts depends on whether the collected data is structured as named key-value pairs (e.g. cookies, headers) or as an opaque blob (e.g. request bodies).
+
+ ---
+
+ #### Boolean Options
+
+ Used for categories where data cannot be meaningfully filtered at the key level — the SDK either collects the entire category or skips it entirely.
+
+ | Value | Behavior |
+ |-------|----------|
+ | `true` | Collect and attach this data category. |
+ | `false` | Do not collect this data category at all. |
+
+ Examples: `incomingRequestBody`, `outgoingRequestBody`, `aiAgentMessages`.
+
+ ---
+
+ #### Collection Options
+
+ Used for categories structured as named key-value pairs, where the SDK can inspect individual keys and apply filtering rules before attaching data. In addition to the `true`/`false` toggle, these options accept an allow/deny object for fine-grained control.
+
+ | Value | Behavior |
+ |-------|----------|
+ | `true` | Collect this category. Apply the default denylist — values for sensitive key names (e.g. `authorization`, `cookie`, `token`) are replaced with `"[Filtered]"` (see [Default Denylist](#default-denylist)). |
+ | `false` | Do not collect this category at all. |
+ | `{ deny: string[] }` | Collect this category. Apply the default denylist **plus** these additional key names. |
+ | `{ allow: string[] }` | Collect **only** keys matching this list. The default denylist is replaced, but sensitive values **MUST** still be scrubbed regardless. |
+
+ > **Note:** Sensitive key **values** are always scrubbed — they are replaced with `"[Filtered]"` — regardless of how the collection option is configured. The allow/deny lists control which keys are included, not whether scrubbing applies.
+
+ Examples: `cookies`, `httpHeaders`, `queryParams`.
+
+
+
+
+
+
+ ### How Defaults Cascade
+
+ Because `includeUserInfo` acts as the main gate for PII, its value determines the defaults for other PII-heavy options. Explicitly set `collect` options always override these defaults.
+
+ | Option type | Default when `includeUserInfo: true` | Default when `includeUserInfo: false` |
+ |-------------|--------------------------------------|----------------------------------------|
+ | Collection (key-value pairs) | `true` — use default denylist | `true` — use default denylist, plus PII keys denied |
+ | Boolean | `true` — attach | `false` — do not attach |
+
+
+
+
+
+
+
+---
+
+## Default Denylist
+
+For key-value data (HTTP headers, cookies, URL query params), SDKs **MUST** apply a **default denylist** by key name: values for known-sensitive keys are replaced with `"[Filtered]"`; **keys are never scrubbed** by the SDK.
+
+
+
+### Matching Rule
+
+SDKs **MUST** perform a **partial, case-insensitive match** when comparing header names, cookie names, and query parameter names against the denylist. A key is treated as sensitive if any denylist term appears as a substring in the key (e.g. the term `auth` matches the header names `Authorization` and `X-Auth-Token`).
+
+### Base Denylist (Sensitive Data)
+
+The following terms **MUST** be included in the default denylist for headers (and **SHOULD** be used for cookies and query params where applicable). A key is sensitive if it **partially matches**, case-insensitively, any of:
+
+`["auth", "token", "secret", "password", "passwd", "pwd", "key", "jwt", "bearer", "sso", "saml", "csrf", "xsrf", "credentials", "session", "sid", "identity"]`
+
+Values for keys that match **MUST** be replaced with `"[Filtered]"`.
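The matching rule and base denylist above can be sketched as follows. This is a minimal TypeScript illustration under stated assumptions, not an SDK API: `isSensitiveKey` and `scrubValues` are hypothetical helper names, and only the term list and the `"[Filtered]"` placeholder come from this spec.

```typescript
// Base denylist terms, copied from the spec text above.
const BASE_DENYLIST: readonly string[] = [
  "auth", "token", "secret", "password", "passwd", "pwd", "key", "jwt",
  "bearer", "sso", "saml", "csrf", "xsrf", "credentials", "session", "sid",
  "identity",
];

const FILTERED = "[Filtered]";

// Hypothetical helper: partial, case-insensitive match of a key name
// against the denylist terms (a term only has to appear as a substring).
function isSensitiveKey(
  key: string,
  terms: readonly string[] = BASE_DENYLIST,
): boolean {
  const lowered = key.toLowerCase();
  return terms.some((term) => lowered.includes(term.toLowerCase()));
}

// Hypothetical helper: keeps every key and replaces only the values
// of keys that match the denylist.
function scrubValues(
  data: Record<string, string>,
  terms: readonly string[] = BASE_DENYLIST,
): Record<string, string> {
  const result: Record<string, string> = {};
  for (const [key, value] of Object.entries(data)) {
    result[key] = isSensitiveKey(key, terms) ? FILTERED : value;
  }
  return result;
}
```

Note that substring matching is broad: a key such as `monkey` also matches the term `key`, so its value would be filtered as well.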
+
+### PII Denylist (when `includeUserInfo` is `false`)
+
+When `includeUserInfo` is `false`, SDKs **MUST** apply the base denylist **and** treat the following as sensitive (in addition to the list above):
+
+- Any data that contains the following: email, user ID, IP address, username, machine name (if applicable)
+- Any header or key containing **`x-forwarded-`** (e.g. `x-forwarded-for`, `x-forwarded-host`) — often carries client IP or host.
+- Any header or key whose name ends with or contains **`-user`** (e.g. `x-user-id`, `remote-user`) — often carries user identifiers.
+
+So the effective denylist when PII is disabled is: base list + `["x-forwarded-", "-user"]` (substring match, case-insensitive).
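As a rough illustration of how the effective denylist could be assembled, the sketch below combines the base terms, the PII terms, and any user-supplied `deny` entries. The function name `effectiveDenylist` and its parameters are assumptions made for this example and are not defined by the spec.

```typescript
// Base denylist terms, copied from the spec text above.
const BASE_DENYLIST: readonly string[] = [
  "auth", "token", "secret", "password", "passwd", "pwd", "key", "jwt",
  "bearer", "sso", "saml", "csrf", "xsrf", "credentials", "session", "sid",
  "identity",
];

// Additional terms applied only while includeUserInfo is false.
const PII_DENYLIST: readonly string[] = ["x-forwarded-", "-user"];

// Hypothetical helper: returns the terms to match against (substring,
// case-insensitive) for a given configuration.
function effectiveDenylist(
  includeUserInfo: boolean,
  extraDeny: readonly string[] = [], // e.g. the `deny` list of a collect option
): string[] {
  const piiTerms = includeUserInfo ? [] : PII_DENYLIST;
  return [...BASE_DENYLIST, ...piiTerms, ...extraDeny];
}
```

With `includeUserInfo: false`, a header such as `x-forwarded-for` would match the added `x-forwarded-` term; with `includeUserInfo: true`, only the base terms and any extra `deny` entries apply.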
+
+### Cookies and Cookie Headers
+- SDKs **SHOULD** maintain a default denylist of cookie names using the same matching rule (e.g. `session`, `auth`, `identity`). Values for matching cookie names **MUST** be replaced with `"[Filtered]"`.
+- **When individual cookie key-value pairs cannot be extracted** (e.g. malformed or opaque cookie header), the entire `Cookie` or `Set-Cookie` header value **MUST** be replaced with `"[Filtered]"`. Unfiltered raw cookie header values **MUST NOT** be sent. When in doubt, treat the whole cookie header as sensitive.
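The per-cookie filtering and the whole-header fallback described above might look roughly like this. It is a simplified sketch: `scrubCookieHeader` is a hypothetical name, the parser only understands `name=value` pairs separated by `;`, and a real SDK would likely rely on its platform's cookie parser instead.

```typescript
// Base denylist terms, copied from the spec text above.
const DENYLIST: readonly string[] = [
  "auth", "token", "secret", "password", "passwd", "pwd", "key", "jwt",
  "bearer", "sso", "saml", "csrf", "xsrf", "credentials", "session", "sid",
  "identity",
];

const FILTERED = "[Filtered]";

// Hypothetical helper: returns scrubbed cookie pairs, or "[Filtered]" for
// the whole header when any part cannot be read as a name=value pair.
function scrubCookieHeader(
  header: string,
): Record<string, string> | typeof FILTERED {
  const result: Record<string, string> = {};
  for (const part of header.split(";")) {
    const trimmed = part.trim();
    if (trimmed === "") continue; // tolerate a trailing ";"
    const eq = trimmed.indexOf("=");
    if (eq <= 0) return FILTERED; // no usable name: treat the header as opaque
    const name = trimmed.slice(0, eq).trim();
    const value = trimmed.slice(eq + 1).trim();
    const lowered = name.toLowerCase();
    const sensitive = DENYLIST.some((term) => lowered.includes(term));
    result[name] = sensitive ? FILTERED : value;
  }
  // An empty header yields no pairs; when in doubt, filter everything.
  return Object.keys(result).length > 0 ? result : FILTERED;
}
```

For a header like `user_session=abc123; theme=dark-mode`, this keeps both cookie names, filters the session value, and leaves the theme value untouched.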
+
+### Request Bodies
+
+When request or response bodies are collected (`incomingRequestBody` / `outgoingRequestBody`):
+
+- **Parseable as JSON or form data:** SDKs **MAY** extract key-value pairs and apply the same denylist rules (partial, case-insensitive match) to keys. Values for matching keys **MUST** be replaced with `"[Filtered]"`. This allows selective scrubbing while retaining non-sensitive fields for debugging.
+- **Raw bodies (not parseable as JSON or FormData):** The body **MUST** be removed and **MUST NOT** be attached to the event. "Raw" HTTP bodies (e.g. binary payloads, plain text, or unparseable content) are never sent through automatic instrumentation. When the SDK cannot parse the body into key-value structure, the entire body **MUST** be replaced with `"[Filtered]"`.
+
+No built-in option scrubs **keys**; users who need to hide header or cookie names **MUST** use `beforeSend` (or equivalent).
+
+
+
+---
+
+## Use Cases
+
+The following examples show how `dataCollection` maps to common configurations.
+
+
+
+### Maximum PII (full collection)
+
+When the user enables full PII collection:
+
+- `includeUserInfo: true`
+- `collect`: all collection options `true` (default denylist); `incomingRequestBody`, `outgoingRequestBody`, and `aiAgentMessages` are `true`.
+
+**Result:** Technical context and request/response data (headers, cookies, query params) are collected with the default denylist; request bodies, user identifiers, and AI agent messages are included; sensitive values are still replaced with `"[Filtered]"`.
+
+
+
+
+
+### Granular Debugging
+
+The user wants to include user info and only specific headers for debugging, and does not want to send query params at all:
+
+```typescript
+init({
+ dsn: "...",
+ dataCollection: {
+ includeUserInfo: true,
+ collect: {
+ httpHeaders: { allow: ['x-request-id', 'x-trace-id', 'x-correlation-id'] },
+ queryParams: false,
+ },
+ },
+});
+```
+
+Because `includeUserInfo` is `true`, `aiAgentMessages` defaults to `true` unless the user explicitly sets `collect: { aiAgentMessages: false }`.
+
+
+
+
+
+### Migration from sendDefaultPii
+
+- **`sendDefaultPii: true`** (legacy) → `dataCollection: { includeUserInfo: true }` and keep `collect` defaults.
+- **`sendDefaultPii: false`** (legacy) → `dataCollection: { includeUserInfo: false }` (or omit; same as default).
+
+SDKs **SHOULD** document this mapping and **MAY** implement `send_default_pii` as a compatibility shim that sets `includeUserInfo`.
+
+
+
+---
+
+## User-Set Data and Scrubbing
+
+
+
+### Data set by the user
+
+When the user **explicitly** sets data on the scope (user, request, response, tags, contexts, etc.) or on a span, log, or other telemetry, that data is **not** gated by `dataCollection`. It **MUST** always be attached to outgoing telemetry. The same applies to data the user provides via `beforeSend` or event processors (e.g. attaching a request object).
+
+### Automatic vs explicit data
+
+SDKs **SHOULD** only replace sensitive values with `"[Filtered]"` when the data is gathered **automatically** through instrumentation. If the user explicitly provides data (e.g. by setting a request object on the scope), the SDK **MUST NOT** modify it; the user is responsible for what they attach.
+
+### beforeSend and event processors
+
+Users can register callbacks (e.g. `beforeSend`, event processors) to remove or redact any data — including keys — before events are sent. This spec does not replace those hooks; they remain the mechanism for custom filtering and key removal.
+
+
+
+---
+
+## Changelog
+
+
diff --git a/develop-docs/sdk/foundations/data-scrubbing.mdx b/develop-docs/sdk/foundations/data-scrubbing.mdx
index 103fa9261c55c..3a88210f945f8 100644
--- a/develop-docs/sdk/foundations/data-scrubbing.mdx
+++ b/develop-docs/sdk/foundations/data-scrubbing.mdx
@@ -3,63 +3,21 @@ title: Data Scrubbing
sidebar_order: 6
---
-Data handling is the standardized context in how we want SDKs help users filter data.
-
-## Sensitive Data
-
-SDKs should not include PII or other sensitive data in the payload by default.
-When building an SDK we can come across some API that can give useful information to debug a problem.
-In the event that API returns data considered PII, we guard that behind a flag called _Send Default PII_.
-This is an option in the SDK called [_send-default-pii_](https://docs.sentry.io/platforms/python/configuration/options/#send-default-pii)
-and is **disabled by default**. That means that data that is naturally sensitive is not sent by default.
+Data handling is the standardized context in how we want SDKs to help users filter data.
-When a user manually sets the data on the scope (user, contexts, tags, data, request, response, etc.), this data should not be gated by the _Send Default PII_ flag and should always be attached to all outgoing telemetry. This also applies to the data that the user manually sets on a span, log, metric and other types of telemetry (directly or, for example, via `BeforeSend`).
+**Data collection and scrubbing:** The canonical spec for what data SDKs collect, default denylists (headers, cookies, query params), request body and cookie scrubbing, user-set data, and `beforeSend` is [Data Collection](/sdk/foundations/client/data-collection/). That spec supersedes the sensitive-data and cookie sections below for SDK behavior. This page retains **Structuring Data** and **Variable Size** and the legacy `send_default_pii` context for reference.
-Certain sensitive data must never be sent through SDK instrumentation, regardless of any configuration:
-
-- HTTP Headers: The keys of known sensitive headers are added, while their values must be replaced with `"[Filtered]"`.
- - The SDK performs a **partial, case-insensitive match** against the following headers to determine if they are sensitive: `["auth", "token", "secret", "password", "passwd", "pwd", "key", "jwt", "bearer", "sso", "saml", "csrf", "xsrf", "credentials"]`
-
-SDKs should only replace sensitive data with `"[Filtered]"` when the data is gathered automatically through instrumentation.
-If a user explicitly provides data (for example, by setting a request object on the scope), the SDK must not modify it.
-
-Some examples of data guarded by `send_default_pii: false`:
-
-- When attaching data of HTTP requests and/or responses to events
- - Request Body: "raw" HTTP bodies (bodies which cannot be parsed as JSON or FormData) are removed
- - HTTP Headers: header values, containing information about the user are replaced with `"[Filtered]"`
-- User-specific information (e.g. the current user ID according to the used web-framework) is not collected and therefore not sent at all.
-- On desktop applications
- - The username logged in the device is not included. This is often a person's name.
- - The machine name is not included, for example `Bruno's laptop`
-- SDKs don't set `{{auto}}` as `user.ip_address`. This instructs the server to keep the connection's IP address.
-- Server SDKs remove the IP address of incoming HTTP requests.
-
-Sentry server is always aware of the connecting IP address and can use it for logging in some platforms. Namely JavaScript and iOS/macOS/tvOS.
-All other platforms require the event to include `user.ip_address={{auto}}` which happens if `sendDefaultPii` is set to true.
-
-Before sending events to Sentry, the SDKs should invokes callbacks. That allows users to remove any sensitive data client-side.
-
-- [`before-send` and `event-processors`](/sdk/foundations/client/#event-pipeline) can be used to register a callback with custom logic to remove sensitive data.
-
-### Cookies
-
-Since `Cookie` and `Set-Cookie` headers can contain a mix of sensitive and non-sensitive data, SDKs should parse the cookie header and filter values on a per-key basis, depending on the SDK setting and the sensitivity of the cookie value.
-In case, the SDK cannot parse each cookie key-value pair, the entire cookie header must be replaced with `"[Filtered]"`. An unfiltered, raw cookie header value must never be sent.
-
-This selective filtering prevents capturing sensitive data while retaining harmless contextual information for debugging.
-For example, a sensitive session cookie's value is replaced with "[Filtered]", but a non-sensitive cookie for the theme preference can be sent as-is.
+## Sensitive Data
-When attached as span attributes, the results should be as follows:
+The normative rules for sensitive data, PII, cookies, request bodies, and user-set data are in [Data Collection](/sdk/foundations/client/data-collection/). The following is kept for context:
-- `http.request.header.cookie.user_session: "[Filtered]"`
-- `http.request.header.cookie.theme: "dark-mode"`
-- `http.request.header.set_cookie.theme: "light-mode"`
-- `http.request.header.cookie: "[Filtered]"` (Used as a fallback if the cookie header cannot be parsed)
+- SDKs should not include PII or other sensitive data in the payload by default. The legacy option [_send-default-pii_](https://docs.sentry.io/platforms/python/configuration/options/#send-default-pii) is **disabled by default**; the replacement is `dataCollection.includeUserInfo` and `dataCollection.collect` (see [Data Collection](/sdk/foundations/client/data-collection/)).
+- Certain sensitive data must never be sent through SDK instrumentation: header/cookie/query values matching the default denylist are replaced with `"[Filtered]"`. User-set data is always attached; only automatically gathered data is scrubbed. Users can use `beforeSend` / event processors to remove or redact any data.
+- For the exact default denylist (partial, case-insensitive match), PII denylist (`x-forwarded-`, `-user`), cookies when unparseable, and raw request bodies, see [Data Collection — Default Denylist](/sdk/foundations/client/data-collection/#default-denylist) and [User-Set Data and Scrubbing](/sdk/foundations/client/data-collection/#user-set-data-scrubbing).
### Application State
From 07f843eb7dd6926d39aa15c7263151a775150266 Mon Sep 17 00:00:00 2001
From: s1gr1d <32902192+s1gr1d@users.noreply.github.com>
Date: Thu, 5 Mar 2026 14:35:47 +0100
Subject: [PATCH 2/7] fix typo
---
develop-docs/sdk/foundations/client/data-collection/index.mdx | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/develop-docs/sdk/foundations/client/data-collection/index.mdx b/develop-docs/sdk/foundations/client/data-collection/index.mdx
index 76026ac4cf77f..6239f96506361 100644
--- a/develop-docs/sdk/foundations/client/data-collection/index.mdx
+++ b/develop-docs/sdk/foundations/client/data-collection/index.mdx
@@ -94,7 +94,7 @@ init({
cookies: true,
httpHeaders: true,
queryParams: true,
- stackTraceVariables: true,
+ stackTraceVariables: false,
incomingRequestBody: false,
outgoingRequestBody: false,
aiAgentMessages: false
@@ -249,7 +249,7 @@ So the effective denylist when PII is disabled is: base list + `["x-forwarded-",
When request or response bodies are collected (`incomingRequestBody` / `outgoingRequestBody`):
- **Parseable as JSON or form data:** SDKs **MAY** extract key-value pairs and apply the same denylist rules (partial, case-insensitive match) to keys. Values for matching keys **MUST** be replaced with `"[Filtered]"`. This allows selective scrubbing while retaining non-sensitive fields for debugging.
-- **Raw bodies (not parseable as JSON or FormData):** The body **MUST** be removed and **MUST NOT** be attached to the event. "Raw" HTTP bodies (e.g. binary payloads, plain text, or unparseable content) are never sent through automatic instrumentation. When the SDK cannot parse the body into key-value structure, the entire body **MUST** be replaced with `"[Filtered]"`.
+- **Raw bodies (not parseable as JSON or FormData):** The body **MUST** be removed and **MUST NOT** be attached to the event. "Raw" HTTP bodies (e.g. binary payloads, plain text, or unparsable content) are never sent through automatic instrumentation. When the SDK cannot parse the body into key-value structure, the entire body **MUST** be replaced with `"[Filtered]"`.
No built-in option scrubs **keys**; users who need to hide header or cookie names **MUST** use `beforeSend` (or equivalent).
From c36a5d8cbd1b5d8def72da661cea58f3c3a733c9 Mon Sep 17 00:00:00 2001
From: s1gr1d <32902192+s1gr1d@users.noreply.github.com>
Date: Fri, 6 Mar 2026 14:57:01 +0100
Subject: [PATCH 3/7] review suggestions
---
.../client/data-collection/index.mdx | 42 ++++++++++---------
1 file changed, 22 insertions(+), 20 deletions(-)
diff --git a/develop-docs/sdk/foundations/client/data-collection/index.mdx b/develop-docs/sdk/foundations/client/data-collection/index.mdx
index 6239f96506361..3febaba7ca0b7 100644
--- a/develop-docs/sdk/foundations/client/data-collection/index.mdx
+++ b/develop-docs/sdk/foundations/client/data-collection/index.mdx
@@ -3,7 +3,7 @@ title: Data Collection
description: Configuration for what data SDKs collect by default — technical context, PII, and sensitive data.
spec_id: sdk/foundations/client/data-collection
spec_version: 1.0.0
-spec_status: candidate
+spec_status: draft
spec_depends_on:
- id: sdk/foundations/client
version: ">=1.0.0"
@@ -34,14 +34,14 @@ Related specs:
Collected data is grouped into three tiers. SDKs **MUST** treat these tiers consistently when applying defaults and user configuration.
-
+
### 1. Technical Context Data
Non-identifying context used for debugging and performance:
- Device and environment context (OS, runtime, non-PII identifiers)
-- Performance and error context (stack traces, breadcrumbs, span metadata)
+- Performance and error context (stack frames, breadcrumbs, span metadata)
- Framework/routing context where it does not contain PII or secrets
This tier is **not** gated by the data collection configuration. SDKs **MAY** collect it by default.
@@ -74,7 +74,7 @@ SDKs **MUST** never send sensitive **values** through automatic instrumentation;
All data-collection options live under a single key: `dataCollection`.
-
+
### Top-Level Shape
@@ -94,10 +94,11 @@ init({
cookies: true,
httpHeaders: true,
queryParams: true,
- stackTraceVariables: false,
incomingRequestBody: false,
outgoingRequestBody: false,
- aiAgentMessages: false
+ aiAgentMessages: true,
+ stackFrameVariables: true,
+ frameContextLines: 5,
},
},
});
@@ -113,7 +114,7 @@ For how `includeUserInfo` affects the defaults of collection options, see [How D
## Option Reference
-
+
### `includeUserInfo`
@@ -133,7 +134,7 @@ For how `includeUserInfo` affects the defaults of collection options, see [How D
---
-
+
### `collect` Options
@@ -144,17 +145,18 @@ For how `includeUserInfo` affects the defaults of collection options, see [How D
| `cookies` | Collection | `true` | Include cookie values; keys are filtered by the default denylist or by allow/deny lists. |
| `httpHeaders` | Collection | `true` | Include HTTP header values; keys are filtered by the default denylist or by allow/deny lists. |
| `queryParams` | Collection | `true` | Include URL query parameter values; keys are filtered by the default denylist or by allow/deny lists. |
- | `stackTraceVariables` | Boolean | Inherits `includeUserInfo` | Include local variable values captured within stack traces. |
- | `incomingRequestBody` | Boolean | Inherits `includeUserInfo` | Include full body of the incoming HTTP request. |
- | `outgoingRequestBody` | Boolean | Inherits `includeUserInfo` | Include full body of outgoing HTTP requests. |
- | `aiAgentMessages` | Boolean | Inherits `includeUserInfo` | Include AI agent input and output messages. |
+ | `incomingRequestBody` | Boolean | Default TBD | Include full body of the incoming HTTP request. |
+ | `outgoingRequestBody` | Boolean | Default TBD | Include full body of outgoing HTTP requests. |
+ | `aiAgentMessages` | Boolean | `true` | Include AI agent input and output messages. |
+ | `stackFrameVariables` | Boolean | `true` | Include local variable values captured within stack frames. |
+ | `frameContextLines` | Number (Boolean, if not otherwise possible) | `5` (`true`) | (Number of) lines of context to include around stack frames. |
Unlike cookies or headers, some data (e.g. request bodies) has no predictable key structure for the SDK to filter. Data can still be redacted in `beforeSend` or event processors if needed.
-
+
### Option Types
@@ -193,11 +195,11 @@ For how `includeUserInfo` affects the defaults of collection options, see [How D
-
+
### How Defaults Cascade
- Because `includeUserInfo` acts as the main gate for PII, its value determines the defaults for other PII-heavy options. Explicitly set `collect` options always override these defaults.
+ Because `includeUserInfo` acts as the main gate for PII, its value determines the default denylist for the `collect` options. Explicitly set `collect` options always override this default.
| Option type | Default when `includeUserInfo: true` | Default when `includeUserInfo: false` |
|-------------|--------------------------------------|----------------------------------------|
@@ -216,7 +218,7 @@ For how `includeUserInfo` affects the defaults of collection options, see [How D
For key-value data (HTTP headers, cookies, URL query params), SDKs **MUST** apply a **default denylist** by key name: values for known-sensitive keys are replaced with `"[Filtered]"`; **keys are never scrubbed** by the SDK.
-
+
### Matching Rule
@@ -261,7 +263,7 @@ No built-in option scrubs **keys**; users who need to hide header or cookie name
The following examples show how `dataCollection` maps to common configurations.
-
+
### Maximum PII (full collection)
@@ -274,7 +276,7 @@ When the user enables full PII collection:
-
+
### Granular Debugging
@@ -297,7 +299,7 @@ Because `includeUserInfo` is set, `aiAgentMessages` defaults to `true` unless th
-
+
### Migration from sendDefaultPii
@@ -312,7 +314,7 @@ SDKs **SHOULD** document this mapping and **MAY** implement `send_default_pii` a
## User-Set Data and Scrubbing
-
+
### Data set by the user
From 78d08958c85a0f285481a9c06e54221bcccbf989 Mon Sep 17 00:00:00 2001
From: s1gr1d <32902192+s1gr1d@users.noreply.github.com>
Date: Fri, 6 Mar 2026 15:24:47 +0100
Subject: [PATCH 4/7] create technical spec (skill)
---
.../client/data-collection/index.mdx | 332 +++++++++---------
1 file changed, 160 insertions(+), 172 deletions(-)
diff --git a/develop-docs/sdk/foundations/client/data-collection/index.mdx b/develop-docs/sdk/foundations/client/data-collection/index.mdx
index 3febaba7ca0b7..8cc3c55938e61 100644
--- a/develop-docs/sdk/foundations/client/data-collection/index.mdx
+++ b/develop-docs/sdk/foundations/client/data-collection/index.mdx
@@ -30,13 +30,15 @@ Related specs:
---
-## Data Tiers
-
-Collected data is grouped into three tiers. SDKs **MUST** treat these tiers consistently when applying defaults and user configuration.
+## Concepts
-### 1. Technical Context Data
+### Data Tiers
+
+Collected data is grouped into three tiers. SDKs **MUST** treat these tiers consistently when applying defaults and user configuration.
+
+#### 1. Technical Context Data
Non-identifying context used for debugging and performance:
@@ -46,241 +48,255 @@ Non-identifying context used for debugging and performance:
This tier is **not** gated by the data collection configuration. SDKs **MAY** collect it by default.
-### 2. PII Data
+#### 2. PII Data
Personally identifiable or user-linked data:
- User identifiers (user ID, username, email)
- IP address
- Cookies and headers that identify the user or session
-- AI Agent input and output messages
+- AI agent input and output messages
-This tier **MUST** be off by default unless the user opts in via `includeUserInfo` and/or explicit `collect` allowlists. See [Include User Info](#include-user-info), [collect options](#collect-options), and [Default Denylist](#default-denylist).
+This tier **MUST** be off by default unless the user opts in via `includeUserInfo` and/or explicit `collect` allowlists. See [`includeUserInfo`](#include-user-info-behavior), [`collect` options](#collect-option-behavior), and [Default Denylist](#default-denylist).
-### 3. Sensitive Data
+#### 3. Sensitive Data
-Credentials and secrets that must never be sent by default:
+Credentials and secrets that **MUST NOT** be sent by default:
- Passwords, tokens, API keys, bearer tokens
- Header or cookie values that match known sensitive names (auth, token, secret, password, key, jwt, etc.)
-SDKs **MUST** never send sensitive **values** through automatic instrumentation; keys are included by the SDK while values are replaced with `"[Filtered]"` (see [Default Denylist](#default-denylist)). Users can use `beforeSend` (or equivalent) to remove or redact keys if needed.
+SDKs **MUST NOT** send sensitive **values** through automatic instrumentation: values are replaced with `"[Filtered]"` while keys are retained (see [Default Denylist](#default-denylist)). Users can use `beforeSend` (or equivalent) to remove or redact keys if needed.
---
-## Configuration Surface
-
-All data-collection options live under a single key: `dataCollection`.
+## Behavior
-### Top-Level Shape
+### Configuration Requirements
-At the top level, users **MAY** specify **`includeUserInfo`** and a **`collect`** record.
+All data-collection options live under a single top-level key: `dataCollection`. SDKs **MUST** support at least `includeUserInfo` and the `collect` object. SDKs **MAY** omit options that do not apply to the platform (e.g. no `incomingRequestBody` on a browser-only SDK).
-- **`includeUserInfo`** is the primary toggle for Personally Identifiable Information (PII). It controls whether user-identity fields are included in or excluded from automatic collection, and sets the default for other PII-heavy options (such as `aiAgentMessages`).
-- **`collect`** controls which categories of request/response and runtime data are gathered (cookies, headers, body, query params, etc.); see the sections below.
+`dataCollection` accepts two fields:
-Users configure data collection via the init options. An example with the default options:
-
-```typescript
-init({
- dsn: "...",
- dataCollection: {
- includeUserInfo: false,
- collect: {
- cookies: true,
- httpHeaders: true,
- queryParams: true,
- incomingRequestBody: false,
- outgoingRequestBody: false,
- aiAgentMessages: true
- stackFrameVariables: true,
- frameContextLines: 5,
- },
- },
-});
-```
-
-- SDKs **MUST** support at least `includeUserInfo` and the `collect` object. SDKs **MAY** omit options that do not apply to the platform (e.g. no `outgoingRequestBody` on a backend-only SDK).
-
-For how `includeUserInfo` affects the defaults of collection options, see [How Defaults Cascade](#how-defaults-cascade).
+- **`includeUserInfo`** — the primary toggle for Personally Identifiable Information (PII). Controls whether user-identity fields are included in automatic collection, and sets the default for PII-heavy `collect` options (such as HTTP request bodies - TBD). Defaults to `false`.
+- **`collect`** — controls which categories of request/response and runtime data are gathered. See [`collect` Option Behavior](#collect-option-behavior) and [How Defaults Cascade](#how-defaults-cascade).
----
-
-## Option Reference
-
- ### `includeUserInfo`
+### `includeUserInfo` Behavior
- **Type:** Boolean option
+`includeUserInfo` controls whether the SDK automatically attaches user identity fields to events (e.g. `user.id`, `user.email`, `user.username`, `user.ip_address`). This is the primary PII gate: its value also sets the effective default for PII-heavy `collect` options.
- Controls whether the SDK automatically attaches user identity fields to events (e.g. `user.id`, `user.email`, `user.username`, `user.ip_address`). This is the primary PII gate: its value sets the default for all other PII-heavy options, most notably `aiAgentMessages`.
+| Value | Behavior |
+|-------|----------|
+| `true` | Attach all user identity fields captured by automatic instrumentation. Equivalent to the legacy `sendDefaultPii` flag scoped to user data. |
+| `false` | Do not attach user identity fields from automatic instrumentation. |
- | Value | Behavior |
- |-------|----------|
- | `true` | Attach all user identity fields captured by automatic instrumentation. Equivalent to the legacy `sendDefaultPii` flag scoped to user data. |
- | `false` | Do not attach user identity fields from automatic instrumentation. |
-
- - **Default:** `false`.
- - When user data is set **explicitly** on the scope (or equivalent), it is **always** attached regardless of this setting. See [User-Set Data and Scrubbing](#user-set-data-scrubbing).
+When user data is set **explicitly** on the scope (or equivalent), it is **always** attached regardless of this setting. See [User-Set Data and Scrubbing](#user-set-data-and-scrubbing).
----
-
- ### `collect` Options
+### `collect` Option Behavior
- Each key under `collect` maps to a category of automatically collected data. Refer to the [Option Types](#option-types) section below to understand which value type each key accepts.
+Each key under `collect` maps to a category of automatically collected data and uses one of two option types, depending on whether the data is structured as key-value pairs.
- | Key | Option Type | Default | Description |
- |-----|-------------|---------|-------------|
- | `cookies` | Collection | `true` | Include cookie values; keys are filtered by the default denylist or by allow/deny lists. |
- | `httpHeaders` | Collection | `true` | Include HTTP header values; keys are filtered by the default denylist or by allow/deny lists. |
- | `queryParams` | Collection | `true` | Include URL query parameter values; keys are filtered by the default denylist or by allow/deny lists. |
- | `incomingRequestBody` | Boolean | Default TBD | Include full body of the incoming HTTP request. |
- | `outgoingRequestBody` | Boolean | Default TBD | Include full body of outgoing HTTP requests. |
- | `aiAgentMessages` | Boolean | `true` | Include AI agent input and output messages. |
- | `stackFrameVariables` | Boolean | `true` | Include local variable values captured within stack frames. |
- | `frameContextLines` | Number (Boolean, if not otherwise possible) | `5` (`true`) | (Number of) lines of context to include around stack frames. |
+**Boolean options** — used where data cannot be meaningfully filtered at the key level. The SDK either collects the entire category or skips it.
-
- Unlike cookies or headers, some data (e.g. request bodies) has no predictable key structure for the SDK to filter. Data can still be redacted in `beforeSend` or event processors if needed.
-
+| Value | Behavior |
+|-------|----------|
+| `true` | Collect and attach this data category. |
+| `false` | Do not collect this data category at all. |
+**Collection options** — used for key-value data (cookies, headers, query params), where the SDK can inspect individual keys and apply filtering rules before attaching.
-
+| Value | Behavior |
+|-------|----------|
+| `true` | Collect this category. Apply the default denylist — values for sensitive key names are replaced with `"[Filtered]"` (see [Default Denylist](#default-denylist)). |
+| `false` | Do not collect this category at all. |
+| `{ deny: string[] }` | Collect this category. Apply the default denylist **plus** these additional key names. |
+| `{ allow: string[] }` | Collect **only** keys in this list. The default denylist is bypassed, but sensitive values **MUST** still be scrubbed regardless. |
- ### Option Types
+> **Note:** Sensitive key **values** are always scrubbed — replaced with `"[Filtered]"` — regardless of collection option configuration. The allow/deny lists control which keys are included, not whether scrubbing applies.
- Each option in `dataCollection.collect` uses one of two distinct value types. Which type an option accepts depends on whether the collected data is structured as named key-value pairs (e.g. cookies, headers) or as an opaque blob (e.g. request bodies).
+
- ---
+
- #### Boolean Options
+### How Defaults Cascade
- Used for categories where data cannot be meaningfully filtered at the key level — the SDK either collects the entire category or skips it entirely.
+`includeUserInfo` determines the effective default for PII-related `collect` options. Explicitly set `collect` options always override this default.
- | Value | Behavior |
- |-------|----------|
- | `true` | Collect and attach this data category. |
- | `false` | Do not collect this data category at all. |
+| Option type | Default when `includeUserInfo: true` | Default when `includeUserInfo: false` |
+|-------------|--------------------------------------|----------------------------------------|
+| Collection (key-value pairs) | `true` — use default denylist | `true` — use default denylist, plus PII keys denied |
- Examples: `incomingRequestBody`, `outgoingRequestBody`, `aiAgentMessages`.
+Non-PII boolean options (e.g. `stackFrameVariables`) are not affected by `includeUserInfo` and always default to their configured value.
- ---
+
- #### Collection Options
+
- Used for categories structured as named key-value pairs, where the SDK can inspect individual keys and apply filtering rules before attaching data. In addition to the `true`/`false` toggle, these options accept an allow/deny object for fine-grained control.
+### Default Denylist
- | Value | Behavior |
- |-------|----------|
- | `true` | Collect this category. Apply the default denylist — values for sensitive key names (e.g. `authorization`, `cookie`, `token`) are replaced with `"[Filtered]"` (see [Default Denylist](#default-denylist)). |
- | `false` | Do not collect this category at all. |
- | `{ deny: string[] }` | Collect this category. Apply the default denylist **plus** these additional key names. |
- | `{ allow: string[] }` | Collect **only** keys matching this list. The default denylist is replaced, but sensitive values **MUST** still be scrubbed regardless. |
+For key-value data (HTTP headers, cookies, URL query params), SDKs **MUST** apply a **default denylist** by key name: values for known-sensitive keys are replaced with `"[Filtered]"`; **keys are never scrubbed** by the SDK.
- > **Note:** Sensitive key **values** are always scrubbed — they are replaced with `"[Filtered]"` — regardless of how the collection option is configured. The allow/deny lists control which keys are included, not whether scrubbing applies.
+#### Matching Rule
- Examples: `cookies`, `httpHeaders`, `queryParams`.
+SDKs **MUST** perform a **partial, case-insensitive match** when comparing key names against the denylist. A key is treated as sensitive if any denylist term appears as a substring in the key name (e.g. the term `auth` matches `Authorization` and `X-Auth-Token`).
-
+#### Base Denylist (Sensitive Data)
+The following terms **MUST** be included in the default denylist for headers, and **SHOULD** be applied to cookies and query params where applicable:
-
+`["auth", "token", "secret", "password", "passwd", "pwd", "key", "jwt", "bearer", "sso", "saml", "csrf", "xsrf", "credentials", "session", "sid", "identity"]`
- ### How Defaults Cascade
+Values for keys that match **MUST** be replaced with `"[Filtered]"`.
- Because `includeUserInfo` acts as the main gate for PII, its value determines the default denylist for the `collect` options. Explicitly set `collect` options always override this default.
+#### PII Denylist (when `includeUserInfo` is `false`)
- | Option type | Default when `includeUserInfo: true` | Default when `includeUserInfo: false` |
- |-------------|--------------------------------------|----------------------------------------|
- | Collection (key-value pairs) | `true` — use default denylist | `true` — use default denylist, plus PII keys denied |
- | Boolean | `true` — attach | `false` — do not attach |
+When `includeUserInfo` is `false`, SDKs **MUST** apply the base denylist **and** additionally treat the following as sensitive:
-
+- Any data that contains email, user ID, IP address, username, or machine name (if applicable)
+- Any key containing **`x-forwarded-`** (e.g. `x-forwarded-for`, `x-forwarded-host`) — often carries client IP or host
+- Any key ending with or containing **`-user`** (e.g. `x-user-id`, `remote-user`) — often carries user identifiers
+Effective denylist when PII is disabled: base list + `["x-forwarded-", "-user"]` (partial match, case-insensitive).
+#### Cookies and Cookie Headers
-
+- SDKs **SHOULD** maintain a default denylist of cookie names using the same matching rule (e.g. `session`, `auth`, `identity`). Values for matching cookie names **MUST** be replaced with `"[Filtered]"`.
+- **When individual cookie key-value pairs cannot be extracted** (e.g. malformed or opaque cookie string), the entire `Cookie` or `Set-Cookie` header value **MUST** be replaced with `"[Filtered]"`. Unfiltered raw cookie header values **MUST NOT** be sent. When in doubt, treat the whole cookie header as sensitive.
----
+#### Request Bodies
-## Default Denylist
+When request or response bodies are collected (`incomingRequestBody` / `outgoingRequestBody`):
-For key-value data (HTTP headers, cookies, URL query params), SDKs **MUST** apply a **default denylist** by key name: values for known-sensitive keys are replaced with `"[Filtered]"`; **keys are never scrubbed** by the SDK.
+- **Parseable as JSON or form data:** SDKs **MAY** extract key-value pairs and apply the same denylist rules to keys. Values for matching keys **MUST** be replaced with `"[Filtered]"`. This allows selective scrubbing while retaining non-sensitive fields for debugging.
+- **Not parseable (raw bodies):** The raw body **MUST NOT** be attached to the event. When the SDK cannot parse the body into a key-value structure, it **MUST** send `"[Filtered]"` in place of the entire body.
-
+No built-in option scrubs **keys**; users who need to hide header or cookie names **MUST** use `beforeSend` (or equivalent).
-### Matching Rule
+
-SDKs **MUST** perform a **partial, case-insensitive match** when comparing header names, cookie names, and query parameter names against the denylist. A key is treated as sensitive if any denylist term appears as a substring in the key (e.g. the term `auth` matches the header names `Authorization` and `X-Auth-Token`).
+
-### Base Denylist (Sensitive Data)
+### User-Set Data and Scrubbing
-The following terms **MUST** be included in the default denylist for headers (and **SHOULD** be used for cookies and query params where applicable). A key is sensitive if it **partially matches**, case-insensitively, any of:
+When the user **explicitly** sets data on the scope (user, request, response, tags, contexts, etc.) or on a span, log, or other telemetry, that data is **not** gated by `dataCollection`. It **MUST** always be attached to outgoing telemetry. The same applies to data the user provides via `beforeSend` or event processors.
-`["auth", "token", "secret", "password", "passwd", "pwd", "key", "jwt", "bearer", "sso", "saml", "csrf", "xsrf", "credentials", "session", "sid", "identity"]`
+SDKs **SHOULD** only replace sensitive values with `"[Filtered]"` when the data is gathered **automatically** through instrumentation. If the user explicitly provides data (e.g. by setting a request object on the scope), the SDK **MUST NOT** modify it; the user is responsible for what they attach.
-Values for keys that match **MUST** be replaced with `"[Filtered]"`.
+Users can register callbacks (e.g. `beforeSend`, event processors) to remove or redact any data — including keys — before events are sent. This spec does not replace those hooks; they remain the mechanism for custom filtering and key removal.
-### PII Denylist (when `includeUserInfo` is `false`)
+
-When `includeUserInfo` is `false`, SDKs **MUST** apply the base denylist **and** treat the following as sensitive (in addition to the list above):
+---
-- Any data that contains the following: email, user ID, IP address, username, machine name (if applicable)
-- Any header or key containing **`x-forwarded-`** (e.g. `x-forwarded-for`, `x-forwarded-host`) — often carries client IP or host.
-- Any header or key whose name ends with or contains **`-user`** (e.g. `x-user-id`, `remote-user`) — often carries user identifiers.
+## Public API
-So the effective denylist when PII is disabled is: base list + `["x-forwarded-", "-user"]` (substring match, case-insensitive).
+The `dataCollection` option is passed to the SDK's init function. All fields are optional; omitting a field uses the default.
-### Cookies and Cookie Headers
-- SDKs **SHOULD** maintain a default denylist of cookie names using the same matching rule (e.g. `session`, `auth`, `identity`). Values for matching cookie names **MUST** be replaced with `"[Filtered]"`.
-- **When individual cookie key-value pairs cannot be extracted** (e.g. malformed or opaque cookie header), the entire `Cookie` or `Set-Cookie` header value **MUST** be replaced with `"[Filtered]"`. Unfiltered raw cookie header values **MUST NOT** be sent. When in doubt, treat the whole cookie header as sensitive.
+```pseudocode
+init({
+ dataCollection: {
+ includeUserInfo: boolean, // default: false
+ collect: {
+ cookies: Collection, // default: true
+ httpHeaders: Collection, // default: true
+ queryParams: Collection, // default: true
+ incomingRequestBody: boolean, // default: TBD
+ outgoingRequestBody: boolean, // default: TBD
+ aiAgentMessages: boolean, // default: true
+ stackFrameVariables: boolean, // default: true
+ frameContextLines: number, // default: 5 (boolean fallback: true)
+ },
+ },
+})
+```
-### Request Bodies
+### `dataCollection.includeUserInfo`
-When request or response bodies are collected (`incomingRequestBody` / `outgoingRequestBody`):
+| Property | Value |
+|----------|-------|
+| Type | Boolean |
+| Default | `false` |
+| Since | 1.0.0 |
+| Description | Primary PII toggle. Enables automatic collection of user identity fields (`user.id`, `user.email`, `user.username`, `user.ip_address`). Also sets the effective default for PII-heavy `collect` options. |
-- **Parseable as JSON or form data:** SDKs **MAY** extract key-value pairs and apply the same denylist rules (partial, case-insensitive match) to keys. Values for matching keys **MUST** be replaced with `"[Filtered]"`. This allows selective scrubbing while retaining non-sensitive fields for debugging.
-- **Raw bodies (not parseable as JSON or FormData):** The body **MUST** be removed and **MUST NOT** be attached to the event. "Raw" HTTP bodies (e.g. binary payloads, plain text, or unparsable content) are never sent through automatic instrumentation. When the SDK cannot parse the body into key-value structure, the entire body **MUST** be replaced with `"[Filtered]"`.
+### `dataCollection.collect` Options
-No built-in option scrubs **keys**; users who need to hide header or cookie names **MUST** use `beforeSend` (or equivalent).
+| Key | Option Type | Default | Since | Description |
+|-----|-------------|---------|-------|-------------|
+| `cookies` | Collection | `true` | 1.0.0 | Include cookie values; keys filtered by the default denylist or by allow/deny lists. |
+| `httpHeaders` | Collection | `true` | 1.0.0 | Include HTTP header values; keys filtered by the default denylist or by allow/deny lists. |
+| `queryParams` | Collection | `true` | 1.0.0 | Include URL query parameter values; keys filtered by the default denylist or by allow/deny lists. |
+| `incomingRequestBody` | Boolean | TBD | 1.0.0 | Include full body of the incoming HTTP request. |
+| `outgoingRequestBody` | Boolean | TBD | 1.0.0 | Include full body of outgoing HTTP requests. |
+| `aiAgentMessages` | Boolean | `true` | 1.0.0 | Include AI agent input and output messages. |
+| `stackFrameVariables` | Boolean | `true` | 1.0.0 | Include local variable values captured within stack frames. |
+| `frameContextLines` | Number (Boolean fallback) | `5` (`true`) | 1.0.0 | Number of lines of context to include around stack frames. |
-
+
+ Unlike cookies or headers, some data (e.g. request bodies) has no predictable key structure for the SDK to filter. Data can still be redacted in `beforeSend` or event processors if needed.
+
---
-## Use Cases
-
-The following examples show how `dataCollection` maps to common configurations.
+## Examples
-
+### Default Configuration
-### Maximum PII (full collection)
+An explicit representation of all defaults (with `includeUserInfo: false`):
-When the user enables full PII collection:
+```typescript
+init({
+ dsn: "...",
+ dataCollection: {
+ includeUserInfo: false,
+ collect: {
+ cookies: true,
+ httpHeaders: true,
+ queryParams: true,
+ incomingRequestBody: false,
+ outgoingRequestBody: false,
+ aiAgentMessages: true,
+ stackFrameVariables: true,
+ frameContextLines: 5,
+ },
+ },
+});
+```
-- `includeUserInfo: true`
-- `collect`: all collection options `true` (default denylist); `incomingRequestBody`, `outgoingRequestBody`, and `aiAgentMessages` are `true`.
+### Maximum PII (Full Collection)
-**Result:** Technical context and request/response data (headers, cookies, query params) are collected with the default denylist; request bodies, user identifiers, and AI agent messages are included; sensitive values are still replaced with `"[Filtered]"`.
+Enable full PII collection, including request bodies and AI messages:
-
+```typescript
+init({
+ dsn: "...",
+ dataCollection: {
+ includeUserInfo: true,
+ collect: {
+ incomingRequestBody: true,
+ outgoingRequestBody: true,
+ },
+ },
+});
+```
-
+**Result:** Technical context and request/response data (headers, cookies, query params) are collected with the default denylist; request bodies, user identifiers, and AI agent messages are included; sensitive values are still replaced with `"[Filtered]"`.
### Granular Debugging
-The user wants to include user info and only specific headers for debugging, and does not want to send query params at all:
+Include user info and only specific headers for debugging; exclude query params entirely:
```typescript
init({
@@ -295,41 +311,13 @@ init({
});
```
-Because `includeUserInfo` is set, `aiAgentMessages` defaults to `true` unless the user explicitly sets `collect: { aiAgentMessages: false }`.
+### Migration from `sendDefaultPii`
-
-
-
-
-### Migration from sendDefaultPii
-
-- **`sendDefaultPii: true`** (legacy) → `dataCollection: { includeUserInfo: true }` and keep `collect` defaults.
-- **`sendDefaultPii: false`** (legacy) → `dataCollection: { includeUserInfo: false }` (or omit; same as default).
+- **`sendDefaultPii: true`** (legacy) → `dataCollection: { includeUserInfo: true }`, keep `collect` defaults
+- **`sendDefaultPii: false`** (legacy) → `dataCollection: { includeUserInfo: false }` (or omit entirely — same as default)
SDKs **SHOULD** document this mapping and **MAY** implement `send_default_pii` as a compatibility shim that sets `includeUserInfo`.
-
-
----
-
-## User-Set Data and Scrubbing
-
-
-
-### Data set by the user
-
-When the user **explicitly** sets data on the scope (user, request, response, tags, contexts, etc.) or on a span, log, or other telemetry, that data is **not** gated by `dataCollection`. It **MUST** always be attached to outgoing telemetry. The same applies to data the user provides via `beforeSend` or event processors (e.g. attaching a request object).
-
-### Automatic vs explicit data
-
-SDKs **SHOULD** only replace sensitive values with `"[Filtered]"` when the data is gathered **automatically** through instrumentation. If the user explicitly provides data (e.g. by setting a request object on the scope), the SDK **MUST NOT** modify it; the user is responsible for what they attach.
-
-### beforeSend and event processors
-
-Users can register callbacks (e.g. `beforeSend`, event processors) to remove or redact any data — including keys — before events are sent. This spec does not replace those hooks; they remain the mechanism for custom filtering and key removal.
-
-
-
---
## Changelog
From 88bcbaeef243128ed78ae8a760930d5f044372fe Mon Sep 17 00:00:00 2001
From: s1gr1d <32902192+s1gr1d@users.noreply.github.com>
Date: Fri, 6 Mar 2026 15:29:23 +0100
Subject: [PATCH 5/7] explain pii info
---
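Notes: the cascade row added in this patch can be sketched as a small resolver. This is an illustration only. The function name `resolveCollect` and the type names are hypothetical, and the request-body defaults are still marked TBD in the spec, so the sketch simply assumes they follow `includeUserInfo` as the new "PII Boolean" table row suggests.

```typescript
type CollectionOption = boolean | { allow: string[] } | { deny: string[] };

interface CollectOptions {
  cookies: CollectionOption;
  httpHeaders: CollectionOption;
  queryParams: CollectionOption;
  aiAgentMessages: boolean;
  stackFrameVariables: boolean;
  incomingRequestBody: boolean;
  outgoingRequestBody: boolean;
  frameContextLines: number | boolean;
}

interface DataCollectionOptions {
  includeUserInfo?: boolean;
  collect?: Partial<CollectOptions>;
}

// Hypothetical resolver: compute effective `collect` values from user input.
function resolveCollect(options: DataCollectionOptions = {}): CollectOptions {
  const includeUserInfo = options.includeUserInfo === true;
  const defaults: CollectOptions = {
    cookies: true,
    httpHeaders: true,
    queryParams: true,
    // Non-PII booleans: not affected by `includeUserInfo`.
    aiAgentMessages: true,
    stackFrameVariables: true,
    // ASSUMPTION: defaults are TBD in the spec; here they cascade from
    // `includeUserInfo`, mirroring the "PII Boolean" table row.
    incomingRequestBody: includeUserInfo,
    outgoingRequestBody: includeUserInfo,
    frameContextLines: 5,
  };
  // Explicitly set `collect` options always override the cascaded default.
  return { ...defaults, ...options.collect };
}

// Usage: the explicit `false` wins even though `includeUserInfo` is true.
const effective = resolveCollect({
  includeUserInfo: true,
  collect: { incomingRequestBody: false },
});
```

Keeping the cascade in one resolver means the rest of an SDK would only ever read fully resolved values, which avoids re-deriving defaults at each instrumentation site.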
develop-docs/sdk/foundations/client/data-collection/index.mdx | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)
diff --git a/develop-docs/sdk/foundations/client/data-collection/index.mdx b/develop-docs/sdk/foundations/client/data-collection/index.mdx
index 8cc3c55938e61..f05135a927dfe 100644
--- a/develop-docs/sdk/foundations/client/data-collection/index.mdx
+++ b/develop-docs/sdk/foundations/client/data-collection/index.mdx
@@ -45,6 +45,7 @@ Non-identifying context used for debugging and performance:
- Device and environment context (OS, runtime, non-PII identifiers)
- Performance and error context (stack frames, breadcrumbs, span metadata)
- Framework/routing context where it does not contain PII or secrets
+- AI agent messages (input, output, metadata)
This tier is **not** gated by the data collection configuration. SDKs **MAY** collect it by default.
@@ -55,7 +56,7 @@ Personally identifiable or user-linked data:
- User identifiers (user ID, username, email)
- IP address
- Cookies and headers that identify the user or session
-- AI agent input and output messages
+- HTTP request data (TBD)
This tier **MUST** be off by default unless the user opts in via `includeUserInfo` and/or explicit `collect` allowlists. See [`includeUserInfo`](#include-user-info-behavior), [`collect` options](#collect-option-behavior), and [Default Denylist](#default-denylist).
@@ -137,6 +138,7 @@ Each key under `collect` maps to a category of automatically collected data and
| Option type | Default when `includeUserInfo: true` | Default when `includeUserInfo: false` |
|-------------|--------------------------------------|----------------------------------------|
| Collection (key-value pairs) | `true` — use default denylist | `true` — use default denylist, plus PII keys denied |
+| PII Boolean (e.g. `incomingRequestBody`) | `true` — attach | `false` — do not attach |
Non-PII boolean options (e.g. `stackFrameVariables`) are not affected by `includeUserInfo` and always default to their configured value.
From 69badc95c3d0c929db4f1b82c45f2d95175d6a7f Mon Sep 17 00:00:00 2001
From: s1gr1d <32902192+s1gr1d@users.noreply.github.com>
Date: Fri, 6 Mar 2026 15:30:44 +0100
Subject: [PATCH 6/7] change ordering
---
.../sdk/foundations/client/data-collection/index.mdx | 12 ++++++------
1 file changed, 6 insertions(+), 6 deletions(-)
diff --git a/develop-docs/sdk/foundations/client/data-collection/index.mdx b/develop-docs/sdk/foundations/client/data-collection/index.mdx
index f05135a927dfe..9ea409289f68c 100644
--- a/develop-docs/sdk/foundations/client/data-collection/index.mdx
+++ b/develop-docs/sdk/foundations/client/data-collection/index.mdx
@@ -214,10 +214,10 @@ init({
cookies: Collection, // default: true
httpHeaders: Collection, // default: true
queryParams: Collection, // default: true
- incomingRequestBody: boolean, // default: TBD
- outgoingRequestBody: boolean, // default: TBD
aiAgentMessages: boolean, // default: true
stackFrameVariables: boolean, // default: true
+ incomingRequestBody: boolean, // default: TBD
+ outgoingRequestBody: boolean, // default: TBD
frameContextLines: number, // default: 5 (boolean fallback: true)
},
},
@@ -240,10 +240,10 @@ init({
| `cookies` | Collection | `true` | 1.0.0 | Include cookie values; keys filtered by the default denylist or by allow/deny lists. |
| `httpHeaders` | Collection | `true` | 1.0.0 | Include HTTP header values; keys filtered by the default denylist or by allow/deny lists. |
| `queryParams` | Collection | `true` | 1.0.0 | Include URL query parameter values; keys filtered by the default denylist or by allow/deny lists. |
-| `incomingRequestBody` | Boolean | TBD | 1.0.0 | Include full body of the incoming HTTP request. |
-| `outgoingRequestBody` | Boolean | TBD | 1.0.0 | Include full body of outgoing HTTP requests. |
| `aiAgentMessages` | Boolean | `true` | 1.0.0 | Include AI agent input and output messages. |
| `stackFrameVariables` | Boolean | `true` | 1.0.0 | Include local variable values captured within stack frames. |
+| `incomingRequestBody` | Boolean | TBD | 1.0.0 | Include full body of the incoming HTTP request. |
+| `outgoingRequestBody` | Boolean | TBD | 1.0.0 | Include full body of outgoing HTTP requests. |
| `frameContextLines` | Number (Boolean fallback) | `5` (`true`) | 1.0.0 | Number of lines of context to include around stack frames. |
@@ -267,10 +267,10 @@ init({
cookies: true,
httpHeaders: true,
queryParams: true,
- incomingRequestBody: false,
- outgoingRequestBody: false,
aiAgentMessages: true,
stackFrameVariables: true,
+ incomingRequestBody: false,
+ outgoingRequestBody: false,
frameContextLines: 5,
},
},
From ae5020b011602a78cefc7fdf08aa18be02c4d962 Mon Sep 17 00:00:00 2001
From: s1gr1d <32902192+s1gr1d@users.noreply.github.com>
Date: Fri, 6 Mar 2026 15:32:09 +0100
Subject: [PATCH 7/7] add ai agent messages to migration
---
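Notes: a compatibility shim for the updated mapping might look like the sketch below. The function name `migrateSendDefaultPii` and the option types are hypothetical. The rule that an explicit `dataCollection` takes precedence over the legacy flag is an assumption, since the spec does not state a precedence.

```typescript
interface DataCollection {
  includeUserInfo?: boolean;
  collect?: { aiAgentMessages?: boolean };
}

interface LegacyInitOptions {
  sendDefaultPii?: boolean;
  dataCollection?: DataCollection;
}

// Hypothetical shim: translate the legacy flag into `dataCollection`.
function migrateSendDefaultPii(options: LegacyInitOptions): DataCollection {
  // ASSUMPTION: an explicit `dataCollection` wins over the legacy flag.
  if (options.dataCollection !== undefined) {
    return options.dataCollection;
  }
  if (options.sendDefaultPii === true) {
    // Mapping taken from this patch: user info on, AI agent messages off.
    return { includeUserInfo: true, collect: { aiAgentMessages: false } };
  }
  // `sendDefaultPii: false` or unset maps to the default configuration.
  return { includeUserInfo: false };
}

const migrated = migrateSendDefaultPii({ sendDefaultPii: true });
```

All other `collect` keys are left unset in the shim output so that they fall back to their documented defaults.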
develop-docs/sdk/foundations/client/data-collection/index.mdx | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/develop-docs/sdk/foundations/client/data-collection/index.mdx b/develop-docs/sdk/foundations/client/data-collection/index.mdx
index 9ea409289f68c..8f158e326c46e 100644
--- a/develop-docs/sdk/foundations/client/data-collection/index.mdx
+++ b/develop-docs/sdk/foundations/client/data-collection/index.mdx
@@ -315,7 +315,7 @@ init({
### Migration from `sendDefaultPii`
-- **`sendDefaultPii: true`** (legacy) → `dataCollection: { includeUserInfo: true }`, keep `collect` defaults
+- **`sendDefaultPii: true`** (legacy) → `dataCollection: { includeUserInfo: true, collect: { aiAgentMessages: false } }`, keep most `collect` defaults
- **`sendDefaultPii: false`** (legacy) → `dataCollection: { includeUserInfo: false }` (or omit entirely — same as default)
SDKs **SHOULD** document this mapping and **MAY** implement `send_default_pii` as a compatibility shim that sets `includeUserInfo`.