diff --git a/develop-docs/sdk/foundations/client/data-collection/index.mdx b/develop-docs/sdk/foundations/client/data-collection/index.mdx
new file mode 100644
index 0000000000000..8f158e326c46e
--- /dev/null
+++ b/develop-docs/sdk/foundations/client/data-collection/index.mdx
@@ -0,0 +1,327 @@
+---
+title: Data Collection
+description: Configuration for what data SDKs collect by default — technical context, PII, and sensitive data.
+spec_id: sdk/foundations/client/data-collection
+spec_version: 1.0.0
+spec_status: draft
+spec_depends_on:
+ - id: sdk/foundations/client
+ version: ">=1.0.0"
+spec_changelog:
+ - version: 1.0.0
+ date: 2025-03-05
+ summary: Initial spec; dataCollection config, three data tiers, cookies/headers denylist, replace sendDefaultPii.
+sidebar_order: 1
+---
+
+
+
+
+
+## Overview
+
+This spec defines how SDKs control **what data is collected automatically** from the runtime (device, requests, responses, user context). It replaces the single `sendDefaultPii` (or platform-equivalent) flag with a structured `dataCollection` configuration so users can enable or restrict collection by category and by field.
+
+Related specs:
+
+- [Data Handling](/sdk/expected-features/data-handling/) — structuring data for scrubbing (spans, breadcrumbs), variable size limits
+- [Client](/sdk/foundations/client/) — client lifecycle and event pipeline
+- [Configuration](/sdk/foundations/client/configuration/) — top-level init options including `send_default_pii` (deprecated in favor of this spec)
+
+---
+
+## Concepts
+
+
+
+### Data Tiers
+
+Collected data is grouped into three tiers. SDKs **MUST** treat these tiers consistently when applying defaults and user configuration.
+
+#### 1. Technical Context Data
+
+Non-identifying context used for debugging and performance:
+
+- Device and environment context (OS, runtime, non-PII identifiers)
+- Performance and error context (stack frames, breadcrumbs, span metadata)
+- Framework/routing context where it does not contain PII or secrets
+- AI agent messages (input, output, metadata)
+
+This tier is **not** gated by the data collection configuration. SDKs **MAY** collect it by default.
+
+#### 2. PII Data
+
+Personally identifiable or user-linked data:
+
+- User identifiers (user ID, username, email)
+- IP address
+- Cookies and headers that identify the user or session
+- HTTP request data (TBD)
+
+This tier **MUST** be off by default unless the user opts in via `includeUserInfo` and/or explicit `collect` allowlists. See [`includeUserInfo`](#include-user-info-behavior), [`collect` options](#collect-option-behavior), and [Default Denylist](#default-denylist).
+
+#### 3. Sensitive Data
+
+Credentials and secrets that **MUST NOT** be sent by default:
+
+- Passwords, tokens, API keys, bearer tokens
+- Header or cookie values that match known sensitive names (auth, token, secret, password, key, jwt, etc.)
+
+SDKs **MUST NOT** send sensitive **values** through automatic instrumentation: values are replaced with `"[Filtered]"` while keys are retained (see [Default Denylist](#default-denylist)). Users can use `beforeSend` (or equivalent) to remove or redact keys if needed.
+
+
+
+---
+
+## Behavior
+
+
+
+### Configuration Requirements
+
+All data-collection options live under a single top-level key: `dataCollection`. SDKs **MUST** support at least `includeUserInfo` and the `collect` object. SDKs **MAY** omit options that do not apply to the platform (e.g. no `outgoingRequestBody` on a browser-only SDK).
+
+`dataCollection` accepts two fields:
+
+- **`includeUserInfo`** — the primary toggle for Personally Identifiable Information (PII). Controls whether user-identity fields are included in automatic collection, and sets the default for PII-heavy `collect` options (such as HTTP request bodies - TBD). Defaults to `false`.
+- **`collect`** — controls which categories of request/response and runtime data are gathered. See [`collect` Option Behavior](#collect-option-behavior) and [How Defaults Cascade](#how-defaults-cascade).
+
+
+
+
+
+### `includeUserInfo` Behavior
+
+`includeUserInfo` controls whether the SDK automatically attaches user identity fields to events (e.g. `user.id`, `user.email`, `user.username`, `user.ip_address`). This is the primary PII gate: its value also sets the effective default for PII-heavy `collect` options.
+
+| Value | Behavior |
+|-------|----------|
+| `true` | Attach all user identity fields captured by automatic instrumentation. Equivalent to the legacy `sendDefaultPii` flag scoped to user data. |
+| `false` | Do not attach user identity fields from automatic instrumentation. |
+
+When user data is set **explicitly** on the scope (or equivalent), it is **always** attached regardless of this setting. See [User-Set Data and Scrubbing](#user-set-data-and-scrubbing).
+
+
+
+
+
+### `collect` Option Behavior
+
+Each key under `collect` maps to a category of automatically collected data and uses one of two option types, depending on whether the data is structured as key-value pairs.
+
+**Boolean options** — used where data cannot be meaningfully filtered at the key level. The SDK either collects the entire category or skips it.
+
+| Value | Behavior |
+|-------|----------|
+| `true` | Collect and attach this data category. |
+| `false` | Do not collect this data category at all. |
+
+**Collection options** — used for key-value data (cookies, headers, query params), where the SDK can inspect individual keys and apply filtering rules before attaching.
+
+| Value | Behavior |
+|-------|----------|
+| `true` | Collect this category. Apply the default denylist — values for sensitive key names are replaced with `"[Filtered]"` (see [Default Denylist](#default-denylist)). |
+| `false` | Do not collect this category at all. |
+| `{ deny: string[] }` | Collect this category. Apply the default denylist **plus** these additional key names. |
+| `{ allow: string[] }` | Collect **only** keys in this list. The default denylist is bypassed, but sensitive values **MUST** still be scrubbed regardless. |
+
+> **Note:** Sensitive key **values** are always scrubbed — replaced with `"[Filtered]"` — regardless of collection option configuration. The allow/deny lists control which keys are included, not whether scrubbing applies.
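+
+The four collection option values can be read as a single filtering function. The sketch below is illustrative only and not normative: `CollectionOption`, `applyCollectionOption`, and `SENSITIVE_TERMS` are hypothetical names, `SENSITIVE_TERMS` is a shortened stand-in for the default denylist, and the sketch assumes that `allow` entries match key names exactly (case-insensitively) while `deny` entries use the same partial match as the default denylist.
+
+```typescript
+// Hypothetical types and names, for illustration only.
+type CollectionOption = boolean | { deny: string[] } | { allow: string[] };
+
+const FILTERED = "[Filtered]";
+// Shortened stand-in for the default denylist.
+const SENSITIVE_TERMS = ["auth", "token", "secret", "password", "key"];
+
+// Partial, case-insensitive match of a key name against a list of terms.
+function matchesAny(key: string, terms: string[]): boolean {
+  const lower = key.toLowerCase();
+  return terms.some((term) => lower.includes(term.toLowerCase()));
+}
+
+function applyCollectionOption(
+  data: Record<string, string>,
+  option: CollectionOption,
+): Record<string, string> | undefined {
+  // `false`: the category is not collected at all.
+  if (option === false) return undefined;
+
+  const extraDeny =
+    typeof option === "object" && "deny" in option ? option.deny : [];
+  const allow =
+    typeof option === "object" && "allow" in option ? option.allow : undefined;
+
+  const result: Record<string, string> = {};
+  for (const [key, value] of Object.entries(data)) {
+    // `{ allow }`: keep only keys named in the list (assumed exact match).
+    if (allow && !allow.some((name) => name.toLowerCase() === key.toLowerCase())) {
+      continue;
+    }
+    // Sensitive values are replaced in every mode; the key is retained.
+    const sensitive = matchesAny(key, [...SENSITIVE_TERMS, ...extraDeny]);
+    result[key] = sensitive ? FILTERED : value;
+  }
+  return result;
+}
+```
+
+Under these assumptions, an allowlisted `X-Auth-Token` header is still emitted as `"[Filtered]"`, which mirrors the note above: the lists decide which keys appear, not whether scrubbing applies.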
+
+
+
+
+
+### How Defaults Cascade
+
+`includeUserInfo` determines the effective default for PII-related `collect` options. Explicitly set `collect` options always override this default.
+
+| Option type | Default when `includeUserInfo: true` | Default when `includeUserInfo: false` |
+|-------------|--------------------------------------|----------------------------------------|
+| Collection (key-value pairs) | `true` — use default denylist | `true` — use default denylist, plus PII keys denied |
+| PII Boolean (e.g. `incomingRequestBody`) | `true` — attach | `false` — do not attach |
+
+Non-PII boolean options (e.g. `stackFrameVariables`) are not affected by `includeUserInfo`; they keep their own defaults, as listed under Public API, unless set explicitly.
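+
+The cascade can be sketched as a small resolver. This is a minimal illustration under stated assumptions: `CascadeInput` and `resolveCollectDefaults` are hypothetical names, only two representative options are shown, and the PII boolean follows the table above rather than the still-TBD default in the Public API section.
+
+```typescript
+// Hypothetical option shape, reduced to two representative keys.
+interface CascadeInput {
+  includeUserInfo?: boolean;
+  collect?: {
+    incomingRequestBody?: boolean; // PII boolean
+    stackFrameVariables?: boolean; // non-PII boolean
+  };
+}
+
+function resolveCollectDefaults(options: CascadeInput = {}) {
+  const includeUserInfo = options.includeUserInfo ?? false;
+  return {
+    includeUserInfo,
+    // PII boolean: follows includeUserInfo unless the user set it explicitly.
+    incomingRequestBody: options.collect?.incomingRequestBody ?? includeUserInfo,
+    // Non-PII boolean: keeps its own default regardless of includeUserInfo.
+    stackFrameVariables: options.collect?.stackFrameVariables ?? true,
+  };
+}
+```
+
+An explicit `collect.incomingRequestBody: false` wins even when `includeUserInfo` is `true`, because the explicit value is consulted before the cascaded default.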
+
+
+
+
+
+### Default Denylist
+
+For key-value data (HTTP headers, cookies, URL query params), SDKs **MUST** apply a **default denylist** by key name: values for known-sensitive keys are replaced with `"[Filtered]"`; **keys are never scrubbed** by the SDK.
+
+#### Matching Rule
+
+SDKs **MUST** perform a **partial, case-insensitive match** when comparing key names against the denylist. A key is treated as sensitive if any denylist term appears as a substring in the key name (e.g. the term `auth` matches `Authorization` and `X-Auth-Token`).
+
+#### Base Denylist (Sensitive Data)
+
+The following terms **MUST** be included in the default denylist for headers, and **SHOULD** be applied to cookies and query params where applicable:
+
+`["auth", "token", "secret", "password", "passwd", "pwd", "key", "jwt", "bearer", "sso", "saml", "csrf", "xsrf", "credentials", "session", "sid", "identity"]`
+
+Values for keys that match **MUST** be replaced with `"[Filtered]"`.
+
+#### PII Denylist (when `includeUserInfo` is `false`)
+
+When `includeUserInfo` is `false`, SDKs **MUST** apply the base denylist **and** additionally treat the following as sensitive:
+
+- Any data that contains email, user ID, IP address, username, or machine name (if applicable)
+- Any key containing **`x-forwarded-`** (e.g. `x-forwarded-for`, `x-forwarded-host`) — often carries client IP or host
+- Any key containing **`-user`** (e.g. `x-user-id`, `remote-user`), which often carries user identifiers
+
+Effective denylist when PII is disabled: base list + `["x-forwarded-", "-user"]` (partial match, case-insensitive).
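+
+As a rough sketch of how the effective denylist and the matching rule fit together (the function and constant names here are hypothetical, and only the key-name terms are modeled, not the broader "any data that contains email, user ID, ..." rule):
+
+```typescript
+// Terms copied from the base denylist above.
+const BASE_DENYLIST = [
+  "auth", "token", "secret", "password", "passwd", "pwd", "key", "jwt",
+  "bearer", "sso", "saml", "csrf", "xsrf", "credentials", "session", "sid",
+  "identity",
+];
+// Additional key-name terms applied when includeUserInfo is false.
+const PII_DENYLIST = ["x-forwarded-", "-user"];
+
+function effectiveDenylist(includeUserInfo: boolean): string[] {
+  return includeUserInfo ? BASE_DENYLIST : [...BASE_DENYLIST, ...PII_DENYLIST];
+}
+
+// Partial, case-insensitive match: a key is sensitive if any term is a substring.
+function isSensitiveKey(key: string, includeUserInfo: boolean): boolean {
+  const lower = key.toLowerCase();
+  return effectiveDenylist(includeUserInfo).some((term) => lower.includes(term));
+}
+```
+
+With this sketch, `X-Forwarded-For` is treated as sensitive only while `includeUserInfo` is `false`, whereas `Authorization` is sensitive in both modes.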
+
+#### Cookies and Cookie Headers
+
+- SDKs **SHOULD** maintain a default denylist of cookie names using the same matching rule (e.g. `session`, `auth`, `identity`). Values for matching cookie names **MUST** be replaced with `"[Filtered]"`.
+- **When individual cookie key-value pairs cannot be extracted** (e.g. malformed or opaque cookie string), the entire `Cookie` or `Set-Cookie` header value **MUST** be replaced with `"[Filtered]"`. Unfiltered raw cookie header values **MUST NOT** be sent. When in doubt, treat the whole cookie header as sensitive.
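+
+The per-cookie filtering and the whole-header fallback could look roughly like the following. This is a sketch under assumptions: `scrubCookieHeader` is a hypothetical helper, it handles only the `name=value; name=value` shape of a `Cookie` request header (not `Set-Cookie` attributes), and it treats any segment without a name and `=` as unparseable.
+
+```typescript
+const FILTERED = "[Filtered]";
+
+// Returns per-cookie values, or "[Filtered]" for the whole header when
+// individual key-value pairs cannot be extracted.
+function scrubCookieHeader(
+  header: string,
+  denylist: string[],
+): Record<string, string> | string {
+  const result: Record<string, string> = {};
+  for (const part of header.split(";")) {
+    const trimmed = part.trim();
+    if (trimmed === "") continue; // tolerate a trailing ";"
+    const eq = trimmed.indexOf("=");
+    // No "=" or an empty name: treat the entire header as sensitive.
+    if (eq <= 0) return FILTERED;
+    const name = trimmed.slice(0, eq).trim();
+    const value = trimmed.slice(eq + 1).trim();
+    const lower = name.toLowerCase();
+    const sensitive = denylist.some((term) => lower.includes(term.toLowerCase()));
+    result[name] = sensitive ? FILTERED : value;
+  }
+  return result;
+}
+```
+
+A header such as `user_session=abc123; theme=dark-mode` keeps the `theme` value and filters `user_session`, while an opaque string with no `=` collapses to a single `"[Filtered]"`.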
+
+#### Request Bodies
+
+When incoming or outgoing request bodies are collected (`incomingRequestBody` / `outgoingRequestBody`):
+
+- **Parseable as JSON or form data:** SDKs **MAY** extract key-value pairs and apply the same denylist rules to keys. Values for matching keys **MUST** be replaced with `"[Filtered]"`. This allows selective scrubbing while retaining non-sensitive fields for debugging.
+- **Not parseable (raw bodies):** When the SDK cannot parse the body into a key-value structure, the raw body **MUST NOT** be attached to the event; the entire body **MUST** be replaced with `"[Filtered]"`.
+
+No built-in option scrubs **keys**; users who need to hide header or cookie names **MUST** use `beforeSend` (or equivalent).
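+
+The two body cases can be sketched as follows. The helper names (`scrubRequestBody`, `scrubValue`) are hypothetical, only JSON is handled (form data is omitted), and recursing into nested objects and arrays is an assumption of this sketch rather than something the spec prescribes.
+
+```typescript
+const FILTERED = "[Filtered]";
+
+// Recursively replaces values whose key matches a denylist term.
+function scrubValue(value: unknown, denylist: string[]): unknown {
+  if (Array.isArray(value)) {
+    return value.map((item) => scrubValue(item, denylist));
+  }
+  if (value !== null && typeof value === "object") {
+    const out: Record<string, unknown> = {};
+    for (const [key, inner] of Object.entries(value)) {
+      const lower = key.toLowerCase();
+      const sensitive = denylist.some((term) => lower.includes(term.toLowerCase()));
+      out[key] = sensitive ? FILTERED : scrubValue(inner, denylist);
+    }
+    return out;
+  }
+  return value;
+}
+
+function scrubRequestBody(rawBody: string, denylist: string[]): unknown {
+  let parsed: unknown;
+  try {
+    parsed = JSON.parse(rawBody);
+  } catch {
+    // Raw body that cannot be parsed: replace it entirely.
+    return FILTERED;
+  }
+  // Scalars (e.g. a bare JSON string) have no key-value structure either.
+  if (parsed === null || typeof parsed !== "object") return FILTERED;
+  return scrubValue(parsed, denylist);
+}
+```
+
+Keys are kept in the output in both branches of `scrubValue`; only values change, which matches the statement above that no built-in option scrubs keys.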
+
+
+
+
+
+### User-Set Data and Scrubbing
+
+When the user **explicitly** sets data on the scope (user, request, response, tags, contexts, etc.) or on a span, log, or other telemetry, that data is **not** gated by `dataCollection`. It **MUST** always be attached to outgoing telemetry. The same applies to data the user provides via `beforeSend` or event processors.
+
+SDKs **SHOULD** only replace sensitive values with `"[Filtered]"` when the data is gathered **automatically** through instrumentation. If the user explicitly provides data (e.g. by setting a request object on the scope), the SDK **MUST NOT** modify it; the user is responsible for what they attach.
+
+Users can register callbacks (e.g. `beforeSend`, event processors) to remove or redact any data — including keys — before events are sent. This spec does not replace those hooks; they remain the mechanism for custom filtering and key removal.
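+
+For example, a user who needs to hide header names entirely might write a callback along these lines. The event shape (`MinimalEvent`) and the helper `removeHeaderKeys` are simplified, hypothetical stand-ins; a real SDK's event type and `beforeSend` signature may differ.
+
+```typescript
+// Simplified, hypothetical event shape for illustration.
+interface MinimalEvent {
+  request?: { headers?: Record<string, string> };
+}
+
+// Removes whole header entries (key and value) by case-insensitive name.
+function removeHeaderKeys(event: MinimalEvent, names: string[]): MinimalEvent {
+  const headers = event.request?.headers;
+  if (!headers) return event;
+  const lowered = names.map((name) => name.toLowerCase());
+  for (const key of Object.keys(headers)) {
+    if (lowered.includes(key.toLowerCase())) {
+      delete headers[key];
+    }
+  }
+  return event;
+}
+
+// Assumed wiring: a beforeSend-style callback that returns the modified event.
+const beforeSend = (event: MinimalEvent): MinimalEvent =>
+  removeHeaderKeys(event, ["X-Internal-Tenant"]);
+```
+
+Because this runs in user code, it is free to drop keys, which the SDK's automatic scrubbing never does.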
+
+
+
+---
+
+## Public API
+
+The `dataCollection` option is passed to the SDK's init function. All fields are optional; omitting a field uses the default.
+
+```pseudocode
+init({
+  dataCollection: {
+    includeUserInfo: boolean,          // default: false
+    collect: {
+      cookies: Collection,             // default: true
+      httpHeaders: Collection,         // default: true
+      queryParams: Collection,         // default: true
+      aiAgentMessages: boolean,        // default: true
+      stackFrameVariables: boolean,    // default: true
+      incomingRequestBody: boolean,    // default: TBD
+      outgoingRequestBody: boolean,    // default: TBD
+      frameContextLines: number,       // default: 5 (boolean fallback: true)
+    },
+  },
+})
+```
+
+### `dataCollection.includeUserInfo`
+
+| Property | Value |
+|----------|-------|
+| Type | Boolean |
+| Default | `false` |
+| Since | 1.0.0 |
+| Description | Primary PII toggle. Enables automatic collection of user identity fields (`user.id`, `user.email`, `user.username`, `user.ip_address`). Also sets the effective default for PII-heavy `collect` options. |
+
+### `dataCollection.collect` Options
+
+| Key | Option Type | Default | Since | Description |
+|-----|-------------|---------|-------|-------------|
+| `cookies` | Collection | `true` | 1.0.0 | Include cookie values; keys filtered by the default denylist or by allow/deny lists. |
+| `httpHeaders` | Collection | `true` | 1.0.0 | Include HTTP header values; keys filtered by the default denylist or by allow/deny lists. |
+| `queryParams` | Collection | `true` | 1.0.0 | Include URL query parameter values; keys filtered by the default denylist or by allow/deny lists. |
+| `aiAgentMessages` | Boolean | `true` | 1.0.0 | Include AI agent input and output messages. |
+| `stackFrameVariables` | Boolean | `true` | 1.0.0 | Include local variable values captured within stack frames. |
+| `incomingRequestBody` | Boolean | TBD | 1.0.0 | Include full body of the incoming HTTP request. |
+| `outgoingRequestBody` | Boolean | TBD | 1.0.0 | Include full body of outgoing HTTP requests. |
+| `frameContextLines` | Number (Boolean fallback) | `5` (`true`) | 1.0.0 | Number of lines of context to include around stack frames. |
+
+
+Unlike cookies or headers, some data (e.g. request bodies) has no predictable key structure for the SDK to filter. Data can still be redacted in `beforeSend` or event processors if needed.
+
+
+---
+
+## Examples
+
+### Default Configuration
+
+An explicit representation of all defaults (with `includeUserInfo: false`):
+
+```typescript
+init({
+  dsn: "...",
+  dataCollection: {
+    includeUserInfo: false,
+    collect: {
+      cookies: true,
+      httpHeaders: true,
+      queryParams: true,
+      aiAgentMessages: true,
+      stackFrameVariables: true,
+      incomingRequestBody: false,
+      outgoingRequestBody: false,
+      frameContextLines: 5,
+    },
+  },
+});
+```
+
+### Maximum PII (Full Collection)
+
+Enable full PII collection, including request bodies and AI messages:
+
+```typescript
+init({
+  dsn: "...",
+  dataCollection: {
+    includeUserInfo: true,
+    collect: {
+      incomingRequestBody: true,
+      outgoingRequestBody: true,
+    },
+  },
+});
+```
+
+**Result:** Technical context and request/response data (headers, cookies, query params) are collected with the default denylist; request bodies, user identifiers, and AI agent messages are included; sensitive values are still replaced with `"[Filtered]"`.
+
+### Granular Debugging
+
+Include user info and only specific headers for debugging; exclude query params entirely:
+
+```typescript
+init({
+  dsn: "...",
+  dataCollection: {
+    includeUserInfo: true,
+    collect: {
+      httpHeaders: { allow: ["x-request-id", "x-trace-id", "x-correlation-id"] },
+      queryParams: false,
+    },
+  },
+});
+```
+
+### Migration from `sendDefaultPii`
+
+- **`sendDefaultPii: true`** (legacy) → `dataCollection: { includeUserInfo: true, collect: { aiAgentMessages: false } }`, leaving the remaining `collect` options at their defaults
+- **`sendDefaultPii: false`** (legacy) → `dataCollection: { includeUserInfo: false }` (or omit entirely — same as default)
+
+SDKs **SHOULD** document this mapping and **MAY** implement `send_default_pii` as a compatibility shim that sets `includeUserInfo`.
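+
+One possible shape for such a shim is sketched below. `LegacyInitOptions` and `resolveIncludeUserInfo` are hypothetical names, and the precedence shown (an explicit `dataCollection.includeUserInfo` wins over the legacy flag) is an assumption of this sketch, not a requirement stated in this spec.
+
+```typescript
+// Hypothetical init options carrying both the legacy and the new setting.
+interface LegacyInitOptions {
+  sendDefaultPii?: boolean;
+  dataCollection?: { includeUserInfo?: boolean };
+}
+
+function resolveIncludeUserInfo(options: LegacyInitOptions): boolean {
+  // Assumed precedence: new option, then legacy flag, then the default (false).
+  return options.dataCollection?.includeUserInfo ?? options.sendDefaultPii ?? false;
+}
+```
+
+Resolving the value in one place keeps the rest of the SDK unaware of the legacy flag.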
+
+---
+
+## Changelog
+
+
diff --git a/develop-docs/sdk/foundations/data-scrubbing.mdx b/develop-docs/sdk/foundations/data-scrubbing.mdx
index 103fa9261c55c..3a88210f945f8 100644
--- a/develop-docs/sdk/foundations/data-scrubbing.mdx
+++ b/develop-docs/sdk/foundations/data-scrubbing.mdx
@@ -3,63 +3,21 @@ title: Data Scrubbing
sidebar_order: 6
---
-Data handling is the standardized context in how we want SDKs help users filter data.
-
-## Sensitive Data
-
-SDKs should not include PII or other sensitive data in the payload by default.
-When building an SDK we can come across some API that can give useful information to debug a problem.
-In the event that API returns data considered PII, we guard that behind a flag called _Send Default PII_.
-This is an option in the SDK called [_send-default-pii_](https://docs.sentry.io/platforms/python/configuration/options/#send-default-pii)
-and is **disabled by default**. That means that data that is naturally sensitive is not sent by default.
+Data handling is the standardized context for how we want SDKs to help users filter data.
-When a user manually sets the data on the scope (user, contexts, tags, data, request, response, etc.), this data should not be gated by the _Send Default PII_ flag and should always be attached to all outgoing telemetry. This also applies to the data that the user manually sets on a span, log, metric and other types of telemetry (directly or, for example, via `BeforeSend`).
+**Data collection and scrubbing:** The canonical spec for what data SDKs collect, default denylists (headers, cookies, query params), request body and cookie scrubbing, user-set data, and `beforeSend` is [Data Collection](/sdk/foundations/client/data-collection/). That spec supersedes the sensitive-data and cookie sections below for SDK behavior. This page retains **Structuring Data** and **Variable Size** and the legacy `send_default_pii` context for reference.
-Certain sensitive data must never be sent through SDK instrumentation, regardless of any configuration:
-
-- HTTP Headers: The keys of known sensitive headers are added, while their values must be replaced with `"[Filtered]"`.
- - The SDK performs a **partial, case-insensitive match** against the following headers to determine if they are sensitive: `["auth", "token", "secret", "password", "passwd", "pwd", "key", "jwt", "bearer", "sso", "saml", "csrf", "xsrf", "credentials"]`
-
-SDKs should only replace sensitive data with `"[Filtered]"` when the data is gathered automatically through instrumentation.
-If a user explicitly provides data (for example, by setting a request object on the scope), the SDK must not modify it.
-
-Some examples of data guarded by `send_default_pii: false`:
-
-- When attaching data of HTTP requests and/or responses to events
- - Request Body: "raw" HTTP bodies (bodies which cannot be parsed as JSON or FormData) are removed
- - HTTP Headers: header values, containing information about the user are replaced with `"[Filtered]"`
-- User-specific information (e.g. the current user ID according to the used web-framework) is not collected and therefore not sent at all.
-- On desktop applications
- - The username logged in the device is not included. This is often a person's name.
- - The machine name is not included, for example `Bruno's laptop`
-- SDKs don't set `{{auto}}` as `user.ip_address`. This instructs the server to keep the connection's IP address.
-- Server SDKs remove the IP address of incoming HTTP requests.
-
-Sentry server is always aware of the connecting IP address and can use it for logging in some platforms. Namely JavaScript and iOS/macOS/tvOS.
-All other platforms require the event to include `user.ip_address={{auto}}` which happens if `sendDefaultPii` is set to true.
-
-Before sending events to Sentry, the SDKs should invokes callbacks. That allows users to remove any sensitive data client-side.
-
-- [`before-send` and `event-processors`](/sdk/foundations/client/#event-pipeline) can be used to register a callback with custom logic to remove sensitive data.
-
-### Cookies
-
-Since `Cookie` and `Set-Cookie` headers can contain a mix of sensitive and non-sensitive data, SDKs should parse the cookie header and filter values on a per-key basis, depending on the SDK setting and the sensitivity of the cookie value.
-In case, the SDK cannot parse each cookie key-value pair, the entire cookie header must be replaced with `"[Filtered]"`. An unfiltered, raw cookie header value must never be sent.
-
-This selective filtering prevents capturing sensitive data while retaining harmless contextual information for debugging.
-For example, a sensitive session cookie's value is replaced with "[Filtered]", but a non-sensitive cookie for the theme preference can be sent as-is.
+## Sensitive Data
-When attached as span attributes, the results should be as follows:
+The normative rules for sensitive data, PII, cookies, request bodies, and user-set data are in [Data Collection](/sdk/foundations/client/data-collection/). The following is kept for context:
-- `http.request.header.cookie.user_session: "[Filtered]"`
-- `http.request.header.cookie.theme: "dark-mode"`
-- `http.request.header.set_cookie.theme: "light-mode"`
-- `http.request.header.cookie: "[Filtered]"` (Used as a fallback if the cookie header cannot be parsed)
+- SDKs should not include PII or other sensitive data in the payload by default. The legacy option [_send-default-pii_](https://docs.sentry.io/platforms/python/configuration/options/#send-default-pii) is **disabled by default**; the replacement is `dataCollection.includeUserInfo` and `dataCollection.collect` (see [Data Collection](/sdk/foundations/client/data-collection/)).
+- Certain sensitive data must never be sent through SDK instrumentation: header/cookie/query values matching the default denylist are replaced with `"[Filtered]"`. User-set data is always attached; only automatically gathered data is scrubbed. Users can use `beforeSend` / event processors to remove or redact any data.
+- For the exact default denylist (partial, case-insensitive match), PII denylist (`x-forwarded-`, `-user`), unparseable cookies, and raw request bodies, see [Data Collection: Default Denylist](/sdk/foundations/client/data-collection/#default-denylist) and [User-Set Data and Scrubbing](/sdk/foundations/client/data-collection/#user-set-data-and-scrubbing).
### Application State