feat(observability): solution attribution via native AWS_SDK_UA_APP_ID (#319, alt to #338) #345

Draft
scottschreckengaust wants to merge 8 commits into main from feat/319-sdk-user-agent-appid
@scottschreckengaust scottschreckengaust commented Jun 13, 2026

Draft — feature-complete, in review. The simplified alternative to #338 for issue #319. Both PRs are intentionally open in draft so reviewers can compare the two approaches side by side and pick one. Do not merge yet.

Closes #319 (one of two candidate implementations — see #338 for the other)

The two options, side by side

| | #338 (raw-path) | this PR #345 (native app-id) |
|---|---|---|
| app/uksb-wt64nei4u6#{stack} | hand-built via raw customUserAgent/user_agent_extra because / isn't a legal app-id char | SDK-native AWS_SDK_UA_APP_ID env; the # separator is legal, so zero client code |
| md/uksb-wt64nei4u6#{component} | static, baked once | static, baked once (same) |
| #{TRACE} per-request | before-send handler + JS middleware + module trace state + ~60 withAbcaTrace() wraps | dropped; correlation owned by X-Ray / structured-log request ids (#245) |
| {STACKNAME} sanitize-then-clip-34 | custom routine | none; CFN names are [A-Za-z0-9-], already a subset of the app-id charset |
| Customer opt-out | n/a | -c sdkUaAppId='' (deploy) or AWS_SDK_UA_APP_ID='' (CLI), built-in |

The key unlock (confirmed against installed botocore 1.43.9 + JS v3): using # instead of / as the app/ separator keeps the segment on the SDK's native app-id field, which both SDKs read from AWS_SDK_UA_APP_ID automatically. That removes the only reason #338 needed the raw user-agent path and the entire trace plane.
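
The separator argument can be illustrated with the HTTP token grammar (RFC 7230 `tchar`), which admits `#` but not `/`. This is a minimal sketch, assuming the SDK app-id charset is a subset of that grammar as the PR states; `is_token` is a hypothetical helper for illustration, not an SDK function.

```python
import string

# RFC 7230 "tchar": the characters allowed in an HTTP token.
# '#' is in the set; '/' is not (it separates product from version).
TCHAR = set(string.ascii_letters + string.digits + "!#$%&'*+-.^_`|~")


def is_token(value: str) -> bool:
    """Return True if value is non-empty and every character is a tchar."""
    return bool(value) and all(ch in TCHAR for ch in value)


hash_form = "uksb-wt64nei4u6#my-stack"   # '#' separator, as in this PR
slash_form = "uksb-wt64nei4u6/my-stack"  # '/' separator, the case #338 had to hand-build

print(is_token(hash_form))   # True
print(is_token(slash_form))  # False
```

The stack name `my-stack` is a placeholder; any CloudFormation-style name made of letters, digits, and hyphens passes the same check.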

What's implemented (all complete)

  • Helpers: agent/src/ua.py, cdk/src/handlers/shared/ua.ts, cli/src/ua.ts. Each owns only the static md/#{component} segment + a defensive sanitizeUaValue(). Wire-capture tests on all three assert both segments land on a real outbound User-Agent header (and app/ is omitted when AWS_SDK_UA_APP_ID is unset/empty).
  • Agent: session-level user_agent_extra (scoped + plain sessions) covers every tenant_client/tenant_resource; new platform_client() carries the md/ segment for the 8 direct boto3.client sites. No before-send appender.
  • JS handlers: ...abcaUserAgent() spread into all 60 SDK client sites across 43 files. No middleware.
  • CLI: applyDefaultAppId() at startup + ...abcaUserAgent() on all 18 AWS SDK client sites; auth.test.ts asserts the Cognito client receives the md/ pair.
  • CDK env threading: stack-level SolutionUaAspect sets AWS_SDK_UA_APP_ID on every Lambda (current + future, structurally); AgentCore runtime + ECS container set it explicitly; per-surface ABCA_COMPONENT (api/orchestr/webhook, integrations via a per-construct ComponentUaAspect). Template assertions added.
  • Docs: AGENTS.md "Common mistakes" bullet for future SDK clients.
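
For orientation, the shape of a per-surface helper could look like the sketch below. It is not the code in agent/src/ua.py: the sanitization rule (replace anything outside an assumed safe set with `-`) and the name `md_segment` are assumptions for illustration, since the thread does not show the real sanitizeUaValue() rule.

```python
import re

SOLUTION_ID = "uksb-wt64nei4u6"  # solution id quoted throughout this PR

# Assumption: characters outside this set are replaced with '-'.
# '#' is deliberately excluded because it separates the id from the label.
_UNSAFE = re.compile(r"[^A-Za-z0-9!$%&'*+\-.^_`|~]")


def sanitize_ua_value(value: str) -> str:
    """Replace characters that are unsafe in a User-Agent token with '-'."""
    return _UNSAFE.sub("-", value)


def md_segment(component: str) -> str:
    """Build the static md/ segment for one surface."""
    return f"md/{SOLUTION_ID}#{sanitize_ua_value(component)}"


print(md_segment("orchestr"))     # md/uksb-wt64nei4u6#orchestr
print(md_segment("web hook/v2"))  # md/uksb-wt64nei4u6#web-hook-v2
```

Because the segment depends only on constants, it can be computed once per process, which is what lets cached clients keep their connection pools.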

Verification

  • agent //agent:quality green: 1074 tests, 78.98% coverage, ruff + ty clean.
  • cdk //cdk:test green: 2061 tests (114 suites), compile clean, eslint 0 errors.
  • cli //cli:build green: 365+ tests, eslint clean.
  • docs //docs:build green; no Starlight mutation.
  • Local cdk synth fails only on the pre-existing ec2:DescribeAvailabilityZones cred gap on this workstation (same as main and #338); CI's build job runs the real synth.

🤖 Generated with Claude Code

…319)

Alternative to PR #338: emit the app/ segment via the SDK-native
AWS_SDK_UA_APP_ID env var (botocore + JS v3 read it automatically) using
'#' instead of '/' as the separator, so no raw-user-agent-path machinery
is needed. Each ua module owns only the static md/#{component} segment;
there is no per-request {TRACE} handle, no before-send/middleware, and no
module trace state — request correlation stays with X-Ray / structured
logs (#245), and connection pools are never re-pinned.

CloudFormation stack names are [A-Za-z0-9-], a subset of both the
UA-token and app-id charsets, so {STACKNAME} needs no sanitization; the
only sanitize() left is defensive cover on the component label.

This commit adds the three mirrored helpers + tests (agent/src/ua.py,
cdk/src/handlers/shared/ua.ts, cli/src/ua.ts). Each has a wire-capture
test asserting both app/ and md/ segments land on a real outbound
User-Agent header (and that app/ is omitted when AWS_SDK_UA_APP_ID is
unset/empty — the customer opt-out). 12 agent + 12 cdk + 10 cli tests,
100% coverage on the new modules. Wiring the client sites + CDK env
threading follow in subsequent commits.

Part of #319

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
scottschreckengaust and others added 7 commits June 13, 2026 18:36
…sites (#319)

Session-level user_agent_extra on both the scoped (refreshable) and the
plain ambient session covers every tenant_client/tenant_resource caller.
New platform_client() carries the static md/ segment (merged into any
caller Config) for the 8 direct boto3.client sites that bypass the
session by design — logs x5 (shell, server x2, telemetry x2), secrets
manager x2 (config), bedrock-agentcore x1 (memory) — plus the ambient
STS client used for role chaining.

No per-request trace handle and no before-send appender: the md/ segment
is fully static, so cached clients and the singleton session pool are
never re-pinned. The app/ segment is contributed separately by the SDK
from AWS_SDK_UA_APP_ID (threaded by CDK, next commit).
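
The "merged into any caller Config" behavior can be pictured as an idempotent string merge. This is a sketch under assumptions: `merge_user_agent_extra` is a hypothetical name, and how the real platform_client() treats a caller-supplied user_agent_extra is not shown in this commit message.

```python
from typing import Optional

MD_SEGMENT = "md/uksb-wt64nei4u6#agent"  # static segment for the agent surface


def merge_user_agent_extra(existing: Optional[str], segment: str = MD_SEGMENT) -> str:
    """Append the static segment to a caller's user_agent_extra exactly once."""
    if not existing:
        return segment
    if segment in existing.split():
        return existing  # already present; keep the merge idempotent
    return f"{existing} {segment}"


print(merge_user_agent_extra(None))         # md/uksb-wt64nei4u6#agent
print(merge_user_agent_extra("myapp/1.0"))  # myapp/1.0 md/uksb-wt64nei4u6#agent
```

The result would be passed as `user_agent_extra` on the client config; `myapp/1.0` is an invented caller value.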

4 new aws_session tests assert the md/ segment rides platform_client,
the unscoped tenant_client, a merged caller Config, and the scoped
session. Full agent suite green (1070 tests).

Part of #319

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Spread ...abcaUserAgent() into all 60 SDK v3 client constructors across
43 handler files (DynamoDB/Secrets Manager/Lambda/Bedrock/ECS/
BedrockAgentCore). DocumentClient sites instrument the inner
DynamoDBClient (shared middleware stack). No withAbcaTrace/middleware —
the md/ segment is fully static, so module-level cached clients keep
their connection pools; the app/ segment rides native AWS_SDK_UA_APP_ID
(threaded next commit).

No behavior change beyond the UA header: all 2051 existing CDK tests
pass unmodified (the new spread arg merges into the constructor config
the tests already assert on / mock).

Part of #319

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
)

applyDefaultAppId() at startup defaults AWS_SDK_UA_APP_ID to the
solution id (only when unset — an explicit '' opts out) so the CLI's
own SDK calls carry the app/ segment with no per-site code. Spread
...abcaUserAgent() into all 18 AWS SDK v3 client sites (Cognito x3,
Secrets Manager, CloudFormation, DynamoDB) across auth/admin/github/
slack/linear; the bgagent REST ApiClient is not an AWS SDK client and
is untouched.
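
The "only when unset, explicit '' opts out" rule is sketched below in Python for consistency with the agent tier; the real applyDefaultAppId() lives in the TypeScript CLI. The default value shown is an assumption based on "the solution id" in the paragraph above.

```python
from typing import MutableMapping

DEFAULT_APP_ID = "uksb-wt64nei4u6"  # assumption: the CLI default is the bare solution id


def apply_default_app_id(env: MutableMapping[str, str]) -> None:
    """Set AWS_SDK_UA_APP_ID only when the variable is absent.

    An empty string counts as "set", so an explicit '' survives and
    the SDK emits no app/ segment.
    """
    if "AWS_SDK_UA_APP_ID" not in env:
        env["AWS_SDK_UA_APP_ID"] = DEFAULT_APP_ID


unset_env = {}
apply_default_app_id(unset_env)
print(unset_env)      # {'AWS_SDK_UA_APP_ID': 'uksb-wt64nei4u6'}

opted_out_env = {"AWS_SDK_UA_APP_ID": ""}
apply_default_app_id(opted_out_env)
print(opted_out_env)  # {'AWS_SDK_UA_APP_ID': ''}
```

In a real process the mapping would be `os.environ`; plain dicts are used here so the example has no side effects.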

auth.test.ts asserts the Cognito client constructor receives the md/
customUserAgent pair. Full CLI suite green (365 + new tests).

Part of #319

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…319)

The `app/` segment is now SDK-native: a stack-level SolutionUaAspect sets
AWS_SDK_UA_APP_ID=uksb-wt64nei4u6#{stackName} on every Lambda (current and
future, structurally), and the AgentCore runtime + ECS container set the
same value explicitly (the Lambda-only aspect can't reach them). botocore
and JS v3 both read AWS_SDK_UA_APP_ID natively, so no client code builds
the app/ segment. `-c sdkUaAppId=''` opts the whole stack out; any other
`-c sdkUaAppId=` value overrides.

The `md/#{component}` label is per-surface ABCA_COMPONENT: 'api'
(task-api commonEnv), 'orchestr' (orchestrator/reconcilers/cleanup/fanout),
'webhook' (slack/linear/github-screenshot integrations, via a per-construct
ComponentUaAspect so every function in the integration — including future
ones — is labeled without editing each env block).

buildAppId() centralizes the value: defaults to uksb-wt64nei4u6#{stack},
sanitizes a non-CFN override, clips to the documented 50-char cap, and
returns undefined for the empty-string opt-out. CloudFormation stack names
are [A-Za-z0-9-] (already app-id-safe), so no stack-name sanitization is
needed in the default path.
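
The four behaviors listed for buildAppId() (default, sanitize an override, clip to 50, undefined on opt-out) can be sketched as follows. This is a Python rendering for illustration only: the real function is TypeScript, and the sanitization charset here is an assumption.

```python
import re
from typing import Optional

SOLUTION_ID = "uksb-wt64nei4u6"
APP_ID_MAX_LEN = 50  # the 50-char cap cited in the commit message

# Assumed safe set for an override; '#' is allowed because it is the separator.
_UNSAFE = re.compile(r"[^A-Za-z0-9!$%&'*+\-.^_`|~#]")


def build_app_id(stack_name: str, override: Optional[str] = None) -> Optional[str]:
    """Return the AWS_SDK_UA_APP_ID value, or None for the opt-out."""
    if override == "":
        return None  # explicit empty string: do not set the env var at all
    if override is None:
        value = f"{SOLUTION_ID}#{stack_name}"  # stack names are already safe
    else:
        value = _UNSAFE.sub("-", override)
    return value[:APP_ID_MAX_LEN]


print(build_app_id("integ-1910531"))            # uksb-wt64nei4u6#integ-1910531
print(build_app_id("integ-1910531", "my co/x"))  # my-co-x
print(build_app_id("integ-1910531", ""))         # None
```

Clipping is applied to the default path as well in this sketch, since a long stack name could otherwise push the value past the cap.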

New tests: solution-ua-aspect.test.ts (buildAppId vectors + both aspects);
task-api/orchestrator template assertions for the component label. Full
CDK suite green (2061 tests). Local synth fails only on the pre-existing
ec2:DescribeAvailabilityZones cred gap (CI runs the real synth).

Part of #319

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
New "Common mistakes" bullet directing agent/handler/CLI code to the
per-surface ua helpers and explaining the app/ (SDK-native via
AWS_SDK_UA_APP_ID) vs md/ (explicit per-surface label) split, plus the
customer opt-out. Root-level file — no Starlight sync needed.

Part of #319

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
scottschreckengaust commented Jun 19, 2026

✅ Tested & verified — outbound User-Agent on the wire (both runtime tiers)

Validated that this PR's solution-attribution segments land on the actual outbound User-Agent of live AWS SDK calls, in both runtime tiers, using the PR's own helpers (no reimplementation). Verified 2026-06-19 against botocore 1.43.29 + aws-sdk-js 3.1068.0 (Python 3.13 / Node 22), profile with real AWS creds.

What was checked

Two standalone wire-capture scripts import the real helpers and print the assembled User-Agent of live calls. Each attaches a capture hook that fires after the SDK assembles the UA but before the response (so the header prints even if the call is denied), and exercises a generic client plus DynamoDB, S3, and Secrets Manager:

| Tier | Real helper imported | md/ component |
|---|---|---|
| Lambda handlers (Node 24) | abcaUserAgent() (cdk/src/handlers/shared/ua.ts) | ABCA_COMPONENT env (api/orchestr/webhook) |
| Agent runtime (Python 3.13 / boto3) | client_config() (agent/src/ua.py) | hard-wired agent |

Result — every call, both tiers (contains md/uksb-wt64nei4u6#... ? YES)

# Node (ABCA_COMPONENT=orchestr), Lambda GetAccountSettings / DynamoDB / S3 / SecretsManager:
User-Agent: aws-sdk-js/3.1068.0 ... app/uksb-wt64nei4u6#integ-1910531 md/uksb-wt64nei4u6#orchestr
            (md/ segment also present on the x-amz-user-agent header)

# Python (agent), STS GetCallerIdentity / DynamoDB / S3 / SecretsManager:
User-Agent: Boto3/1.43.9 ... app/uksb-wt64nei4u6#integ-1910531 Botocore/1.43.29 md/uksb-wt64nei4u6#agent
  • The app/uksb-wt64nei4u6#… segment was emitted natively by each SDK purely from the AWS_SDK_UA_APP_ID env var (zero client code) — confirming the PR's core "use # to keep the app-id native" claim.
  • Setting AWS_SDK_UA_APP_ID='' drops the app/ segment while md/ remains — confirms the documented customer opt-out.
  • The generic client (Lambda GetAccountSettings / STS GetCallerIdentity) carries the pair too — proving it rides on any client from spreading the helper, not just hand-picked ones.
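
The YES/NO check above can be made more specific by pulling both attribution values out of a captured header. A minimal sketch, where `attribution_segments` is a hypothetical helper and the sample strings are the abridged headers quoted in this thread (the literal `...` is kept as-is and simply ignored by the parser):

```python
from typing import Dict

SOLUTION_ID = "uksb-wt64nei4u6"


def attribution_segments(user_agent: str, solution_id: str = SOLUTION_ID) -> Dict[str, str]:
    """Return the app/ and md/ labels that carry the solution id."""
    found: Dict[str, str] = {}
    for token in user_agent.split():
        prefix, sep, value = token.partition("/")
        # Only app/ and md/ tokens that start with "<solution_id>#" count;
        # other SDK metadata tokens are skipped.
        if sep and prefix in ("app", "md") and value.startswith(solution_id + "#"):
            found[prefix] = value.split("#", 1)[1]
    return found


with_app = ("Boto3/1.43.9 ... app/uksb-wt64nei4u6#integ-1910531 "
            "Botocore/1.43.29 md/uksb-wt64nei4u6#agent")
opted_out = ("Boto3/1.43.9 ... cfg/retry-mode#legacy "
             "Botocore/1.43.29 md/uksb-wt64nei4u6#agent")

print(attribution_segments(with_app))   # {'app': 'integ-1910531', 'md': 'agent'}
print(attribution_segments(opted_out))  # {'md': 'agent'}
```

Filtering on the solution id matters because an SDK may emit other `md/` metadata tokens of its own; a customer-overridden app-id would not match this filter and would need a looser check.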

Note on CloudTrail (why wire-capture)

CloudTrail's userAgent field is the same wire header, server-side — but DynamoDB data events are rejected account-wide in the test account (UnsupportedOperationException on both put-event-selectors and CloudTrail Lake create-event-data-store; the account is not in an Organization, so it's an account-level CloudTrail restriction, not an SCP). Management events (Bedrock InvokeModel, Cognito, STS, Secrets Manager) and S3 data events are observable via a trail; DynamoDB is not. Wire-capture proves the DynamoDB UA directly and is runtime-agnostic.


Test script — Node / Lambda tier (cdk/ua-wire-check.ts)
/**
 * UA wire-capture check (#319 / PR #345) — standalone, NOT part of the CDK app.
 *
 * Proves the outbound AWS SDK `User-Agent` carries both solution-attribution
 * segments WITHOUT relying on CloudTrail (this account blocks DynamoDB data
 * events, so the wire is the only place to observe them):
 *
 *     app/uksb-wt64nei4u6#{AWS_SDK_UA_APP_ID}   <- SDK reads the env var natively
 *     md/uksb-wt64nei4u6#{ABCA_COMPONENT}       <- from the REAL abcaUserAgent() helper
 *
 * It imports the PR's actual helper (src/handlers/shared/ua.ts) — no mirror —
 * spreads it into real SDK v3 clients exactly as the handlers do, attaches a
 * finalizeRequest middleware that captures the assembled User-Agent header, and
 * makes one cheap read-only call per service. The call may even fail on perms;
 * the UA is captured at request-build time, before the response, so failures
 * still print the header.
 *
 * Run (from the worktree cdk/ dir):
 *   AWS_PROFILE=default AWS_REGION=us-east-1 \
 *   AWS_SDK_UA_APP_ID='uksb-wt64nei4u6#integ-1910531' \
 *   ABCA_COMPONENT=orchestr \
 *   node_modules/.bin/ts-node --transpile-only ua-wire-check.ts
 *
 * Vary ABCA_COMPONENT (api | orchestr | webhook | agent) to see each md/ label.
 * Set AWS_SDK_UA_APP_ID='' to confirm the app/ segment drops (customer opt-out).
 */

import { LambdaClient, GetAccountSettingsCommand } from '@aws-sdk/client-lambda';
import { DynamoDBClient, ListTablesCommand } from '@aws-sdk/client-dynamodb';
import { S3Client, ListBucketsCommand } from '@aws-sdk/client-s3';
import {
  SecretsManagerClient,
  ListSecretsCommand,
} from '@aws-sdk/client-secrets-manager';
import { HttpRequest } from '@smithy/protocol-http';

// The REAL PR helper — this is the thing under test, not a copy.
import { abcaUserAgent, SOLUTION_ID, COMPONENT_ENV } from './src/handlers/shared/ua';

/**
 * Middleware that prints every User-Agent-ish header on the finalized request.
 * finalizeRequest runs AFTER the SDK's user-agent middleware (build step), so
 * the header is fully assembled — app/ from the env var + md/ from the helper.
 */
const captureUa = (label: string) => ({
  applyToStack: (stack: any) => {
    stack.add(
      (next: any) => async (args: any) => {
        const req = args.request;
        if (HttpRequest.isInstance(req)) {
          const ua =
            req.headers['user-agent'] ?? req.headers['User-Agent'] ?? '(none)';
          const xua =
            req.headers['x-amz-user-agent'] ??
            req.headers['X-Amz-User-Agent'] ??
            '(none)';
          // eslint-disable-next-line no-console
          console.log(`\n[${label}]`);
          // eslint-disable-next-line no-console
          console.log(`  User-Agent:        ${ua}`);
          // eslint-disable-next-line no-console
          console.log(`  x-amz-user-agent:  ${xua}`);
          const want = `md/${SOLUTION_ID}`;
          // eslint-disable-next-line no-console
          console.log(
            `  contains ${want}#... ? ${
              String(ua).includes(want) || String(xua).includes(want)
                ? 'YES'
                : 'NO'
            }`,
          );
        }
        return next(args);
      },
      { step: 'finalizeRequest', name: `captureUa-${label}`, priority: 'low' },
    );
  },
});

async function main(): Promise<void> {
  const appId = process.env.AWS_SDK_UA_APP_ID;
  const component = process.env[COMPONENT_ENV];
  // eslint-disable-next-line no-console
  console.log('=== UA wire-capture (#345) ===');
  // eslint-disable-next-line no-console
  console.log(`AWS_SDK_UA_APP_ID = ${appId ?? '(unset → no app/ segment)'}`);
  // eslint-disable-next-line no-console
  console.log(`${COMPONENT_ENV} = ${component ?? '(unset → defaults to api)'}`);
  // eslint-disable-next-line no-console
  console.log(`Expecting md/${SOLUTION_ID}#${component ?? 'api'} on every call.`);

  // Generic / "trivial" client — no specific resource, just a plain SDK v3
  // client built the same way (spread the helper). GetAccountSettings needs no
  // resource and minimal perms, so it's the Node-tier analogue of STS
  // GetCallerIdentity: proves the UA on an arbitrary client, not a special one.
  const lambda = new LambdaClient({ ...abcaUserAgent() });
  lambda.middlewareStack.use(captureUa('Lambda GetAccountSettings (generic)'));

  // Each client built EXACTLY like the handlers: spread the real helper in.
  const ddb = new DynamoDBClient({ ...abcaUserAgent() });
  ddb.middlewareStack.use(captureUa('DynamoDB ListTables'));

  const s3 = new S3Client({ ...abcaUserAgent() });
  s3.middlewareStack.use(captureUa('S3 ListBuckets'));

  const sm = new SecretsManagerClient({ ...abcaUserAgent() });
  sm.middlewareStack.use(captureUa('SecretsManager ListSecrets'));

  // Cheap read-only calls. Wrapped so a perms failure still lets the others run
  // (the UA is already printed by the middleware before any error surfaces).
  const run = async (name: string, fn: () => Promise<unknown>) => {
    try {
      await fn();
    } catch (err) {
      // eslint-disable-next-line no-console
      console.log(`  (${name} call errored after UA capture: ${(err as Error).name})`);
    }
  };

  await run('lambda', () => lambda.send(new GetAccountSettingsCommand({})));
  await run('ddb', () => ddb.send(new ListTablesCommand({ Limit: 1 })));
  await run('s3', () => s3.send(new ListBucketsCommand({})));
  await run('sm', () => sm.send(new ListSecretsCommand({ MaxResults: 1 })));
}

void main();
Test script — Python / agent tier (agent/ua_wire_check.py)
"""UA wire-capture check for the AGENT (Python) tier — #319 / PR #345.

Counterpart to ``cdk/ua-wire-check.ts`` (the Lambda/Node tier). Proves the
agent runtime's outbound boto3 ``User-Agent`` carries both solution-attribution
segments WITHOUT relying on CloudTrail (this account blocks DynamoDB data
events, so the wire is the only place to observe them):

    app/uksb-wt64nei4u6#{AWS_SDK_UA_APP_ID}   <- botocore reads the env var natively
    md/uksb-wt64nei4u6#agent                  <- from the REAL agent helper (ua.py)

It imports the agent's actual helper (``agent/src/ua.py``) — no mirror — builds
clients exactly like ``aws_session.platform_client`` (spreading ``client_config()``),
registers a botocore ``before-send`` handler to capture the fully-assembled
request headers, and makes one cheap read-only call per service. ``before-send``
fires with the signed request in hand; the UA is captured before the response,
so a perms failure still prints the header.

Run (from the worktree, with the agent venv that has boto3):

    cd agent
    AWS_PROFILE=default AWS_REGION=us-east-1 \
    AWS_SDK_UA_APP_ID='uksb-wt64nei4u6#integ-1910531' \
    .venv/bin/python ua_wire_check.py

Set AWS_SDK_UA_APP_ID='' to confirm the app/ segment drops (customer opt-out).
The md/ component is hard-wired to ``agent`` in ua.py (this surface IS the
agent), unlike the Node tier where ABCA_COMPONENT selects api/orchestr/webhook.
"""

from __future__ import annotations

import os
import sys

import boto3
from botocore.exceptions import BotoCoreError, ClientError

# Make ``agent/src`` importable when run from ``agent/``.
sys.path.insert(0, os.path.join(os.path.dirname(__file__), "src"))

# The REAL agent helper — the thing under test, not a copy.
from ua import COMPONENT, SOLUTION_ID, client_config  # noqa: E402


def _make_capture(label: str):
    """Return a botocore ``before-send`` handler that prints the wire UA."""

    def _capture(request, **_kwargs):  # noqa: ANN001
        ua = request.headers.get("User-Agent") or request.headers.get("user-agent") or "(none)"
        if isinstance(ua, (bytes, bytearray)):
            ua = ua.decode("utf-8", "replace")
        want = f"md/{SOLUTION_ID}"
        print(f"\n[{label}]")
        print(f"  User-Agent: {ua}")
        print(f"  contains {want}#... ? {'YES' if want in ua else 'NO'}")
        return None  # don't short-circuit; let the real request proceed

    return _capture


def _client(service: str, label: str):
    """boto3 client built like aws_session.platform_client + a UA capture hook."""
    client = boto3.client(service, config=client_config())
    # 'before-send' fires once the request is fully built & signed.
    client.meta.events.register("before-send", _make_capture(label))
    return client


def main() -> None:
    app_id = os.environ.get("AWS_SDK_UA_APP_ID")
    print("=== UA wire-capture: AGENT tier (#345) ===")
    print(f"AWS_SDK_UA_APP_ID = {app_id if app_id is not None else '(unset -> no app/ segment)'}")
    print(f"Expecting md/{SOLUTION_ID}#{COMPONENT} on every call.")

    calls = [
        ("STS GetCallerIdentity", "sts", lambda c: c.get_caller_identity()),
        ("DynamoDB ListTables", "dynamodb", lambda c: c.list_tables(Limit=1)),
        ("S3 ListBuckets", "s3", lambda c: c.list_buckets()),
        ("SecretsManager ListSecrets", "secretsmanager", lambda c: c.list_secrets(MaxResults=1)),
    ]

    for label, service, op in calls:
        client = _client(service, label)
        try:
            op(client)
        except (ClientError, BotoCoreError) as err:
            # UA already printed by the before-send hook; note the call outcome.
            print(f"  ({service} call errored after UA capture: {type(err).__name__})")


if __name__ == "__main__":
    main()

How to run

# Node / Lambda tier (from cdk/) — let ts-node use the project tsconfig
# (do NOT pass --compiler-options module=commonjs; conflicts with moduleResolution=NodeNext)
AWS_REGION=us-east-1 AWS_SDK_UA_APP_ID='uksb-wt64nei4u6#<stack>' ABCA_COMPONENT=orchestr \
  node_modules/.bin/ts-node --transpile-only ua-wire-check.ts

# Python / agent tier (from agent/, agent venv with boto3)
AWS_REGION=us-east-1 AWS_SDK_UA_APP_ID='uksb-wt64nei4u6#<stack>' \
  .venv/bin/python ua_wire_check.py

@scottschreckengaust

❓ Follow-up: is there an intentional asymmetry in the opt-out? (app/ can be disabled, md/ cannot)

While verifying the wire output I tested AWS_SDK_UA_APP_ID='' and observed that only the app/ segment drops — the md/ segment always remains on both tiers:

# Node (ABCA_COMPONENT=orchestr), AWS_SDK_UA_APP_ID='':
User-Agent: aws-sdk-js/3.1068.0 ... m/E,w,v md/uksb-wt64nei4u6#orchestr
            (no app/ segment; md/ unchanged)

# Python (agent), AWS_SDK_UA_APP_ID='':
User-Agent: Boto3/1.43.9 ... cfg/retry-mode#legacy Botocore/1.43.29 md/uksb-wt64nei4u6#agent
            (no app/ segment; md/ unchanged)

This is consistent with the design — the two segments come from independent sources:

  • app/ ← the SDK reading AWS_SDK_UA_APP_ID natively. buildAppId() returns undefined for an empty override, SolutionUaAspect then no-ops, so the env var is never set → no app/. (CLI: applyDefaultAppId() never overrides an existing value, incl. ''.)
  • md/ ← baked unconditionally into each client at construction by abcaUserAgent() / static_user_agent_extra(), with no env var or context flag gating it.

Question

Is "no opt-out for md/" intentional? A couple of readings:

  • If yes (likely): md/uksb-wt64nei4u6#{component} is the bare anonymous solution marker; a customer opting out of the per-deployment app/uksb-wt64nei4u6#{stackName} correlation tag still emits the anonymous marker. Defensible — opting out of identifying your deployment ≠ opting out of the solution being counted. If that's the intent, it'd be worth a one-line note in the docs/PR body so "customer opt-out" isn't read as "suppress all attribution."
  • If a full suppression switch is in scope: it doesn't exist today. Mechanically it looks small — gate the three helpers (cdk/src/handlers/shared/ua.ts, cli/src/ua.ts, agent/src/ua.py) to return an empty config when a disable flag is set. Call sites already spread the helper (new Client({ ...abcaUserAgent() })), so an empty return is a no-op and no call site changes. The care is keeping all three helpers behaviorally identical (the parity the docstrings require) and adding an opt-out assertion to the existing wire-capture tests.
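
To make the second reading concrete, a gated helper might look like the sketch below. Everything about the switch is hypothetical: `ABCA_UA_DISABLED` is an invented env var name (no such flag exists today), and `ua_client_kwargs` is an illustrative stand-in for the real helpers.

```python
from typing import Dict, Mapping

SOLUTION_ID = "uksb-wt64nei4u6"
DISABLE_FLAG = "ABCA_UA_DISABLED"  # hypothetical env var; not part of this PR


def ua_client_kwargs(env: Mapping[str, str], component: str) -> Dict[str, str]:
    """Return config kwargs carrying the md/ segment, or {} when disabled."""
    if env.get(DISABLE_FLAG) == "1":
        return {}  # spreading an empty mapping is a no-op at every call site
    return {"user_agent_extra": f"md/{SOLUTION_ID}#{component}"}


print(ua_client_kwargs({}, "agent"))
# {'user_agent_extra': 'md/uksb-wt64nei4u6#agent'}
print(ua_client_kwargs({"ABCA_UA_DISABLED": "1"}, "agent"))
# {}
```

On the Python tier the result would be unpacked into the client config (for example `Config(**ua_client_kwargs(os.environ, "agent"))`), mirroring the `{ ...abcaUserAgent() }` spread on the JS side.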

No code change requested here — just confirming whether md/-suppression is in scope or intentionally omitted, so the opt-out wording matches the behavior.

@scottschreckengaust

💡 Follow-up (design): make AWS_SDK_UA_APP_ID at synth the single source of truth for all three attribution surfaces

Extending the opt-out thread above — there's a third attribution surface beyond the two wire segments: the CloudFormation stack description (#292), which today hardcodes the solution id in cdk/src/main.ts:

description: 'ABCA Development Stack (uksb-wt64nei4u6)',   // synth-time string literal

So the solution id uksb-wt64nei4u6 is currently a literal repeated across four places: cdk/src/handlers/shared/ua.ts, cli/src/ua.ts, agent/src/ua.py (the md/ helpers), and main.ts (the description). The app/ value is separately derived in buildAppId().

The three surfaces and where each gets the id today

| Surface | Where the id comes from | Honors an override? |
|---|---|---|
| app/{id}#{stackName} (wire) | buildAppId() → AWS_SDK_UA_APP_ID env set by SolutionUaAspect | ✅ -c sdkUaAppId=... (and '' opts out) |
| md/{id}#{component} (wire) | hardcoded SOLUTION_ID in each helper | ❌ literal |
| ({id}) (stack description) | hardcoded literal in main.ts | ❌ literal |

Proposed ideal behavior

A customer provides the solution id once at synth time and that value is used consistently across all three surfaces. Either input path is acceptable and they can share one resolver — the synth-time AWS_SDK_UA_APP_ID env var or the existing -c sdkUaAppId=... context flag (with the same precedence buildAppId already uses):

  • unset → default uksb-wt64nei4u6 everywhere (today's behavior, unchanged);
  • set to myco-xyz → app/myco-xyz#{stackName}, md/myco-xyz#{component}, and description (myco-xyz);
  • -c sdkUaAppId='' (or AWS_SDK_UA_APP_ID='') → no app/, no md/, and a plain description (full opt-out).

This collapses four hardcoded literals into one synth-time input and makes the opt-out coherent (one switch governs all three, instead of -c sdkUaAppId='' silencing only app/ while md/ and the description still carry uksb-wt64nei4u6).
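
The three cases above could share one resolver along these lines. This is a sketch under one of the two readings discussed in the notes: it assumes the customer input is the solution-id prefix, not the full app-id. `resolve_attribution` is a hypothetical name.

```python
from typing import Dict, Optional

DEFAULT_SOLUTION_ID = "uksb-wt64nei4u6"
BASE_DESCRIPTION = "ABCA Development Stack"


def resolve_attribution(
    solution_id: Optional[str], stack_name: str, component: str
) -> Dict[str, Optional[str]]:
    """Derive all three attribution surfaces from one synth-time input.

    None means unset (use the default), '' means full opt-out, and any
    other value is treated as the solution-id prefix (an assumption).
    """
    if solution_id == "":
        return {"app": None, "md": None, "description": BASE_DESCRIPTION}
    sid = DEFAULT_SOLUTION_ID if solution_id is None else solution_id
    return {
        "app": f"app/{sid}#{stack_name}",
        "md": f"md/{sid}#{component}",
        "description": f"{BASE_DESCRIPTION} ({sid})",
    }


print(resolve_attribution(None, "integ-1910531", "api")["description"])
# ABCA Development Stack (uksb-wt64nei4u6)
print(resolve_attribution("myco-xyz", "integ-1910531", "api")["md"])
# md/myco-xyz#api
print(resolve_attribution("", "integ-1910531", "api")["app"])
# None
```

Sanitizing and clipping are left out here to keep the three-way branch visible; a real resolver would still need both.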

Feasibility / notes (no change requested — just scoping the conversation)

  • Synth-time only. main.ts is plain Node, so process.env.AWS_SDK_UA_APP_ID and tryGetContext are available there; the description is assembled before the template is written. CloudFormation's Description field itself is a static string (no Ref/Fn::Sub), so this must be resolved at synth — which is exactly where the value already lives.
  • Today AWS_SDK_UA_APP_ID carries the full app-id (uksb#stack), not just the solution-id prefix. This proposal would treat the synth-time env/context value as the solution id and let buildAppId() continue composing #{stackName}. Worth deciding explicitly: is the customer input the solution id (prefix) or the entire app-id? They differ for the md/ and description surfaces.
  • Ordering: the description is built on main.ts:42, but the override is read on :50. Gating the description means hoisting that read above stack construction — small, but a real reorder.
  • Parity: the three md/ helpers must consume the override identically (the cross-surface parity the docstrings already require), and the wire-capture tests would gain an override + opt-out assertion.

Does treating a single synth-time AWS_SDK_UA_APP_ID (solution-id) as the source of truth for app/ + md/ + description fit the intended design for #319/#292?
