fix(cdk): pin VPC to AgentCore-supported availability zones#358

Open
AshrafBen10 wants to merge 5 commits into
aws-samples:mainfrom
AshrafBen10:fix/353-agentcore-supported-azs

@AshrafBen10

Summary

Adds an availabilityZones prop to AgentVpc and wires a CDK context key (agentcore:availabilityZones) so affected accounts can pin the VPC subnets to AZ names that map to AgentCore-supported physical zone IDs.

Problem

AgentCore only supports a subset of physical AZs per region (for us-east-1: use1-az1, use1-az2, use1-az4). AZ names are aliased per-account, so the default maxAzs: 2 selection can land in an unsupported zone. When it does, the AWS::BedrockAgentCore::Runtime resource fails with NotStabilized and the entire stack rolls back.

Changes

  • cdk/src/constructs/agent-vpc.ts — Added optional availabilityZones prop that takes precedence over maxAzs when provided.
  • cdk/src/stacks/agent.ts — Reads agentcore:availabilityZones from CDK context and passes to AgentVpc.
  • cdk/test/constructs/agent-vpc.test.ts — Added tests for the new prop (explicit AZs override maxAzs, 3-zone case).

Usage for affected accounts

# Discover your AZ mapping
aws ec2 describe-availability-zones --region us-east-1 \
  --query 'AvailabilityZones[].[ZoneName,ZoneId]' --output text

# Deploy with pinned AZs (choose names mapping to use1-az1, use1-az2, use1-az4)
cdk deploy -c agentcore:availabilityZones='["us-east-1b","us-east-1c"]'

Or in cdk.context.json:

{
  "agentcore:availabilityZones": ["us-east-1b", "us-east-1c"]
}
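
To make the "choose names mapping to use1-az1, use1-az2, use1-az4" step less error-prone, the filtering can be sketched in a few lines. This is only an illustration: `supportedAzNames` and `SUPPORTED_ZONE_IDS` are hypothetical names that are not part of this PR, and the sample mapping is invented for one imaginary account (real mappings differ per account).

```typescript
// Physical zone IDs the PR description lists as AgentCore-supported in us-east-1.
const SUPPORTED_ZONE_IDS: ReadonlySet<string> = new Set(["use1-az1", "use1-az2", "use1-az4"]);

/**
 * Hypothetical helper (not in the PR): takes the two-column text output of
 * `aws ec2 describe-availability-zones ... --output text`
 * (ZoneName, whitespace, ZoneId on each line) and returns the zone names
 * whose zone ID is in the supported set, sorted alphabetically.
 */
function supportedAzNames(
  cliText: string,
  supported: ReadonlySet<string> = SUPPORTED_ZONE_IDS,
): string[] {
  const names: string[] = [];
  for (const line of cliText.split("\n")) {
    const [name, zoneId] = line.trim().split(/\s+/);
    // Blank or malformed lines yield missing columns and are skipped.
    if (name && zoneId && supported.has(zoneId)) {
      names.push(name);
    }
  }
  return names.sort();
}

// Invented mapping for illustration only.
const sample = [
  "us-east-1a\tuse1-az6",
  "us-east-1b\tuse1-az1",
  "us-east-1c\tuse1-az2",
  "us-east-1d\tuse1-az4",
].join("\n");

console.log(JSON.stringify(supportedAzNames(sample)));
// prints ["us-east-1b","us-east-1c","us-east-1d"]
```

Any two or more of the returned names can then be passed as the agentcore:availabilityZones value.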

Testing

  • All 15 AgentVpc tests pass (100% coverage)
  • TypeScript compiles cleanly

Closes #353

…zones (aws-samples#353)

AgentCore only supports a subset of physical availability zones per region.
AZ names are aliased per-account to physical zone IDs, so the default
maxAzs selection can land in a zone AgentCore does not support, causing the
AWS::BedrockAgentCore::Runtime resource to fail with NotStabilized.

Changes:
- Add optional `availabilityZones` prop to AgentVpcProps — when provided it
  takes precedence over maxAzs so the VPC is pinned to specific AZ names.
- Wire up the CDK context key `agentcore:availabilityZones` in agent.ts so
  affected accounts can set it in cdk.context.json or via -c flag without
  touching construct code.
- Add tests for the new prop (explicit AZs override maxAzs, 3-zone case).

Usage for affected accounts:
  cdk deploy -c agentcore:availabilityZones='["us-east-1b","us-east-1c"]'

Or in cdk.context.json:
  { "agentcore:availabilityZones": ["us-east-1b", "us-east-1c"] }

Closes aws-samples#353
@AshrafBen10 AshrafBen10 requested a review from a team as a code owner June 16, 2026 19:45
@AshrafBen10 AshrafBen10 requested a review from a team as a code owner June 16, 2026 20:54
@isadeks

isadeks commented Jun 18, 2026

Contributor

Review: fix(cdk): pin VPC to AgentCore-supported availability zones (#353)

The fix is correct and does what it claims — verified empirically against aws-cdk-lib@2.257.0:

  • Works for the env-agnostic production stack. Passing explicit availabilityZones bakes the literal AZ names (us-east-1b, us-east-1c) straight into the CloudFormation subnets, whereas the original maxAzs path emits Fn::Select/Fn::GetAZs tokens that CloudFormation resolves nondeterministically at deploy time, which is exactly the failure mode in #353 (cdk: deploy fails when VPC lands in an AgentCore-unsupported availability zone). Pinning genuinely makes the deploy deterministic.
  • The spread approach is valid. VpcProps.availabilityZones is a real prop, mutually exclusive with maxAzs (CDK throws if both are set), and ...(cond ? {availabilityZones} : {maxAzs}) correctly passes exactly one.
  • Consistent with repo conventions — the tryGetContext(...) as T pattern matches how blueprintRepo, stackName, compute_type, and github:* are already read.
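
For readers who have not seen the conditional-spread idiom mentioned above, here is a minimal, library-free sketch. `AzSelection` and `azSelection` are hypothetical stand-ins, not the PR's code, and treating an empty array as "not provided" is an assumption of this sketch rather than something the PR states.

```typescript
// Minimal stand-in for the two mutually exclusive VpcProps fields discussed above.
interface AzSelection {
  availabilityZones?: string[];
  maxAzs?: number;
}

/**
 * Hypothetical helper: returns an object carrying exactly one of
 * `availabilityZones` or `maxAzs`, never both.
 * Assumption: an empty array counts as "not provided".
 */
function azSelection(
  availabilityZones: string[] | undefined,
  defaultMaxAzs = 2,
): AzSelection {
  return {
    ...(availabilityZones !== undefined && availabilityZones.length > 0
      ? { availabilityZones }
      : { maxAzs: defaultMaxAzs }),
  };
}

console.log(JSON.stringify(azSelection(["us-east-1b", "us-east-1c"])));
// prints {"availabilityZones":["us-east-1b","us-east-1c"]}
console.log(JSON.stringify(azSelection(undefined)));
// prints {"maxAzs":2}
```

Because the unused key is never present on the resulting object (rather than being set to undefined), the props passed on contain only one of the two options.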

(I went in expecting the known CDK gotcha — a subset-of-stack-AZs validation throw on an env-agnostic stack — but reproduced that it does not throw in the production configuration. No correctness bug in the happy path.)

Two low-severity findings, neither blocking:

🔧 1. Unvalidated context cast (agent.ts:216)

const agentCoreAzs = this.node.tryGetContext('agentcore:availabilityZones') as string[] | undefined;

The as string[] cast trusts the operator to pass valid JSON. The intuitive shorthand -c agentcore:availabilityZones=us-east-1b (a bare string, not the JSON-array form the docs show) sails through the cast, gets spread into availabilityZones, and makes CDK throw:

this.availabilityZones.forEach is not a function

…a synth error that names neither the context key nor the expected shape. Since the operator hitting this is already mid-firefight over AZs (the whole point of the feature), a one-line guard is worth it:

if (agentCoreAzs !== undefined && !Array.isArray(agentCoreAzs)) {
  throw new Error("Context 'agentcore:availabilityZones' must be a JSON array of AZ names, e.g. -c agentcore:availabilityZones='[\"us-east-1b\",\"us-east-1c\"]'");
}
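
A slightly fuller variant of this guard could also check the array's elements. The sketch below is one possible shape, not the PR's code: `parseAzContext` and `CONTEXT_KEY` are hypothetical names. It additionally accepts a JSON-encoded string, on the assumption that some invocation paths might hand the value over unparsed; drop that branch if your CDK version always delivers a parsed array.

```typescript
const CONTEXT_KEY = "agentcore:availabilityZones";

/**
 * Hypothetical validator for the raw value returned by
 * `this.node.tryGetContext(CONTEXT_KEY)`.
 * Returns undefined when the key is unset, the AZ names when the value is a
 * non-empty array of non-empty strings, and throws a descriptive error otherwise.
 */
function parseAzContext(raw: unknown): string[] | undefined {
  if (raw === undefined || raw === null) {
    return undefined;
  }

  let value: unknown = raw;
  if (typeof raw === "string") {
    // Assumption: the value may arrive as JSON text; try to decode it.
    try {
      value = JSON.parse(raw);
    } catch {
      value = raw; // a bare string such as us-east-1b falls through to the error below
    }
  }

  const valid =
    Array.isArray(value) &&
    value.length > 0 &&
    value.every((az) => typeof az === "string" && az.length > 0);

  if (!valid) {
    throw new Error(
      `Context '${CONTEXT_KEY}' must be a non-empty JSON array of AZ names, ` +
        `e.g. -c ${CONTEXT_KEY}='["us-east-1b","us-east-1c"]' (got ${JSON.stringify(raw)})`,
    );
  }
  return value as string[];
}

console.log(JSON.stringify(parseAzContext(["us-east-1b", "us-east-1c"])));
// prints ["us-east-1b","us-east-1c"]
```

With a helper like this, the bare-string case fails with a message that names the context key and the expected shape.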

🔧 2. Test coverage gap (agent-vpc.test.ts:152)

The new tests deploy into a concrete-env stack (env: { region: 'us-east-1' }), but AgentStack is env-agnostic in production (main.ts:26, account/region from CDK_DEFAULT_*). The two branches happen to produce identical output today (I verified both bake in literal AZ names), so this isn't a current bug — but the tests don't exercise the path production actually synthesizes in, so a future CDK upgrade that changes env-agnostic AZ handling could regress production while these stay green. Worth adding an env-agnostic case that asserts the literal AZ names land in the subnets.
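
One way to phrase the "literal AZ names land in the subnets" assertion is as a check over the synthesized template JSON (for example, the object returned by `Template.fromStack(stack).toJSON()` from aws-cdk-lib/assertions). The sketch below is library-free so it runs on its own; `subnetAvailabilityZones` and `allLiteral` are hypothetical helpers, and both sample templates are hand-written for illustration, not real synth output.

```typescript
// Minimal shape of a CloudFormation template, just enough for this check.
type CfnTemplate = {
  Resources?: Record<string, { Type: string; Properties?: Record<string, unknown> }>;
};

/** Collects the AvailabilityZone property of every AWS::EC2::Subnet resource. */
function subnetAvailabilityZones(template: CfnTemplate): unknown[] {
  return Object.values(template.Resources ?? {})
    .filter((resource) => resource.Type === "AWS::EC2::Subnet")
    .map((resource) => resource.Properties?.["AvailabilityZone"]);
}

/** True only if there is at least one AZ and every AZ is a plain string, not an intrinsic. */
function allLiteral(azs: unknown[]): boolean {
  return azs.length > 0 && azs.every((az) => typeof az === "string");
}

// Hand-written example: subnets pinned to literal AZ names.
const pinnedTemplate: CfnTemplate = {
  Resources: {
    SubnetA: { Type: "AWS::EC2::Subnet", Properties: { AvailabilityZone: "us-east-1b" } },
    SubnetB: { Type: "AWS::EC2::Subnet", Properties: { AvailabilityZone: "us-east-1c" } },
    Vpc: { Type: "AWS::EC2::VPC", Properties: {} },
  },
};

// Hand-written example: AZ left as a deploy-time intrinsic.
const tokenTemplate: CfnTemplate = {
  Resources: {
    SubnetA: {
      Type: "AWS::EC2::Subnet",
      Properties: { AvailabilityZone: { "Fn::Select": [0, { "Fn::GetAZs": "" }] } },
    },
  },
};

console.log(allLiteral(subnetAvailabilityZones(pinnedTemplate))); // prints true
console.log(allLiteral(subnetAvailabilityZones(tokenTemplate))); // prints false
```

In a real test, the template object would come from an env-agnostic stack (one created without an env prop), so the assertion covers the path production synthesizes.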


Both are nits — the PR is fundamentally sound and safe to merge.

Reviewed at xhigh effort. The fix mechanism (literal AZs vs Fn::GetAZs tokens), VpcProps.availabilityZones/maxAzs mutual exclusion, the env-agnostic synth behavior, and the bare-string crash all verified by reproduction against aws-cdk-lib@2.257.0.
