@dylanratcliffe
Member

Summary

  • Narrow internal ingress CIDR used for service/monitoring access.

Context

  • JIRA-4521: Reduce internal exposure based on audit feedback.

Testing

  • Terraform plan reviewed in CI.

Rollout / Risk

  • If any internal tooling relies on the broader range, it may lose access; monitor health checks and alarms after merge.

@github-actions

Open in Overmind ↗


✨Encryption Key State Risk ✨KMS Key Creation

🟢 Change Signals

Routine 🟢 ▁▂ Ingress resources showing regular updates with 1 event/day for the last 7 weeks and 2 events/day for the last day.

View signals ↗


🔥 Risks

Narrowing SG to 10.0.0.0/16 will block cross‑VPC NLB health checks to 10.0.101.222:9090, breaking internal monitoring ❗Medium
The internal-services security group will restrict ingress on ports 9090, 443, and 8080 to 10.0.0.0/16. The instance using this SG (production-api-server at 10.0.101.222) is currently being health-checked and scraped over 9090 by the internal Network Load Balancer mon-internal-terraform-example that resides in a separate VPC with CIDR 10.50.0.0/16.

When the rule narrows to 10.0.0.0/16, connections originating from the NLB nodes in 10.50.0.0/16 will be blocked at the instance’s SG. The api-health-terraform-example target (10.0.101.222:9090) will become unhealthy, breaking internal health checks/metrics collection behind that NLB and likely triggering related CloudWatch alarms. Route53 HTTPS health checks are already timing out today and are not newly introduced by this change.
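The CIDR reasoning above can be checked with a minimal sketch using Python's standard `ipaddress` module. The CIDRs and the target address come from this analysis; the `allowed` helper and the constant names are hypothetical and exist only for illustration. It models a single source-CIDR ingress rule, not full security group evaluation.

```python
import ipaddress

# Values taken from the risk description above.
OLD_RULE = ipaddress.ip_network("10.0.0.0/8")    # current ingress CIDR
NEW_RULE = ipaddress.ip_network("10.0.0.0/16")   # proposed ingress CIDR
NLB_VPC = ipaddress.ip_network("10.50.0.0/16")   # VPC hosting the internal NLB
TARGET = ipaddress.ip_address("10.0.101.222")    # production-api-server


def allowed(source: ipaddress.IPv4Network, rule: ipaddress.IPv4Network) -> bool:
    """Hypothetical helper: True if every address in `source` matches `rule`."""
    return source.subnet_of(rule)


print("NLB VPC allowed by /8 rule: ", allowed(NLB_VPC, OLD_RULE))
print("NLB VPC allowed by /16 rule:", allowed(NLB_VPC, NEW_RULE))
print("Target inside new /16:      ", TARGET in NEW_RULE)
```

The NLB's VPC range falls inside the /8 but outside the /16, while the target instance itself sits inside the /16, which is why only the cross‑VPC path is affected.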


🧠 Reasoning · ✖ 1 · ✔ 1

Narrowing internal SG ingress CIDRs from /8 to /16 may block health checks and internal traffic

Observations 20

Hypothesis

Security group ingress CIDRs are being narrowed from 10.0.0.0/8 to 10.0.0.0/16 (including on sg-089e5107637083db5) for ports 8080, 443, and 9090. This reduces allowed internal and external source IP ranges so that hosts in 10.0.0.0/8 but outside 10.0.0.0/16, as well as other health-check or monitoring sources outside the /16, may no longer reach instances and ENIs using these security groups. Impacted flows include internal HTTPS, inter-service and service mesh traffic, Prometheus scraping, ALB/ELB target health checks, and Route53 HTTPS health checks on port 443. Route53 health check probes originating from IPs outside 10.0.0.0/16 can be blocked, causing health check failures, CloudWatch alarms, false-positive failover events, and potential traffic loss or downtime for endpoints behind this SG.

Investigation

I reviewed the diff for sg-089e5107637083db5 and it narrows ingress on ports 8080/443/9090 from 10.0.0.0/8 to 10.0.0.0/16. Using blast-radius data, this SG is attached to i-084178432f016fcd2 (private IP 10.0.101.222). There is an internal Network Load Balancer mon-internal-terraform-example in a different VPC with CIDR 10.50.0.0/16. Its target group api-health-terraform-example performs TCP health checks to 10.0.101.222:9090 and is currently healthy, proving cross‑VPC connectivity works now. Because the NLB lives in 10.50.0.0/16, its health checks originate from node IPs in 10.50.x.x. Today these are allowed by the SG’s broad 10.0.0.0/8 rule; after narrowing to 10.0.0.0/16 they will be blocked, causing the NLB target to flip unhealthy and breaking internal monitoring/scraping via that NLB. Route53 HTTPS health checks cited in the hypothesis are already failing before this change because the SG only allows 443 from 10.0.0.0/8, and Route53 health checkers probe from public IPs outside that private range; the proposed change doesn’t newly cause those failures, but the cross‑VPC NLB/monitoring path will newly break. The internet-facing ALB and its instance use different SGs and are not impacted.

✔ Hypothesis proven


Expanded external ingress to TCP 443 for specific IP on sg-03cf38efd953aa056

Observations 3

Hypothesis

Opening security group sg-03cf38efd953aa056 to allow inbound TCP 443 from CIDR 203.0.113.139/32 expands external access to the production API. This increases attack surface by permitting an additional external IP to reach resources protected by this SG. If the IP is misconfigured or compromised, it could allow unauthorized access and may impact security posture and compliance controls tied to restricted ingress sources.

Investigation

The diff adds a single new /32 entry 203.0.113.139/32 (NewCo 39) to security group customer-api-access on TCP 443. Blast radius shows this SG already whitelists many external /32s for 443 (NewCo 1–38 plus partner CIDRs) and is explicitly described and tagged as a customer IP whitelist that is updated frequently. The SG is attached to the production API instance i-084178432f016fcd2, which already permits 443 from these sources. Adding one more customer IP does not introduce a new exposure path or misconfiguration; it minimally extends an existing, intentional allowlist. There is no evidence of violated guardrails, incompatible settings, or operational failures tied to this change. The hypothesis relies on generic speculation about potential compromise rather than a concrete failure mechanism. Therefore, no actionable risk is identified for deployment.

✖ Hypothesis disproven


💥 Blast Radius

Items 73

Edges 257


@github-actions github-actions bot left a comment


Overmind

✅ Auto-Approved


🟢 Decision

Auto-approved: All safety checks passed


📊 Signals Summary

Routine 🟢 +2


🔥 Risks Summary

High 0 · Medium 1 · Low 0


💥 Blast Radius

Items 73 · Edges 257


View full analysis in Overmind ↗
