Skip to content

feat(docs): add markdown-link-check to docs CI (#259)#300

Open
ClintEastman02 wants to merge 24 commits into
aws-samples:mainfrom
ClintEastman02:feat/259-markdown-link-check
Open

feat(docs): add markdown-link-check to docs CI (#259)#300
ClintEastman02 wants to merge 24 commits into
aws-samples:mainfrom
ClintEastman02:feat/259-markdown-link-check

Conversation

@ClintEastman02

@ClintEastman02 ClintEastman02 commented Jun 9, 2026

Copy link
Copy Markdown
Contributor

Summary

  • Adds markdown-link-check as a devDependency in the docs package
  • New link-check CI job runs on pull requests touching documentation, validating internal cross-references and external URLs across source Markdown (docs/guides/, docs/design/, docs/decisions/, root .md files)
  • Allowlist config (.markdown-link-check.json) handles known-flaky external hosts (npmjs bot-blocking, rate-limited APIs) and intentional patterns (Starlight site paths, placeholder text)
  • New mise //docs:link-check task for local runs
  • Documented in the Developer Guide
  • Bonus: fixed a pre-existing broken anchor in docs/design/WORKFLOWS.md (caught by the new checker)

Closes #259.

Test plan

  • Full link-check passes locally (46 files, 0 broken links)
  • Docs sync + build pass cleanly
  • CI green on this PR

Adds a broken-link checker that validates internal cross-references and
external URLs across all Markdown sources at PR time, preventing silent
link rot in documentation that also serves as agent context.
@ClintEastman02 ClintEastman02 requested a review from a team as a code owner June 9, 2026 20:44
@codecov-commenter

codecov-commenter commented Jun 9, 2026

Copy link
Copy Markdown

⚠️ Please install the 'codecov app svg image' to ensure uploads and comments are reliably processed by Codecov.

Codecov Report

✅ All modified and coverable lines are covered by tests.
⚠️ Please upload report for BASE (main@0e2806a). Learn more about missing BASE report.
❗ Your organization needs to install the Codecov GitHub app to enable full functionality.

Additional details and impacted files
@@           Coverage Diff           @@
##             main     #300   +/-   ##
=======================================
  Coverage        ?   88.16%           
=======================================
  Files           ?      201           
  Lines           ?    48868           
  Branches        ?     4444           
=======================================
  Hits            ?    43085           
  Misses          ?     5783           
  Partials        ?        0           

☔ View full report in Codecov by Harness.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

Comment thread .github/workflows/docs.yml Outdated
Comment thread docs/mise.toml

@krokoko krokoko left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for picking up CA-12 — this is exactly the kind of cheap fitness function the docs need, and the WORKFLOWS.md anchor fix already proves the gate has real targets. A few findings beyond the existing comments (the suggestions to run the CI step through mise and fold it into drift-prevention would also resolve the command duplication and the concurrency concern, so I've left those out):

1. The pipeline fails open — the gate can pass while checking nothing (docs/mise.toml:31, docs/package.json:12)
find … | xargs markdown-link-check runs without pipefail, so the pipeline's exit code is xargs's alone. If find errors (e.g. a docs directory is renamed) the job still goes green — reproducible with sh -c "find /nonexistent | xargs echo"; echo $?0. Suggested hardening:

  • run under bash -o pipefail (or set -euo pipefail in the script),
  • use find … -print0 | xargs -0 -r (NUL-safe, and -r skips the run on empty input instead of invoking the checker with no args),
  • assert a minimum file count before checking (corpus is ~46 today) so a shrunk scan fails loudly rather than silently passing.

2. The aws-cdk bootstrap ignore pattern hides a genuinely dead link (docs/.markdown-link-check.json:9)
The ignored URL is used in docs/decisions/ADR-002-least-privilege-bootstrap-policies.md:71 and currently returns 404 (the bootstrap code moved within the aws-cdk repo). This is the exact rot class the gate exists to catch, suppressed on day one. Suggest repointing ADR-002 to a SHA-pinned permalink (or the published CDK docs) and deleting the ignore entry — allowlist entries are best reserved for categorically-unverifiable URLs.

3. aliveStatusCodes treats 301/302/307/308 as alive (docs/.markdown-link-check.json:23)
A permanently-moved link whose destination 404s will still pass. Suggest dropping the 3xx codes and letting the checker follow redirects to validate the terminal status.

4. Blanket npmjs.com ignore (docs/.markdown-link-check.json:6)
A typo'd package URL passes silently, and retryOn429 already covers the rate-limiting that motivates this. Worth narrowing or removing.

5. -maxdepth 1 will silently drop future subdirectories (docs/mise.toml:31, docs/package.json:12)
guides/, design/, decisions/ are flat today so coverage is currently complete, but the first nested folder vanishes from the check with no signal. Either drop -maxdepth 1 for the named dirs (keep it only on ..), or rely on the min-file-count assertion from point 1. Related: docs/README.md is never scanned (no . start point).

6. The failure path was never demonstrated
The test plan validates only the green path. Since the gate's entire value is in failing, it would be great to link a throwaway commit with a deliberately broken link showing the job go red — that also empirically confirms point 1's exit-code handling.

Nit: the ^link$ ignore is legitimate (it masks the intentional placeholder in ADR-004's example prose) but opaque — a one-line note in the config or Developer Guide would save a future maintainer the archaeology.

One consideration for the drift-prevention suggestion: external-URL checking is non-hermetic (network flakiness, transient 5xx), so folding it into the main build puts that flakiness on every build's critical path. Splitting internal-link checking (hermetic, belongs in build/drift-prevention) from external-URL checking (scheduled or PR-only) might be the sweet spot.

Happy to discuss any of these — points 1 and 2 are the ones I'd consider blocking.

@krokoko krokoko added the approved When an issue has been approved and ready label Jun 10, 2026
@ClintEastman02

ClintEastman02 commented Jun 10, 2026

Copy link
Copy Markdown
Contributor Author

Thanks for the thorough review @krokoko — addressed all points:

Inline comments (drift-prevention integration):

  • Removed the dedicated link-check job from docs.yml — no more separate CI step
  • Added { task = "//docs:link-check" } to root drift-prevention task so it runs as part of every mise run build
  • docs/package.json script now delegates to mise run link-check

1. [BLOCKING] Pipeline fails open:

  • set -euo pipefail (bash strict mode)
  • Temp-file approach (bash variables can't hold NUL bytes, so -print0 in a subshell doesn't work — used mktemp with trap cleanup instead)
  • Min-file-count assertion (< 10 → error + exit 1)
  • xargs -r to skip invocation on empty input

2. [BLOCKING] aws-cdk dead link:

  • Replaced dead GitHub link in ADR-002 with live AWS docs URL (https://docs.aws.amazon.com/cdk/v2/guide/bootstrapping-env.html)
  • Removed the aws-cdk ignore pattern from config

3. aliveStatusCodes:

  • Dropped 3xx codes — now [200, 206] only, checker follows redirects to validate terminal status

4. Blanket npmjs.com ignore:

  • Narrowed from www.npmjs.com to www.npmjs.com/package/ — npmjs returns 403 (not 429) to automated requests so retryOn429 doesn't help, but typo'd non-package URLs will now get caught

5. -maxdepth 1:

  • Already resolved — named dirs (guides, design, decisions) have no depth limit; -maxdepth 1 only applies to root .. (correct behavior)

6. Failure path demonstration:

  • Will push a throwaway commit with a broken link to show the gate going red

7. Nit (^link$ opaque):

  • Added _comment field explaining it masks the intentional placeholder in ADR-004 example prose

Architecture (internal vs external split):

  • Acknowledged — since external URL checking is non-hermetic, this may need splitting in a follow-up if flakiness becomes an issue. For now retryCount: 3 + fallbackRetryDelay: 10s provides some resilience.

ClintEastman02 and others added 5 commits June 10, 2026 13:40
…prevention

Remove dedicated link-check CI job from docs.yml and wire it into the
existing drift-prevention task so it runs as part of every build. Harden
the script with pipefail, temp-file approach, min-file-count assertion,
and xargs -r. Fix dead aws-cdk link in ADR-002, narrow npmjs ignore to
package paths only, drop 3xx from aliveStatusCodes, and document the
^link$ ignore pattern.
@krokoko

krokoko commented Jun 10, 2026

Copy link
Copy Markdown
Contributor

Thanks @ClintEastman02 — nice work on this, and thanks for the thorough follow-up on the earlier review. I verified the three fail-open fixes empirically (the set -e brace-group abort, the min-file-count tripwire, and xargs -r) and they all genuinely hold. CI is green with the gate running against the real corpus. Summary of a full review pass below.

Verdict: approve with one small required fix.

Required before merge

  • docs/guides/DEVELOPER_GUIDE.md:377 (+ Starlight mirror) — the new sentence says the check runs in CI "on pull requests that touch documentation," but after moving the gate into drift-prevention, it runs via mise run build in build.yml, which triggers on every PR (no paths: filter). The prose predates the rewiring. Suggest: "The same check runs automatically in CI on every pull request, as part of the build's drift-prevention step." — and regenerate the mirror with mise //docs:sync.

Non-blocking suggestions

  • docs/.markdown-link-check.json:11 — the ^/{{BASEURL}}/ replacement references a variable that is never defined. It's dead code today (the ^/sample-autonomous-cloud-coding-agents/ ignore short-circuits first), but the first future root-absolute link would be rewritten to a literal {{BASEURL}}/... and fail confusingly. Cleanest to drop the replacementPatterns block.
  • docs/mise.toml — the newline-delimited find | xargs handoff word-splits on spaces; a future My Guide.md would crash the run with an opaque ENOENT (loud, not fail-open, but confusing). find … -print0 + xargs -r -0 would future-proof it.
  • Scope note — the scan matches issue feat(docs): add markdown-link-check/linkinator to docs CI (CA-12) #259's acceptance criteria exactly, but docs/README.md, agent/README.md, cli/README.md, contracts/**, and docs/abca-plugin/ are outside it. Worth either a follow-up issue or a one-line comment in the task declaring the scope intentional.
  • mise.toml drift-prevention description — "drift in contracts between packages" no longer quite covers a docs link checker; consider broadening to "…and documentation links."
  • Known limitation#fragment anchors aren't validated by markdown-link-check (~83 in scope); a renamed heading rots silently. remark-validate-links or markdownlint MD051 would be a good follow-up.
  • Red-path demo (point 6 from the earlier review) — the throwaway broken-link commit was never pushed. I verified the failure semantics empirically instead (dead link anywhere in the batch → exit 1; malformed config → exit 1; redirects followed, so aliveStatusCodes [200, 206] doesn't false-fail 3xx), so I'd consider this superseded — just noting it for the record.
  • The external-URL non-hermeticity concern stands now that the check is on every build's critical path; retryCount: 3 softens it, and the internal/external split already discussed remains the right escape hatch if flakiness shows up.

Verified accurate along the way: both _comment fields (npmjs does return 403, and ^link$ matches exactly the ADR-004 placeholder), the WORKFLOWS.md anchor fix, mirror sync, and the caret-pin convention. Fix the one sentence and this is good to go.

The check runs on every PR via the build's drift-prevention step, not
only on PRs that touch documentation.
@ClintEastman02

Copy link
Copy Markdown
Contributor Author

Fixed — updated the developer guide sentence to: "The same check runs automatically in CI on every pull request, as part of the build's drift-prevention step." Starlight mirror regenerated.

- Remove dead replacementPatterns block ({{BASEURL}} was never defined)
- Use find -print0 + xargs -0 for space-safe path handling
- Add scope comment to link-check task documenting what is/isn't scanned
- Broaden drift-prevention description to cover documentation links
@ClintEastman02

Copy link
Copy Markdown
Contributor Author

Also addressed the non-blocking suggestions:

  • Removed the dead replacementPatterns block ({{BASEURL}} was never defined and the /sample-autonomous-cloud-coding-agents/ ignore short-circuits first anyway)
  • Switched to find -print0 + xargs -0 for space-safe path handling
  • Added scope comment to the link-check task documenting what is and isn't scanned
  • Broadened drift-prevention description to include documentation links

Comment thread docs/package.json Outdated
ayushtr-aws
ayushtr-aws previously approved these changes Jun 16, 2026
@ayushtr-aws ayushtr-aws requested a review from a team as a code owner June 16, 2026 20:36

@isadeks isadeks left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Solid, useful addition — the script is careful (set -euo pipefail, mktemp+trap
cleanup, a ≥10-file sanity floor), the anchor fix is real (I confirmed ###
Domain & \requires_reposlugifies todomain--requires_repo; the old #domain--requiresrepogenuinely 404s), and I verified empirically that the tool exits nonzero on broken links in multi-file mode and that failure propagates throughxargsunderpipefail. No correctness bugs in the script itself.

The findings are about integration design and over-claimed coverage, ranked by
impact:

Worth addressing before merge:

  1. Every-PR blocking (mise.toml:104) — wiring this into drift-prevention (a
    hard dep of mise run build, which runs on all PRs with no paths: filter) means
    live external-URL checks gate every PR, not "docs PRs" as the body says. This
    is the one most likely to cause unrelated red checks from transient
    network/bot-block flakiness. Consider a paths:-filtered docs job (matching the
    body's intent) rather than the universal build path, or restricting CI to
    internal links.
  2. aliveStatusCodes:[200,206] (config:19) — this overrides the default alive
    set; I confirmed a 301 endpoint is marked dead. Legitimate
    redirect/permanent-move links will fail the build.

Coverage is narrower than the PR claims (the "validates internal
cross-references" line):
3. Cross-file #anchors are not validated — only the file is (verified).
Same-file anchors are.
4. The ^/sample-autonomous-cloud-coding-agents/ ignore disables checking for
all site-absolute internal links.
5. Hard-coded find guides design decisions silently omits docs/abca-plugin/
and any future subtree.

Nit:
6. ^link$ ignore is broader than the one ADR-004 placeholder it's meant for.

None are crashes; the highest-value fixes are #1 (scope the trigger so it
doesn't gate every PR) and #3/#4 (either tighten the coverage claim in the
docs or acknowledge the anchor/internal-link gap). Want me to post these as
inline PR comments, or draft a single summary comment in the blockers/nits
format?

scottschreckengaust added a commit that referenced this pull request Jun 17, 2026
The doc lives in draft PR #364 and exists on neither this branch nor
main, so the three links (disable-bundling.ts, AGENTS.md, review_pr.md)
are dead. Drop them and keep the stable #366 reference, removing the
merge-order dependency on #364 and the link-checker (#300) risk.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
ClintEastman02 and others added 2 commits June 17, 2026 11:12
External http(s) URLs are now ignored so network/bot-block flakiness
cannot fail an unrelated PR, since the gate runs on every build via
drift-prevention. Removes aliveStatusCodes and the now-dead external
retry/header options, and updates the Developer Guide to match.
@ayushtr-aws ayushtr-aws enabled auto-merge June 17, 2026 18:13
isadeks pushed a commit to isadeks/sample-autonomous-cloud-coding-agents that referenced this pull request Jun 18, 2026
… synth) (aws-samples#366) (aws-samples#371)

* perf(test): disable Lambda bundling in CDK unit-test synths (aws-samples#366)

Template.fromStack() triggers a full CDK synth that bundles every
NodejsFunction via esbuild — ~28s for the 413-resource AgentStack. Unit
tests assert on CloudFormation structure, not bundled Lambda code, so the
bundling is pure overhead. A jest setupFiles hook sets
aws:cdk:bundling-stacks: [] via CDK_CONTEXT_JSON, which a bare new App()
picks up with no call-site changes.

Measured: single AgentStack synth 28.7s -> 1.9s (~15x); github-tags.test.ts
~135s -> 9s; full //cdk:test 155s -> 25s (8-core local). All 2177 tests
pass; coverage thresholds unchanged (lines 92.36 >= 91).

Durable regression guards added: AGENTS.md "Common mistakes" entry and a
.abca/commands/review_pr.md checklist item — do not re-enable bundling in
tests unless asserting on a bundled asset, and minimize full-stack synths
(synth once in beforeAll). Both link docs/design/CI_BUILD_PERFORMANCE.md
(PR aws-samples#364).

Refs aws-samples#366. Part of aws-samples#363. Likely supersedes the sharding approach (aws-samples#365) —
to be re-evaluated once this lands.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

* docs(test): correct bundling opt-out to postCliContext (aws-samples#366)

The documented per-test opt-out used constructor `context`, which CDK
silently overwrites with CDK_CONTEXT_JSON in App.loadContext — so
`new App({ context: { 'aws:cdk:bundling-stacks': ['**'] } })` does NOT
re-enable bundling. Only `postCliContext` (applied last) overrides the
global disable.

Corrects the opt-out instruction and precedence comment in all three
documented locations (disable-bundling.ts, AGENTS.md, review_pr.md).
Verified empirically against aws-cdk-lib@2.257.0. Comment/doc-only;
runtime logic unchanged.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

* docs(test): clarify bundling opt-out is for asset output, not synth (aws-samples#366)

Disabling bundling does not stop tests from synthesizing —
`Template.fromStack()` still runs a full synth; it only skips the
esbuild asset-bundling step. The `postCliContext` opt-out exists solely
for the rare test that needs real bundled-asset output (asset hash /
S3 key), where an unbundled synth would silently yield a placeholder.
Makes this explicit in disable-bundling.ts and AGENTS.md. Doc-only.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

* docs(test): drop dead CI_BUILD_PERFORMANCE.md links (aws-samples#366)

The doc lives in draft PR aws-samples#364 and exists on neither this branch nor
main, so the three links (disable-bundling.ts, AGENTS.md, review_pr.md)
are dead. Drop them and keep the stable aws-samples#366 reference, removing the
merge-order dependency on aws-samples#364 and the link-checker (aws-samples#300) risk.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

* fix(test): harden bundling-disable against ambient CDK_CONTEXT_JSON (aws-samples#366)

Two robustness fixes to the global bundling-disable setup file, both from
PR aws-samples#371 review (finding 1 hardening nits):

- Spread BUNDLING_DISABLED_CONTEXT LAST when merging a pre-set
  CDK_CONTEXT_JSON, so an ambient `aws:cdk:bundling-stacks` value can no
  longer silently win and re-enable bundling. This is safe precisely because
  the documented per-test opt-out is `postCliContext` (applied after the env
  var regardless), so the escape hatch still works.
- Guard `JSON.parse` of the pre-set var with try/catch. A malformed value
  previously threw inside setupFiles and failed every test in every file with
  an opaque parse error; now the blame is localized to the offending value.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

* test(cdk): add executable guard that Lambda bundling is disabled (aws-samples#366)

PR aws-samples#371 review finding 1: the ~15× synth speedup rested entirely on the
`disable-bundling.ts` setupFiles entry, but the only regression guards were
honor-system prose (AGENTS.md) and a review_pr.md checklist item. Dropping the
setupFiles wiring, reordering Jest setup, or a CDK rename of the context key
would silently revert the suite to full-bundling synths with no failing check.

Adds two assertions: the synth context carries `aws:cdk:bundling-stacks: []`,
and a bare-App stack reports `bundlingRequired === false`. Verified empirically
against aws-cdk-lib@2.259.0 that without the disable a bare App yields
`bundlingRequired === true`, so these discriminate a real regression.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

* test(cdk): cover the postCliContext bundling opt-out precedence (aws-samples#366)

PR aws-samples#371 review finding 2: the documented per-test opt-out (`postCliContext`)
and the `loadContext` precedence this PR corrected had zero test coverage, so a
future CDK version reordering context application could silently break the one
supported escape hatch.

Locks the contract with two assertions verified against aws-cdk-lib@2.259.0:
`postCliContext: { 'aws:cdk:bundling-stacks': ['**'] }` re-enables bundling
(`bundlingRequired === true`), while constructor `context` does NOT (the env
var set by the global disable clobbers it). Uses a bare `Stack` rather than a
heavy AgentStack since the precedence is a property of CDK context loading, not
of any particular stack.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

* docs(test): reframe bundling as overhead-plus-smoke-test, not "pure overhead" (aws-samples#366)

PR aws-samples#371 review finding 3: "pure overhead" undersold the tradeoff. esbuild
bundling also validates that every Lambda `entry` resolves and the handler
compiles, so disabling it in unit tests means a broken entry path or handler TS
error stops surfacing at `Template.fromStack()` and only fails at real
synth/deploy or in the `agent/` build. The tradeoff is still acceptable — real
synth/deploy bundles, and handler code has its own compile/test under `agent/`
— but the wording now names what unit tests cede. Updated in both places the
phrase lived: the disable-bundling.ts header and AGENTS.md.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

---------

Co-authored-by: bgagent <345885+scottschreckengaust@users.noreply.github.com>
Co-authored-by: Claude Fable 5 <noreply@anthropic.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

approved When an issue has been approved and ready

Projects

None yet

Development

Successfully merging this pull request may close these issues.

feat(docs): add markdown-link-check/linkinator to docs CI (CA-12)

5 participants