feat(test-seeding): add gated POST /test/seed for e2e test resource seeding#301

Open
dm36 wants to merge 4 commits into main from dhruv/test-seeding-endpoint

Conversation

dm36 commented Jun 10, 2026


Summary

Adds a test-only seeding endpoint at `POST /test/seed` so the e2e suite (scaleapi/agentex#378) can directly insert resource rows for FGAC assertions, bypassing the natural-flow side effects (ACP forward, worker pickup, etc.).

The endpoint writes via the existing repository layer — same path the natural flow takes — so seeded rows are indistinguishable from real ones to downstream consumers, except for an audit marker (below).

Why

E2E tests for FGAC need to assert authz outcomes (e.g. denied-get → 404) against real resource rows. A 404 on a missing row doesn't prove the same code path as a 404 on a denied row. Black-box seeding via the natural flow requires a healthy running agent for events, which is heavy/flaky for an e2e suite.

For events specifically, scaleapi/agentex#378 ships a workaround that exploits a quirk of `task_service.create_event_and_forward_to_acp` (writes row before forward). That works for events but isn't a general solution — this endpoint is.

Gate (defense in depth)

| Layer | Behavior |
| --- | --- |
| `ENABLE_TEST_SEEDING` env flag (default `False`) | Router not mounted |
| `ENVIRONMENT != production` hard-check in `app.py` | Prod never has the route, regardless of flag |
| `X-Test-Seed-Token` header (`hmac.compare_digest`) | Auth missing/wrong → 404 |

All gate failures return 404, not 401/403 — the route's existence isn't advertised.
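
The layered gate can be sketched as a single pure function. This is a minimal illustration rather than the PR's actual dependency: the function name, the `"staging"` string value, and the 201 success code standing in for "proceed to the route" are assumptions. The environment check is written in the allow-list form that the PR adopts later in this thread.

```python
import hmac
from typing import Optional

# Assumed string values. The PR only confirms "development" and "production".
ALLOWED_ENVS = frozenset({"development", "staging"})


def seed_gate_status(
    environment: Optional[str],
    enable_test_seeding: bool,
    expected_token: Optional[str],
    header_token: Optional[str],
) -> int:
    """Return 404 if any gate layer fails, otherwise 201 (hypothetical helper)."""
    # Layer 1: unknown, unset, or production environments fail closed.
    if environment not in ALLOWED_ENVS:
        return 404
    # Layer 2: the opt-in flag must be set.
    if not enable_test_seeding:
        return 404
    # Layer 3: a token must be configured and must match the header.
    if not expected_token or header_token is None:
        return 404
    # Compare as bytes in constant time so non-ASCII input cannot raise.
    if not hmac.compare_digest(header_token.encode(), expected_token.encode()):
        return 404
    return 201
```

Every failure path returns the same status, so a caller cannot tell a disabled flag from a wrong token or a route that was never mounted.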

Audit

  • Every seeded event has `{"seeded": true, "seeded_at": <iso8601>}` injected into its `content` payload, so downstream consumers can filter on it.
  • Structured `logger.info("test seeding wrote resource", extra={...})` log on every seed call.
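
A minimal sketch of the marker injection, assuming the content is a plain dict and that the marker keys win over any caller-supplied `seeded` / `seeded_at` values. The helper name and the optional `now` parameter are hypothetical and exist only to make the example deterministic.

```python
from datetime import datetime, timezone
from typing import Any, Optional


def with_seed_marker(
    content: Optional[dict[str, Any]],
    now: Optional[datetime] = None,
) -> dict[str, Any]:
    """Return a copy of `content` with the audit marker merged in."""
    stamp = (now or datetime.now(timezone.utc)).isoformat()
    merged = dict(content or {})  # copy, so the caller's dict is not mutated
    # Marker keys are applied last, so they override caller-supplied values.
    merged.update({"seeded": True, "seeded_at": stamp})
    return merged
```

Downstream code can then drop seeded rows with a check such as `row_content.get("seeded") is True`.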

Scope today, extension later

Ships event seeding only. The use case is structured so adding `seed_task` / `seed_api_key` / `seed_schedule` is mechanical — each new type wraps its own branch and (for FGAC-registered types) mirrors the natural flow's `register_resource` call. A `TODO` comment marks that integration point for the next contributor.

Events are intentionally NOT registered as a top-level FGAC resource here — they delegate read authz to the parent agent, matching the natural flow. Verified against `AgentexResourceType` and `routes/events.py`.

Test plan

  • 9 new integration tests in `tests/integration/api/test_test_seeding_api.py`: gate-flag-off, prod-env hard-gate, wrong/missing/unconfigured token, happy path with + without content, audit-marker presence, id override.
  • Full integration API suite (243 tests) green — no regressions.

Files

  • `src/config/environment_variables.py` — `ENABLE_TEST_SEEDING` + `TEST_SEED_TOKEN` config.
  • `src/api/routes/test_seeding.py` — route + discriminated request schema + gate dependency.
  • `src/domain/use_cases/test_seeding_use_case.py` — `TestSeedingUseCase` (event-only today, extensible).
  • `src/api/app.py` — conditional router mount with prod hard-gate.
  • `tests/integration/api/test_test_seeding_api.py` — gate + happy-path coverage.

🤖 Generated with Claude Code

Greptile Summary

Adds a gated POST /test/seed endpoint so the e2e suite can insert resource rows directly — bypassing ACP, worker pickup, and other natural-flow side effects — for FGAC assertion testing. The endpoint is protected by three independent layers: an opt-in env flag, an allow-list environment check that fails closed on prod/unknown environments, and a constant-time shared-secret header check; all failures return 404 to avoid advertising the route.

  • src/api/routes/test_seeding.py: new router with a discriminated request schema (SeedEventRequest) and a _require_test_seeding_enabled dependency that enforces all three gate layers per request.
  • src/domain/use_cases/test_seeding_use_case.py: TestSeedingUseCase.seed_event writes directly via the event repository and injects a {"seeded": true, "seeded_at": ...} audit marker into every persisted row.
  • src/api/app.py: conditionally mounts the router at startup using the same allow-list check; emits a logger.warning if it mounts so the state is visible in logs.

Confidence Score: 5/5

Safe to merge. The three-layer gate (env flag + allow-list environment check + constant-time token) is well-designed and fails closed; the router is never mounted in production by construction.

The gating logic is careful and consistent between mount-time (app.py) and per-request (test_seeding.py) checks. The test suite covers all gate failure paths and the happy path with audit-marker validation. The two findings are minor edge cases that do not affect the core security properties of the implementation.

No files require special attention. The _ALLOWED_ENVS set is defined in both app.py and test_seeding.py — the in-code comment calls this out — so reviewers should ensure both are updated together if a new non-prod environment is ever added.

Important Files Changed

| Filename | Overview |
| --- | --- |
| agentex/src/api/routes/test_seeding.py | New gated POST /test/seed endpoint with discriminated request schema; gate uses allow-list env check + constant-time token comparison; all failure modes return 404. |
| agentex/src/domain/use_cases/test_seeding_use_case.py | New use case that writes event rows directly via the event repository, injects a seeded/seeded_at audit marker, and logs structured metadata. TODO comment is missing a ticket number. |
| agentex/src/api/app.py | Conditionally mounts the test-seeding router using an allow-list environment check at process start; logs a warning if the router is mounted. Straightforward and safe. |
| agentex/src/config/environment_variables.py | Adds ENABLE_TEST_SEEDING (bool, default False) and TEST_SEED_TOKEN (str |
| agentex/tests/integration/api/test_test_seeding_api.py | Nine integration tests covering all gate failure modes plus happy-path and audit-marker assertions; fixture now snapshots and restores dependency_overrides to avoid bleed into other test modules. |

Sequence Diagram

sequenceDiagram
    participant E2E as E2E Test Client
    participant App as FastAPI app.py (startup)
    participant Gate as _require_test_seeding_enabled
    participant Route as POST /test/seed
    participant UC as TestSeedingUseCase
    participant Repo as EventRepository

    App->>App: EnvironmentVariables.refresh()
    alt "ENABLE_TEST_SEEDING AND ENVIRONMENT in {DEV, STAGING}"
        App->>App: include_router(test_seeding.router) + logger.warning
    else
        App->>App: router not mounted
    end

    E2E->>Route: POST /test/seed + X-Test-Seed-Token header
    Route->>Gate: Depends(_require_test_seeding_enabled)
    Gate->>Gate: check ENVIRONMENT in allow-list
    Gate->>Gate: check ENABLE_TEST_SEEDING flag
    Gate->>Gate: check TEST_SEED_TOKEN configured
    Gate->>Gate: hmac.compare_digest(header, expected)
    alt any check fails
        Gate-->>E2E: 404 Not Found
    else all checks pass
        Gate-->>Route: proceed
        Route->>UC: seed_event(task_id, agent_id, content, ...)
        UC->>UC: merge content + inject seeded/seeded_at marker
        UC->>Repo: create(id, task_id, agent_id, content_entity)
        Repo-->>UC: EventEntity
        UC->>UC: logger.info(test seeding wrote resource)
        UC-->>Route: EventEntity
        Route-->>E2E: 201 Created (Event schema)
    end

Prompt To Fix All With AI
Fix the following 2 code review issues. Work through them one at a time, proposing concise fixes.

---

### Issue 1 of 2
agentex/src/domain/use_cases/test_seeding_use_case.py:111-119
**TODO missing a linked ticket number**

The comment block starting at this line is a multi-step future obligation (FGAC `register_resource` wiring for tasks, api keys, and schedules). Per the team's convention, TODO comments should include a ticket number so future contributors can discover, prioritise, and track the work. Without one, this is a bookmark that is easy to overlook or lose in a later search.

### Issue 2 of 2
agentex/src/api/routes/test_seeding.py:135
**`hmac.compare_digest` raises `TypeError` for non-ASCII token strings**

Python's `hmac.compare_digest` accepts `str` arguments only when they are ASCII-only; any non-ASCII character raises `TypeError: comparing strings with non-ASCII characters is not supported` rather than returning `False`. If `TEST_SEED_TOKEN` is set to a value that contains non-ASCII characters (e.g., a raw binary secret pasted from a password manager), the call throws an unhandled exception that propagates as a 500, revealing that the route exists, which is the opposite of the intended "fail silently with 404" behaviour. Encoding both sides to bytes before comparison eliminates this class of error entirely.

```suggestion
    if not hmac.compare_digest(x_test_seed_token.encode(), expected.encode()):
```
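
The difference can be reproduced in isolation. The sketch below uses hypothetical helper names and assumes UTF-8 encoding for both sides; in CPython the exception raised for non-ASCII `str` arguments is a `TypeError`. The same concern plausibly applies to the header value sent by a client, not only to the configured secret.

```python
import hmac


def tokens_match(provided: str, expected: str) -> bool:
    """Constant-time comparison that tolerates non-ASCII input."""
    return hmac.compare_digest(provided.encode("utf-8"), expected.encode("utf-8"))


def str_comparison_raises(provided: str, expected: str) -> bool:
    """Report whether comparing the raw strings raises instead of returning."""
    try:
        hmac.compare_digest(provided, expected)
    except TypeError:
        return True
    return False
```

With the bytes form, a mismatched non-ASCII token simply compares unequal, so the gate can keep returning 404.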

Reviews (4): Last reviewed commit: "fix(test-seeding): allow-list non-prod e..."

Context used:

  • Rule used - Include ticket numbers in TODO comments to make cl... (source)

Learned From
scaleapi/scaleapi#126926

  • Rule used - Create Linear tasks for TODO comments in code and ... (source)

Learned From
scaleapi/scaleapi#127117

…eeding

E2E tests for FGAC behavior (scaleapi/agentex#380 and follow-ups) need
to assert authz outcomes against real resource rows — denied-get on a
real event proves the 404 is the authz collapse, not "row doesn't exist."
Today there's no public POST /events; events are persisted by the worker
on ACP stream-back, which makes black-box seeding awkward.

This adds a test-only seeding endpoint at POST /test/seed that writes
resource rows directly via the existing repository layer, bypassing the
natural-flow side effects (ACP forward, etc.).

Defense in depth on the gate:
- `ENABLE_TEST_SEEDING` env flag (default False).
- `ENVIRONMENT != production` hard-check in app.py at router-mount time.
  Prod never has the route at all, regardless of flag.
- `X-Test-Seed-Token` shared-secret header, `hmac.compare_digest`.
- All gate failures return 404 (not 401/403) so the route's existence
  isn't advertised.

Resource types are discriminated; events ship first. The use case is
structured so adding `seed_task` / `seed_api_key` / etc. is mechanical
(includes a TODO for the FGAC register_resource pattern those will need).

Audit:
- Every seeded event has `{"seeded": true, "seeded_at": <iso8601>}`
  injected into its `content` payload — downstream filterable.
- Structured info log per seed call with principal + resource id.

Events are intentionally NOT registered as a top-level FGAC resource
here — they delegate read authz to the parent agent, matching the
natural flow. Verified against AgentexResourceType and routes/events.py.

Tests: 9 new integration tests cover gate-flag-off, prod-env-hard-gate,
wrong/missing/unconfigured token, happy path with + without content,
audit-marker presence, and id override. Full integration API suite
(243 tests) green — no regressions.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
@dm36 dm36 requested a review from a team as a code owner June 10, 2026 19:07
Comment thread agentex/tests/integration/api/test_test_seeding_api.py
Snapshot the overrides dict at fixture setup, restore in finally. Covers
both the fixture's own injections (TestSeedingUseCase, get_seeding_env_vars)
AND any mid-test mutations from `_override_env`. Without this, entries
leak into fastapi_app's global state and can bleed into other test modules
that share the same app instance.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
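
The snapshot-and-restore pattern described in this commit can be sketched with a context manager over any mutable mapping, such as FastAPI's `app.dependency_overrides` dict. The helper name is hypothetical, and the actual fixture in the PR may be structured differently.

```python
from contextlib import contextmanager
from typing import Any, Dict, Iterator


@contextmanager
def preserved_overrides(overrides: Dict[Any, Any]) -> Iterator[None]:
    """Restore `overrides` to its entry state on exit, even if the body raises."""
    snapshot = dict(overrides)  # shallow copy of the mapping at setup time
    try:
        yield
    finally:
        # Drop anything added mid-test, then put the original entries back.
        overrides.clear()
        overrides.update(snapshot)
```

Mutating the dict in place, rather than rebinding it, matters here because other code holds a reference to the same dict object on the shared app instance.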
Comment thread agentex/src/api/app.py Outdated
if (
    _test_seeding_env_vars is not None
    and _test_seeding_env_vars.ENABLE_TEST_SEEDING
    and _test_seeding_env_vars.ENVIRONMENT != Environment.PROD


from Claude, but seems legit.

The prod "hard gate" is a deny-list on an unvalidated string: ENVIRONMENT is typed str | None and populated raw from os.environ (no enum validation), so unset/None, "prod", "Production", or any typo passes != Environment.PROD. The layer described as the strongest guarantee only holds if prod sets exactly "production". Invert to an allow-list so unknown environments fail closed: if env_vars.ENVIRONMENT not in (Environment.DEV, Environment.STAGING): raise not_found - same change at the mount site in app.py.
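
The failure mode is easy to demonstrate with plain strings. In this sketch the constant and function names are hypothetical, and `"staging"` is an assumed value for `Environment.STAGING`; only `"production"` and `"development"` are confirmed in this thread.

```python
from typing import Optional

PROD = "production"
ALLOWED = ("development", "staging")  # "staging" is an assumed value


def deny_list_allows(environment: Optional[str]) -> bool:
    """Original gate: anything that is not exactly "production" passes."""
    return environment != PROD


def allow_list_allows(environment: Optional[str]) -> bool:
    """Inverted gate: only explicitly known non-prod environments pass."""
    return environment in ALLOWED


# Values a misconfigured deployment might plausibly carry.
SUSPECT_VALUES = [None, "", "prod", "Production", "dev", "qa"]

fails_open = [v for v in SUSPECT_VALUES if deny_list_allows(v)]
fails_closed = [v for v in SUSPECT_VALUES if not allow_list_allows(v)]
```

Every suspect value passes the deny-list and is rejected by the allow-list, which is the fail-closed behaviour the comment asks for.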

Comment on lines +34 to +36
class TestSeedingUseCase: # noqa: PT001 — not a pytest class; "Test" prefix is the use-case domain name
    __test__ = False # tell pytest not to collect this as a test class
    """Test-only resource seeding.


In Python, only a string literal that is the first statement in the class body is treated as a docstring. Because __test__ = False appears first, this multi-line string is just a discarded expression — TestSeedingUseCase.__doc__ will be None. Swapping the two lines fixes it without any functional change.

Suggested change
- class TestSeedingUseCase: # noqa: PT001 — not a pytest class; "Test" prefix is the use-case domain name
-     __test__ = False # tell pytest not to collect this as a test class
-     """Test-only resource seeding.
+ class TestSeedingUseCase: # noqa: PT001 — not a pytest class; "Test" prefix is the use-case domain name
+     """Test-only resource seeding.
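
The docstring rule can be checked with two throwaway classes. The class names below are illustrative only and are not part of the PR.

```python
class MarkerFirst:
    __test__ = False
    """This string is a discarded expression, not a docstring."""


class DocstringFirst:
    """This string becomes the class docstring."""

    __test__ = False


# Only the class whose body starts with the string literal gets a __doc__.
marker_first_doc = MarkerFirst.__doc__
docstring_first_doc = DocstringFirst.__doc__
```

Both classes still expose `__test__ = False` as a class attribute, so moving the docstring first does not change how pytest treats the class.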

dm36 and others added 2 commits June 10, 2026 15:13
…s populated

Python only treats a string literal as __doc__ when it is the FIRST
statement in the class body. The previous order put __test__ first, which
silently turned the docstring into a discarded expression — __doc__ was
None. Swap the two: docstring first, __test__ second.

Verified TestSeedingUseCase.__doc__ is now populated and pytest still
skips collection (__test__ = False as a class attribute is honored
regardless of where it sits in the body).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
…g PROD

Address review (harvhan): ENVIRONMENT is typed `str | None` on
EnvironmentVariables and populated raw from os.environ with no enum
coercion. A deny-list against `Environment.PROD` ("production") therefore
fails OPEN on any value the gate doesn't recognize — unset, "prod",
"Production", typos, or any new environment name.

Switch both gate sites (router mount in app.py + per-request check in
the seeding route) to an allow-list against Environment.DEV ("development")
and Environment.STAGING. Anything else fails closed.

Tests: extend the gate suite with a parametrized
`test_unknown_environment_returns_404` covering None, "", "prod",
"Production", "dev", "qa". 15 seeding tests now pass (was 9). Existing
`test_prod_env_returns_404...` left in place since prod IS one of the
rejected cases; the new test covers the broader allow-list contract.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Comment on lines +166 to +175
if body.resource_type == "event":
    payload = body.payload
    event_entity = await use_case.seed_event(
        task_id=str(payload.task_id),
        agent_id=str(payload.agent_id),
        content=payload.content,
        id_override=str(payload.id) if payload.id is not None else None,
        principal_id=principal_id,
    )
    return Event.model_validate(event_entity)


Not sure I get the overall idea here.

I don't think seeding just events is enough, because event permissions depend on the agent, right? If we only want to seed events, why not mock out the DB instead of creating a whole endpoint to create DB entries?


Also what is the eventual intention of these tests we are writing? Do we intend them to run on each PR/each deployment?
