feat: autonomous PR-watcher (Phase C)#29
Merged
Background asyncio loop in the platform's FastAPI lifespan that walks running code-review-engineer agents × their watched_repos, fetches open PRs from GitHub, dedups against reviewed_prs, and dispatches a review task. This is the last remaining piece of the Code Review Engineer epic.
- watched_repos table (#28: unique(user_id, owner, repo) prevents two agents under the same user from watching the same repo and posting duplicate reviews)
- WatchedRepoModel translates the PG unique-constraint violation into WatchedRepoExists so the router can return 409
- watched_repos router: POST/GET/DELETE /agents/{id}/watched-repos under user-side X-Api-Key auth
- PRWatcher service: 120s cadence, per-(agent,repo) error isolation, 409-from-sidecar treated as "agent busy, defer to next tick", startup-only 30-min staleness gate to avoid backlog-flooding on platform restart
- main.py lifespan starts/cancels the watcher; PR_WATCHER_ENABLED=false opt-out for tests and ad-hoc debug runs
- 17 new tests (125 → 142, all passing)
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
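To make the PRWatcher bullet above concrete, here is a minimal sketch of a single tick, not the code from pr_watcher.py. Every name in it (PRWatcherSketch, AgentBusy, PullRequest, the injected callables) is hypothetical. It also assumes two things the commit message does not state: that the staleness gate compares PR creation time against the 30-minute window, and that PRs skipped at startup are recorded as seen so a later tick does not pick them up.

```python
import logging
from dataclasses import dataclass, field
from datetime import datetime, timedelta
from typing import Awaitable, Callable, Iterable

log = logging.getLogger("pr_watcher_sketch")
STALENESS_GATE = timedelta(minutes=30)  # startup-only window named in the commit message


class AgentBusy(Exception):
    """Hypothetical stand-in for a 409 response from the agent sidecar."""


@dataclass
class PullRequest:
    number: int
    created_at: datetime  # assumption: the gate is measured from creation time


@dataclass
class PRWatcherSketch:
    # Collaborators are injected; their real signatures are not shown in the PR.
    list_pairs: Callable[[], Iterable[tuple[str, str]]]  # (agent_id, "owner/repo")
    fetch_open_prs: Callable[[str], Awaitable[list[PullRequest]]]
    dispatch: Callable[[str, str, int], Awaitable[None]]
    seen: set[tuple[str, str, int]] = field(default_factory=set)  # stands in for reviewed_prs
    first_tick: bool = True

    async def tick(self, now: datetime) -> list[tuple[str, str, int]]:
        dispatched = []
        for agent_id, repo in self.list_pairs():
            try:
                for pr in await self.fetch_open_prs(repo):
                    key = (agent_id, repo, pr.number)
                    if key in self.seen:
                        continue  # dedup: already handled for this agent
                    if self.first_tick and now - pr.created_at > STALENESS_GATE:
                        self.seen.add(key)  # assumption: old backlog is marked, not reviewed
                        continue
                    try:
                        await self.dispatch(agent_id, repo, pr.number)
                    except AgentBusy:
                        break  # agent busy: leave the rest of this repo for the next tick
                    self.seen.add(key)
                    dispatched.append(key)
            except Exception:
                # Per-(agent, repo) isolation: one failing pair must not stop the others.
                log.warning("tick failed for %s on %s", agent_id, repo, exc_info=True)
        self.first_tick = False
        return dispatched
```

Injecting the collaborators keeps the sketch runnable without a database or GitHub credentials, which is also roughly how the dispatch, dedup, and isolation cases listed in the tests could be exercised.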
michaelzwang13
commented
May 24, 2026
Owner
Author
michaelzwang13
left a comment
Code Review Summary
PR #29: autonomous PR-watcher (Phase C)
This PR implements the final piece of the Code Review Engineer epic - a background asyncio loop that polls GitHub for open PRs and dispatches review tasks to running agents.
Review
Strengths:
- Clean architecture with proper separation of concerns
- Good error isolation (per-repo failures do not break the loop)
- Smart deduplication via unique(user_id, owner, repo) constraint prevents double reviews (#28)
- Proper lifespan management with graceful shutdown
- 30-minute startup staleness gate prevents backlog flooding on platform restart
- 409 handling for busy agents is thoughtful
- Good test coverage (17 new tests, 125 to 142)
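For readers who want to see what the lifespan point amounts to, the following is a rough sketch rather than the code in main.py. It uses only the standard library so it runs on its own. make_lifespan, run_forever, and watcher_enabled are hypothetical names, and treating an unset PR_WATCHER_ENABLED as "enabled" is an assumption.

```python
import asyncio
import logging
import os
from contextlib import asynccontextmanager, suppress

log = logging.getLogger("pr_watcher_sketch")
TICK_SECONDS = 120  # cadence stated in the PR


def watcher_enabled() -> bool:
    # PR_WATCHER_ENABLED=false opts out; defaulting to enabled is an assumption.
    return os.environ.get("PR_WATCHER_ENABLED", "true").strip().lower() != "false"


async def run_forever(tick, interval: float = TICK_SECONDS) -> None:
    while True:
        try:
            await tick()
        except Exception:
            # CancelledError is not an Exception subclass, so cancellation still propagates.
            log.warning("watcher tick failed", exc_info=True)
        await asyncio.sleep(interval)


def make_lifespan(tick, interval: float = TICK_SECONDS):
    @asynccontextmanager
    async def lifespan(app):
        # Start one background task on boot unless the opt-out flag is set.
        task = asyncio.create_task(run_forever(tick, interval)) if watcher_enabled() else None
        try:
            yield
        finally:
            if task is not None:
                task.cancel()
                with suppress(asyncio.CancelledError):
                    await task  # wait until the loop has actually stopped
    return lifespan
```

With FastAPI installed, the returned context manager could be passed as `FastAPI(lifespan=make_lifespan(watcher.tick))`, where `watcher` is whatever object owns the tick. Awaiting the cancelled task before returning is what makes the shutdown clean: no tick can still be running once the application has exited its lifespan.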
Minor notes:
- The broad except Exception with string matching for duplicate key errors in watched_repo.py is a bit fragile, but acceptable given Supabase client limitations
- Consider adding a TypedDict for list_running_by_role() return type for better type safety
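Both notes could be addressed along these lines. This is a sketch under stated assumptions, not a patch against watched_repo.py. The fields of RunningAgent are hypothetical because the real row shape is not shown in the PR, and the assumption that the client's exception exposes a `code` attribute carrying the SQLSTATE should be checked against the Supabase client actually in use. What is certain is that PostgreSQL reports a unique-constraint violation as SQLSTATE 23505 with a message beginning "duplicate key value violates unique constraint".

```python
from typing import Callable, Optional, TypedDict

PG_UNIQUE_VIOLATION = "23505"  # PostgreSQL SQLSTATE for unique_violation


class RunningAgent(TypedDict):
    # Hypothetical fields for the rows returned by list_running_by_role().
    id: str
    user_id: str
    role: str


class WatchedRepoExists(Exception):
    """Raised when (user_id, owner, repo) is already watched."""


def is_unique_violation(exc: Exception) -> bool:
    # Prefer a structured error code when the client provides one (assumption),
    # and only fall back to matching the message text.
    code: Optional[str] = getattr(exc, "code", None)
    if code is not None:
        return str(code) == PG_UNIQUE_VIOLATION
    return "duplicate key value violates unique constraint" in str(exc).lower()


def create_watched_repo(insert: Callable[[dict], dict], row: dict) -> dict:
    # `insert` is a hypothetical callable wrapping the actual database call.
    try:
        return insert(row)
    except Exception as exc:
        if is_unique_violation(exc):
            raise WatchedRepoExists(f"{row['owner']}/{row['repo']} is already watched") from exc
        raise  # anything else is not a conflict and should surface unchanged
```

Keeping the check in one small predicate confines the fragile part to a single function, so the broad except no longer decides on its own what counts as a conflict.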
Overall: Well-structured implementation with clean code, proper documentation, and thoughtful edge case handling. LGTM!
This was referenced May 24, 2026
Closes #9. Closes #28.
The last remaining piece of the Code Review Engineer epic (#10). A background asyncio loop in the platform's FastAPI lifespan now walks running code-review-engineer agents × their watched_repos, fetches open PRs from GitHub, dedups against reviewed_prs (Phase D), and dispatches a review task. The agent's LLM then decides to invoke github-pr-review from inside its container: the watcher is the trigger, not the bookkeeper.
What changed
New table: watched_repos (backend/migrations/004_code_review_engineer.sql, mirrored in schema.sql)
- unique(user_id, owner, repo) enforces #28 (Prevent double-review when multiple agents watch the same repo): two agents under the same user cannot watch the same repo. Without it, both would dispatch reviews of the same PR and the user pays 2× tokens.
New model: WatchedRepoModel (backend/app/models/watched_repo.py)
- create / list_by_agent / list_all / get_by_id / delete.
- create translates the PG unique-constraint violation into WatchedRepoExists so the router can return 409.
New router: watched_repos (backend/app/routers/watched_repos.py)
- POST /agents/{id}/watched-repos (201), or 409 on cross-agent conflict
- GET /agents/{id}/watched-repos
- DELETE /agents/{id}/watched-repos/{watched_id}
- X-Api-Key auth, scoped to caller's agents (404 if not owned).
New service: PRWatcher (backend/app/services/pr_watcher.py)
Lifespan wiring (backend/app/main.py)
- lifespan handler starts the watcher task on boot, cancels it cleanly on shutdown.
- PR_WATCHER_ENABLED=false opts out; used by the test suite (conftest.py) so TestClient(app) doesn't spin up a real background loop.
Memory injection rides for free: Dispatcher.dispatch_task already injects agent_memory into role_context (Phase D). Every watcher-triggered review gets current memory by reusing that path.
Why a single global loop (not per-agent)
One asyncio.Task iterating across all agents serially is plenty for hackathon scale. Per-agent watchers would multiply the loop count with no real benefit, and complicate the lifespan teardown.
Why dedup stays at the agent level
reviewed_prs is still unique(agent_id, owner, repo, pr_number). #28 is prevented at the watched_repos insert, so by the time a PR is being reviewed, only one agent ever sees it. If we ever allow multi-agent overlap (we won't soon), we'd revisit reviewed_prs then, not now.
Tests
17 new tests (125 → 142 total), all passing.
- test_watched_repos.py (9): router success/auth/404/409 paths, model conflict translation.
- test_pr_watcher.py (8): dispatch-on-fresh, dedup-skip, 409-defer, per-repo isolation, startup staleness, steady-state ignores age, no-agents short-circuit, clean cancel.
Smoke test (against live backend)
Verified end-to-end via /tmp/phaseC_smoke.py:
- PRWatcher.tick() walks the real DB, hits the missing-GitHub-cred case, per-repo isolation swallows it, and the tick completes cleanly
Migration
Same 004_code_review_engineer.sql continues to absorb the Code Review Engineer epic (Phases B, D, now C). Idempotent.
Deliberately out of scope
Test plan
- pytest passes (142/142)
- pr_watcher: dispatched review …
🤖 Generated with Claude Code