feat(observability): agent run transcript — backend (#143, ADR-0034)#162

Merged: stephane-segning merged 3 commits into main from claude/143-transcript-backend on Jun 22, 2026

@stephane-segning

1. Summary

The backend half of #143 / ADR-0034: persist the agent run transcript (tool calls, reasoning, token usage) so a run is inspectable — why the review said what it did. UI deferred (backend-only, per direction).

  • Runner: the chat Completion now carries token usage; run_native_agent fills a transcript Vec the caller owns and submits best-effort at run end, success or failure (a failed run's reasoning is the most useful). Records assistant turns (content + tool_calls + per-turn tokens) and tool results (truncated to 2 KiB).
  • Control-plane: migration 0014 agent_transcript; replace_transcript (one transaction — a retry re-submit fully replaces) + get_transcript; ingest POST /internal/tasks/{id}/transcript (runner auth); read GET /tasks/{id}/transcript (gated review:read) for the future dashboard timeline.
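The "caller owns the Vec, submit best-effort at run end" shape described above can be sketched roughly as follows. This is a minimal synchronous model, not the PR's code: `TranscriptEntry`, its fields, and `run_with_transcript` are hypothetical names, and the real `run_native_agent` and submit call are presumably async.

```rust
/// Hypothetical transcript entry; the PR's actual type is not shown here.
#[derive(Debug, Clone, PartialEq)]
enum TranscriptEntry {
    /// An assistant turn: content, requested tool calls, per-turn token usage.
    Assistant {
        content: String,
        tool_calls: Vec<String>,
        prompt_tokens: u32,
        completion_tokens: u32,
    },
    /// A tool result (already truncated by the runner).
    ToolResult { tool: String, output: String },
}

/// Runs the agent, then submits whatever transcript was captured.
/// The buffer lives outside `run`, so entries recorded before a failure
/// survive it. A submit error is logged and dropped, so it can never
/// change the task outcome.
fn run_with_transcript<R, S>(run: R, submit: S) -> Result<String, String>
where
    R: FnOnce(&mut Vec<TranscriptEntry>) -> Result<String, String>,
    S: FnOnce(&[TranscriptEntry]) -> Result<(), String>,
{
    let mut transcript = Vec::new();
    let outcome = run(&mut transcript);
    // Submit on success and on failure alike.
    if let Err(e) = submit(&transcript) {
        eprintln!("transcript submit failed (ignored): {e}");
    }
    outcome
}
```

Because the buffer is created before the run and read after it, a run that errors halfway still ships the turns it recorded, which is the case the description calls the most useful.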

Source of truth: #143 / ADR-0034.

Merge order: the migration is numbered 0014 so that it sits after #144 (#161)'s 0013. Merge #161 first so prod's migration sequence has no gap.


2. Intent

When a review is weak (the live concern), you need to see the agent's reasoning + what it searched + tokens spent to know why. This captures that trail and exposes it via an API; it also seeds the eval dataset (pairs with #144 feedback).


3. Scope

In Scope

  • Capture (runner) + store + ingest/read API (control-plane).

Out of Scope

  • The apps/web dashboard timeline (deferred — backend-only pass).

4. Verification

  • Running automated tests

    cargo test -p agent-runner --lib
    DATABASE_URL=postgres://lightbridge:lightbridge@localhost:5432/lightbridge cargo test -p control-plane --locked
    cargo fmt --check && cargo clippy -p control-plane -p agent-runner --all-targets

agent-runner: 30 passed (loop test now also asserts the transcript captured assistant turns)
control-plane: 58 passed (transcript_replace_and_read: order + replace-on-retry)
fmt + clippy: clean

5. Screenshots / Evidence

N/A (backend). The read API returns the ordered transcript; the timeline UI is the deferred follow-up.


6. Risk Assessment

  • Low — additive table + endpoints; transcript submission is best-effort (never fails a task); read API gated review:read.

Note: tool results are truncated to 2 KiB/row to bound table growth.
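For illustration, a byte-bounded truncation of a tool result might look like the sketch below. The constant and function names are hypothetical, and whether the PR cuts on a character boundary or appends a marker is not stated; this version backs off to the nearest UTF-8 boundary so the slice never panics.

```rust
/// 2 KiB cap per stored tool result, as stated in the PR description.
const MAX_TOOL_RESULT_BYTES: usize = 2 * 1024;

/// Returns at most MAX_TOOL_RESULT_BYTES bytes of `s`, cut on a UTF-8
/// character boundary so the result is always valid `str`.
fn truncate_tool_result(s: &str) -> &str {
    if s.len() <= MAX_TOOL_RESULT_BYTES {
        return s;
    }
    let mut end = MAX_TOOL_RESULT_BYTES;
    // Index 0 is always a boundary, so this loop terminates.
    while !s.is_char_boundary(end) {
        end -= 1;
    }
    &s[..end]
}
```

With multi-byte text the stored value can be a few bytes under the cap, since the cut moves back to the previous character boundary.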


7. AI Usage Declaration

AI was used for:

  • Understanding existing code
  • Generating code
  • Generating tests

Human verification:

  • I understand every meaningful change
  • I checked the txn replace + best-effort submit semantics
  • I accept responsibility for this PR

8. Reviewer Focus

🤖 Generated with Claude Code

… ADR-0034)

The runner records its run transcript (assistant reasoning + tool calls +
per-turn token usage, and bounded tool results) and submits it to the
control plane at run end — even on failure, so a failed run's reasoning is
inspectable. The control plane stores it and serves a read API for the
(future) dashboard timeline. UI deferred per scope.

- runner: chat Completion now carries token `usage`; run_native_agent fills
  a transcript Vec the caller owns + submits best-effort
  (client.submit_transcript); tool results truncated to 2 KiB.
- control-plane: migration 0014 agent_transcript; replace_transcript (txn:
  a retry re-submit fully replaces) + get_transcript; ingest endpoint
  POST /internal/tasks/{id}/transcript; read GET /tasks/{id}/transcript
  (gated review:read).

NOTE: migration numbered 0014 to sit after #144's 0013 — merge #161 first
so prod's migration sequence has no gap.

agent-runner 30 + control-plane 58 (pg17, incl. transcript test) green;
fmt + clippy clean.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

@github-actions

github-actions Bot commented Jun 22, 2026

✅ AI Governance check passed

This PR declares AI usage, references a source of truth, and provides verification evidence. Thank you.

@gemini-code-assist gemini-code-assist Bot left a comment

Code Review

This pull request implements ADR-0034 to persist and retrieve agent run transcripts (including tool calls, reasoning, and token usage) for observability. It adds a database migration, implements the database operations to replace and fetch transcripts, and exposes the corresponding internal ingest and public query HTTP endpoints. Feedback was provided to check for task existence in the ingest endpoint to return a proper 404 error instead of a 500 error caused by a foreign key constraint violation.

Comment on lines +194 to +197
    let Some(pool) = state.db.as_ref() else {
        return (StatusCode::SERVICE_UNAVAILABLE, "no database").into_response();
    };
    match crate::db::replace_transcript(pool, id, &submission.entries).await {

medium

In ingest_transcript, the code directly calls replace_transcript without verifying if the task actually exists. If the task does not exist, the database will throw a foreign key constraint violation because agent_transcript.task_id references tasks.id. This results in a 500 Internal Server Error instead of a proper 404 Not Found response.

We should check if the task exists first, similar to how it is done in ingest_chunks and ingest_graph.

    let Some(pool) = state.db.as_ref() else {
        return (StatusCode::SERVICE_UNAVAILABLE, "no database").into_response();
    };
    let exists: bool = match sqlx::query_scalar("SELECT EXISTS(SELECT 1 FROM tasks WHERE id = $1)")
        .bind(id)
        .fetch_one(pool)
        .await
    {
        Ok(val) => val,
        Err(error) => {
            tracing::error!(%error, task_id = %id, "checking task existence failed");
            return (StatusCode::INTERNAL_SERVER_ERROR, "query error").into_response();
        }
    };
    if !exists {
        return (StatusCode::NOT_FOUND, "task not found").into_response();
    }
    match crate::db::replace_transcript(pool, id, &submission.entries).await {

Contributor Author

Fixed — ingest_transcript now resolves the task first (SELECT id) and returns 404 for an unknown id, mirroring ingest_chunks/ingest_graph; no more FK-violation 500.
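The resulting ingest semantics (unknown task is a 404, a retry fully replaces the earlier rows, reads come back in submission order) can be modelled in memory as below. This is only a sketch: `TranscriptStore`, `IngestError`, and `error_status` are hypothetical names, entries are plain strings, and the real code does this against Postgres inside one transaction.

```rust
use std::collections::{HashMap, HashSet};

#[derive(Debug, PartialEq)]
enum IngestError {
    TaskNotFound,
}

/// In-memory stand-in for the tasks and agent_transcript tables.
#[derive(Default)]
struct TranscriptStore {
    tasks: HashSet<u64>,
    rows: HashMap<u64, Vec<String>>,
}

impl TranscriptStore {
    /// Resolves the task first, then swaps the whole transcript in one step,
    /// so a retried submission never leaves a mix of old and new rows.
    fn replace_transcript(&mut self, task_id: u64, entries: &[String]) -> Result<(), IngestError> {
        if !self.tasks.contains(&task_id) {
            return Err(IngestError::TaskNotFound);
        }
        self.rows.insert(task_id, entries.to_vec());
        Ok(())
    }

    /// Returns entries in the order they were submitted (empty if none).
    fn get_transcript(&self, task_id: u64) -> &[String] {
        match self.rows.get(&task_id) {
            Some(entries) => entries.as_slice(),
            None => &[],
        }
    }
}

/// HTTP status for an ingest error: an unknown task maps to 404.
fn error_status(err: &IngestError) -> u16 {
    match err {
        IngestError::TaskNotFound => 404,
    }
}
```

Checking existence up front keeps the "unknown id" case out of the database error path, which is what turned it into a 500 before the fix.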

stephane-segning and others added 2 commits June 22, 2026 22:38
Resolve the task before replace_transcript so an unknown id is a clean 404
instead of a foreign-key 500 on insert — mirrors ingest_chunks/ingest_graph.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…-backend

# Conflicts:
#	services/agent-runner/src/main.rs
#	services/agent-runner/src/review/native/agent.rs
@stephane-segning stephane-segning merged commit 38a803c into main Jun 22, 2026
7 checks passed
@stephane-segning stephane-segning deleted the claude/143-transcript-backend branch June 22, 2026 21:29