
JVNAUTOSCI-1205: turn execution coverage, synthesis backfill, and write-path guardrails #106

Merged
witbrock merged 11 commits into main from JVNAUTOSCI-1205-turn-execution-coverage on Feb 19, 2026

@witbrock

Summary

This PR delivers deterministic turn-execution coverage instrumentation and recovery for historical and forward chat turns.

It includes:

  • namespace-level coverage diagnostics for chat-history vs projection quality
  • synthesis-capable historical backfill from legacy llm_debug_data
  • workflow-routing inference during synthesis so projected records are workflow-attributed
  • forward write-path guardrail in add_message_to_history(...) to synthesise and persist missing turn_execution_record payloads
  • stdio/catalogue/manifest exposure and regression tests

Key changes

  • Added turn_execution_namespace_coverage_report MCP read tool.
  • Extended turn_execution_backfill_from_chat_history with synthesise_missing_records (default true).
  • Added shared routing inference (infer_turn_execution_workflow_routing_from_debug) and applied it to backfill + forward write synthesis.
  • Hardened chat_history_service.add_message_to_history(...) to self-heal missing assistant turn execution records and keep projection aligned.
  • Updated docs in docs/engineering/turn_execution_completion_schema_proposal.md.
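
The forward write-path guardrail can be pictured with a minimal sketch like the one below. It is an illustration only, not the code in chat_history_service: the message shape, the `ensure_turn_execution_record` helper, and the `toy_synthesise` routing rule are all assumptions made for the example. The idea is that an assistant message arriving without a turn_execution_record gets one synthesised from its llm_debug_data before it is persisted, while every other message passes through untouched.

```python
from typing import Any, Callable, Optional

Message = dict[str, Any]
Synthesiser = Callable[[dict[str, Any]], Optional[dict[str, Any]]]


def toy_synthesise(debug: dict[str, Any]) -> Optional[dict[str, Any]]:
    """Hypothetical stand-in for record synthesis plus routing inference.

    The rule here (tool calls imply the tool-calling workflow) is invented
    for illustration and is not taken from the PR.
    """
    if not debug:
        return None
    workflow = (
        "tool_calling_workflow" if debug.get("tool_calls") else "chat_assistant_workflow"
    )
    return {"selected_workflow_id": workflow, "synthesised": True}


def ensure_turn_execution_record(message: Message, synthesise: Synthesiser) -> Message:
    """Return the message, adding a synthesised record only when one is missing."""
    if message.get("role") != "assistant":
        return message  # only assistant turns carry execution records
    if message.get("turn_execution_record"):
        return message  # already present: never overwrite
    record = synthesise(message.get("llm_debug_data") or {})
    if record is None:
        return message  # nothing to synthesise from
    # Copy rather than mutate, so the caller's dict is left as it was.
    return {**message, "turn_execution_record": record}
```

Returning a copy keeps the guardrail free of side effects, which makes it straightforward to call just before the write and to test in isolation.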

Validation

  • pyright on changed Python files: 0 errors.
  • Targeted pytest slices passed on repeated runs (pass counts of 32, 29, 26, 21, and 10 across the impacted suites).
  • Live test-db canary for forward writes:
    • assistant message stored with embedded turn_execution_record
    • projection row present
    • inferred selected_workflow_id=#V#tool_calling_workflow

Operational evidence

Primary namespace: #V#michael_witbrock@university_of_auckland_strong_ai_lab

Historical recovery:

  • dry-run synthesis: candidate_records=728
  • execute backfill: inserted_count=728; idempotent rerun: inserted_count=0, upserted_count=728
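
The dry-run, execute, and rerun sequence above follows the usual pattern for an idempotent backfill. The sketch below shows one plausible reading of the counters, assuming records are keyed by request_id, that inserted_count counts new keys, and that upserted_count counts keys that already existed. The in-memory dict store and the `backfill` function are hypothetical and do not reflect the signature of turn_execution_backfill_from_chat_history.

```python
from typing import Any


def backfill(
    store: dict[str, dict[str, Any]],
    candidates: list[dict[str, Any]],
    dry_run: bool = False,
) -> dict[str, int]:
    """Upsert candidates into store by request_id and report what happened.

    With dry_run=True the store is left untouched and the counts describe
    what an execute run would do.
    """
    seen = set(store)  # keys known so far, including ones added in this run
    inserted = upserted = 0
    for record in candidates:
        key = record["request_id"]
        if key in seen:
            upserted += 1  # existing key: overwrite in place
        else:
            inserted += 1  # new key
            seen.add(key)
        if not dry_run:
            store[key] = record
    return {
        "candidate_records": len(candidates),
        "inserted_count": inserted,
        "upserted_count": upserted,
    }
```

Under these assumptions a second execute run over the same candidates inserts nothing and reports every record as upserted, which matches the rerun figures reported above.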

Post-recovery coverage:

  • projected_records_total=728
  • request_id_overlap_rate_pct=100.0
  • projected_records_missing_selected_workflow_id=0
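
To make the coverage figures concrete, here is a small sketch of how metrics with these names could be computed. The formula for request_id_overlap_rate_pct (the share of chat-history request IDs that also appear in the projection) is an assumption, as are the `coverage_report` function and the record shape; the actual turn_execution_namespace_coverage_report tool may define them differently.

```python
from typing import Any, Iterable


def coverage_report(
    history_request_ids: Iterable[str],
    projected_records: list[dict[str, Any]],
) -> dict[str, Any]:
    """Compare chat-history request IDs against projected execution records."""
    history_ids = set(history_request_ids)
    projected_ids = {record["request_id"] for record in projected_records}
    overlap = history_ids & projected_ids
    # Assumed definition: overlap as a percentage of chat-history request IDs.
    rate = round(100.0 * len(overlap) / len(history_ids), 1) if history_ids else 0.0
    missing_workflow = sum(
        1 for record in projected_records if not record.get("selected_workflow_id")
    )
    return {
        "projected_records_total": len(projected_records),
        "request_id_overlap_rate_pct": rate,
        "projected_records_missing_selected_workflow_id": missing_workflow,
    }
```

With this definition, a rate of 100.0 together with zero records missing selected_workflow_id means every chat-history turn has a workflow-attributed projection row.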

Benchmark readiness:

  • scanned_count=200
  • likely_failure_count=146
  • workflow breakdown now available:
    • #V#tool_calling_workflow=172
    • #V#chat_assistant_workflow=28

Jira

  • JVNAUTOSCI-1205
  • Related: JVNAUTOSCI-1204, JVNAUTOSCI-1203

@witbrock force-pushed the JVNAUTOSCI-1205-turn-execution-coverage branch from 3aefada to f4d7c69 on February 19, 2026 19:59
@witbrock merged commit 3176e48 into main on Feb 19, 2026
1 check passed
@witbrock deleted the JVNAUTOSCI-1205-turn-execution-coverage branch on February 19, 2026 20:00