test: add 57 unit tests for FrameTracer and DivergenceDetector by acailic · Pull Request #209 · acailic/agent_debugger

acailic · 2026-06-06T20:53:44Z

Summary

Closes #208 (test: add unit tests for FrameTracer and DivergenceDetector (zero coverage))
Adds tests/test_frame_tracer_divergence.py with 57 unit tests covering two previously untested modules: frame_tracer.py (603 lines) and divergence_detector.py (730 lines)
Every other research SDK module already had coverage; these two were the only gap

Coverage added

FrameTracer (agent_debugger_sdk/core/frame_tracer.py):

TokenUsage: arithmetic, serialization
ExceptionInfo: to_dict with/without traceback
FrameEvent: field defaults, serialization, exception capture
FrameLifetimeTrace: construction, empty case
build_frame_tree: empty, single root, parent-child, multi-root wrap
get_frame_by_id, get_frames_at_depth, filter_frames_by_name
get_cost_breakdown: grouping, error count, empty trace
FrameCaptureContext: add/enter/exit frame, build_trace, token/duration sums
set_frame_context / get_frame_context: global context roundtrip
capture_function_call: no-context passthrough, frame capture, exception capture, kwarg form

DivergenceDetector (agent_debugger_sdk/core/divergence_detector.py):

DivergenceType / DivergenceSeverity: enum string values
DivergencePoint: to_dict minimal and with timestamp
SessionComparison: defaults, to_dict
detect_divergences: empty inputs, session ID extraction, identical traces, count divergence, summary keys, score bounds
compare_session_structures: key presence, high similarity for identical events
analyze_temporal_divergence: empty inputs, zero divergence, duration difference, key presence
analyze_behavioral_divergence: empty inputs, decision/tool counts, key presence, score bounds

Test plan

pytest -q tests/test_frame_tracer_divergence.py → 57 passed
ruff check . → all checks passed

🤖 Generated with Claude Code

Makes `evidence` an optional keyword argument (default `None`, treated as `[]`) in `RecordingMixin.record_decision`. All existing callers already pass evidence explicitly so this is non-breaking.

Also adds lightweight drift-event collection to `record_decision` and wires `_drift_events`/`_drift_compare_index` onto `TraceContext.restore` so the previously-skipped drift-emission test now passes.

Closes #205

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

…ison fixes

- Add `*` after `chosen_action` in `record_decision` to make `evidence` and remaining params keyword-only, preventing accidental positional use and protecting existing positional callers
- Use clamped `event.confidence` instead of raw `confidence` in drift event_dict to match what is actually persisted
- Add `action` alias alongside `chosen_action` in drift event_dict so baselines using either key are matched
- Advance `_drift_compare_index` to the next decision event in the baseline (skipping non-decision events) to prevent index misalignment

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
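The clamping and aliasing described in the commit message above can be sketched roughly as follows. This is a minimal illustration, not the SDK's code: `clamp_confidence` and `build_drift_event` are hypothetical helper names, and the `[0.0, 1.0]` range is an assumption, since the source only says the persisted confidence is clamped.

```python
from typing import Any


def clamp_confidence(value: float, low: float = 0.0, high: float = 1.0) -> float:
    """Clamp a confidence value into [low, high] (range is an assumption)."""
    return max(low, min(high, value))


def build_drift_event(chosen_action: str, confidence: float) -> dict[str, Any]:
    """Build a decision event dict carrying both action keys (hypothetical helper)."""
    return {
        "event_type": "decision",
        "data": {
            # Both keys hold the same value so a baseline that uses
            # either "chosen_action" or "action" can be matched.
            "chosen_action": chosen_action,
            "action": chosen_action,
            # Use the clamped value so the comparison sees what is persisted.
            "confidence": clamp_confidence(confidence),
        },
    }


drift_event = build_drift_event("retry", 1.4)
```

Comparing the clamped value rather than the raw argument avoids reporting drift that exists only because the raw input was out of range.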

Covers agent_debugger_sdk/core/frame_tracer.py and divergence_detector.py which previously had zero test coverage despite being 600+ line modules.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

Copilot

Pull request overview

Adds missing unit test coverage for two previously untested SDK research modules (FrameTracer + DivergenceDetector), and adjusts drift-tracking plumbing used during checkpoint restore/replay.

Changes:

Add tests/test_frame_tracer_divergence.py with 57 unit tests covering frame_tracer.py and divergence_detector.py.
Update the replay-depth integration test to match the current restore behavior (post-checkpoint filtering) and the `/traces` API response shape.
Extend drift tracking state on restore and add drift detection/collection during record_decision.

Reviewed changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated 3 comments.

File	Description
tests/test_replay_depth_l3.py	Updates drift replay test to use timestamp filtering + `/traces` response shape and asserts drift is collected on context.
tests/test_frame_tracer_divergence.py	New test suite providing coverage for FrameTracer + DivergenceDetector public helpers and serialization.
agent_debugger_sdk/core/recorders.py	Makes `evidence` optional, adds drift detection during decision recording, and collects drift events.
agent_debugger_sdk/core/context/trace_context.py	Initializes `_drift_events` and `_drift_compare_index` on restore to support drift tracking.


acailic · 2026-06-06T22:19:32Z

    async def record_decision(
        self,
        reasoning: str,
        confidence: float,
-        evidence: list[dict[str, Any]],
        chosen_action: str,
+        *,
+        evidence: list[dict[str, Any]] | None = None,


The * placement after chosen_action is intentional — it makes evidence and all subsequent params keyword-only while keeping reasoning, confidence, and chosen_action as positional. This matches the fix applied in PR #207 (commit f380508) and is the goal of this change: callers can now omit evidence without reordering. Callers using the old signature (reasoning, confidence, evidence_list, chosen_action) would break, but that signature was already changed in PR #207; this PR carries the same convention.

acailic · 2026-06-06T22:19:28Z

+            drift_index = getattr(self, "_drift_compare_index", 0)
+            event_dict = {
+                "event_type": "decision",
+                "data": {
+                    "chosen_action": chosen_action,
+                    "action": chosen_action,
+                    "confidence": event.confidence,
+                },
+            }
+            drift = drift_detector.compare(event_dict, drift_index)
+            # Advance to the next decision event in the baseline, skipping non-decision events
+            next_index = drift_index + 1
+            original_events = getattr(drift_detector, "original_events", [])
+            while next_index < len(original_events) and original_events[next_index].get("event_type") != "decision":
+                next_index += 1
+            self._drift_compare_index = next_index


Fixed in acdacd5. Before calling compare(), we now advance drift_index forward past any non-decision events in original_events. This ensures the comparison always targets an actual decision event position, preventing silent missed drift when non-decision events appear before the first (or any subsequent) decision in the baseline.
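The skip-forward step described in this reply can be sketched as a small standalone helper. `next_decision_index` and the sample `baseline` list are hypothetical names introduced for illustration; the actual change in acdacd5 is not shown here and may be structured differently.

```python
from typing import Any


def next_decision_index(events: list[dict[str, Any]], start: int) -> int:
    """Return the first index >= start whose event is a decision.

    Returns len(events) when no decision event remains.
    """
    index = start
    while index < len(events) and events[index].get("event_type") != "decision":
        index += 1
    return index


# Hypothetical baseline: a non-decision event precedes each decision.
baseline = [
    {"event_type": "tool_call"},
    {"event_type": "decision"},
    {"event_type": "llm_response"},
    {"event_type": "decision"},
]

# Before comparing: skip forward from the stored index to a real decision.
compare_at = next_decision_index(baseline, 0)
# After comparing: store the position of the following decision.
following = next_decision_index(baseline, compare_at + 1)
# Past the last decision the helper returns len(baseline).
exhausted = next_decision_index(baseline, following + 1)
```

With this baseline, the first comparison targets index 1 rather than the `tool_call` at index 0, which is the misalignment the reply describes.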

acailic · 2026-06-06T22:19:35Z

+        # Detect drift against the original execution if a detector is active
+        drift_detector = getattr(self, "_drift_detector", None)
+        if drift_detector is not None:
+            drift_index = getattr(self, "_drift_compare_index", 0)
+            event_dict = {


Acknowledged. The SDK behavior changes (evidence keyword-only + drift detection in record_decision) are prerequisites for the tests to exercise correctly — the tests validate this runtime behavior. The PR description has been noted; if desired, the title can be updated to reflect the dual scope (runtime fix + test coverage).

…ift compare

Previously drift_compare_index could point at a non-decision event at the start of the baseline (e.g. after restore), causing compare() to silently miss the first decision's drift. Now advance past non-decision events before comparing, then advance to the next decision for the subsequent call.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

acailic and others added 3 commits June 5, 2026 14:48

test: add 57 unit tests for FrameTracer and DivergenceDetector (#208)

8020e78


Copilot AI review requested due to automatic review settings June 6, 2026 20:53

Copilot started reviewing on behalf of acailic June 6, 2026 20:53

Copilot AI reviewed Jun 6, 2026

acailic wants to merge 4 commits into main from feat/issue-208-frame-tracer-divergence-tests

2 participants
