Summary
The weekly dependency update bumps pydantic-ai from 1.107.0 to 2.0.0. Before refreshing py/src/braintrust/integrations/pydantic_ai/cassettes/latest, the Braintrust Pydantic AI integration needs a few compatibility fixes for Pydantic AI v2 behavior/API changes.
Background
Pydantic AI v2.0.0 is a major release. The relevant upstream changes for our integration are:
- Agent.run_stream_events() is now an async context manager rather than directly async-iterable.
- Direct sync streaming exposes StreamedResponseSync.response as the final/current response property; our code still calls .get() in one path.
- The default openai: model prefix now uses the OpenAI Responses API instead of Chat Completions.
- v2 response/message objects include useful new fields such as state, provider_name, provider_url, finish_reason, run_id, and conversation_id.
Confirmed failures / observations
1. run_stream_events() wrapper breaks on v2
Command:
cd py
nox -s "test_pydantic_ai_integration(latest)"
Observed failure:
TypeError: 'async for' requires an object with __aiter__ method, got _AsyncGeneratorContextManager
Relevant code:
py/src/braintrust/integrations/pydantic_ai/tracing.py
_agent_run_stream_events_wrapper() currently does async for event in wrapped(*args, **kwargs): ...
Pydantic AI v2 requires:
async with agent.run_stream_events(...) as events:
    async for event in events:
        ...
The fix should preserve compatibility with older supported versions (1.10.0).
2. Direct sync stream final response extraction should support .response
In v2, pydantic_ai.direct.StreamedResponseSync exposes a response property. Our _DirectStreamWrapperSync.__exit__() calls:
final_response = self.stream.get()
This can silently fail and skip stream output logging because exceptions are caught and debug-logged. Add a compatibility helper that prefers .response and falls back to .get() for older versions.
3. Existing latest cassettes are stale for v2 default OpenAI behavior
Current latest cassettes contain OpenAI Chat Completions requests:
POST /v1/chat/completions
With Pydantic AI v2, the same openai:gpt-4o-mini tests now issue:
POST /v1/responses
Playback fails with VCR mismatches like:
No match for the request (<Request (POST) https://api.openai.com/v1/responses>) was found.
Found 1 similar requests ... /v1/chat/completions
Cassette refresh is therefore justified after the SDK compatibility fixes land.
4. Internal deprecation warning noise
Non-VCR latest tests pass, but emit repeated warnings from our own integration path:
DeprecationWarning: wrap_model_class() is deprecated and no longer needed for normal setup.
This comes from _wrap_model_instance() calling the public deprecated wrap_model_class() helper internally. Internal setup should use a no-warning/private helper while preserving the public deprecation warning for user calls.
Command that passed except for warning noise:
cd py
nox -s "test_pydantic_ai_integration(latest)" -- -m "not vcr"
Result: 19 passed, 41 deselected.
5. Consider improving v2 trace shape
Not a hard failure, but v2 adds useful fields that our shaping currently omits:
ModelRequest.state
ModelResponse.state
provider_name
provider_url
finish_reason
run_id
conversation_id
Relevant constants:
_MESSAGE_FIELDS
_RESPONSE_FIELDS
Also consider updating provider inference for v2 model class names such as OpenAIResponsesModel, GoogleModel, GoogleCloud, and XaiModel.
Validation notes
I checked these sessions against the dependency-update branch:
cd py
nox -s "test_pydantic_ai_integration(latest)" -- -m "not vcr"
nox -s "test_pydantic_ai_integration(latest)"
nox -s "test_pydantic_ai_wrap_openai(latest)"
nox -s "test_pydantic_ai_logfire(latest)"
Observed:
test_pydantic_ai_integration(latest) -- -m "not vcr": passes; warning noise only.
test_pydantic_ai_integration(latest): fails due to stale cassettes and the run_stream_events() API change.
test_pydantic_ai_wrap_openai(latest): passes as-is; this path explicitly uses OpenAIChatModel, so it stays on chat-completions.
test_pydantic_ai_logfire(latest): fails due to stale cassette (/v1/responses vs /v1/chat/completions).
Suggested implementation checklist
- Update _agent_run_stream_events_wrapper() to handle both directly async-iterable and async context-manager return values.
- Update test_agent_run_stream_events or add a focused regression test for v2 context-manager behavior while preserving old-version coverage.
- Add a final-response compatibility helper (.response first, .get() fallback) and use it in direct stream wrappers.
- Remove internal calls to wrap_model_class() that emit warnings during normal setup.
- Extend _MESSAGE_FIELDS / _RESPONSE_FIELDS with v2 fields and improve model provider inference.
- Delete py/src/braintrust/integrations/pydantic_ai/cassettes/latest once, then run the shared-dir sessions serially:
  nox -s "test_pydantic_ai_integration(latest)" -- --vcr-record=all
  nox -s "test_pydantic_ai_logfire(latest)" -- --vcr-record=all
  nox -s "test_pydantic_ai_wrap_openai(latest)" -- --vcr-record=all (only if deleting shared dir removes its cassettes)