Support pydantic-ai 2.0 in Python SDK integration #536

Summary

The weekly dependency update bumps pydantic-ai from 1.107.0 to 2.0.0. Before refreshing py/src/braintrust/integrations/pydantic_ai/cassettes/latest, the Braintrust Pydantic AI integration needs a few compatibility fixes for Pydantic AI v2 behavior/API changes.

Background

Pydantic AI v2.0.0 is a major release. The relevant upstream changes for our integration are:

  • Agent.run_stream_events() is now an async context manager rather than directly async-iterable.
  • Direct sync streaming exposes StreamedResponseSync.response as the final/current response property; our code still calls .get() in one path.
  • The default openai: model prefix now uses the OpenAI Responses API instead of Chat Completions.
  • v2 response/message objects include useful new fields such as state, provider_name, provider_url, finish_reason, run_id, and conversation_id.

Confirmed failures / observations

1. run_stream_events() wrapper breaks on v2

Command:

cd py
nox -s "test_pydantic_ai_integration(latest)"

Observed failure:

TypeError: 'async for' requires an object with __aiter__ method, got _AsyncGeneratorContextManager

Relevant code:

  • py/src/braintrust/integrations/pydantic_ai/tracing.py
    • _agent_run_stream_events_wrapper() currently does async for event in wrapped(*args, **kwargs): ...

Pydantic AI v2 requires:

async with agent.run_stream_events(...) as events:
    async for event in events:
        ...

The fix should stay compatible with older supported versions (1.10.0), where run_stream_events() still returns a directly async-iterable object.
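One possible shape for the compatibility layer, as a minimal sketch rather than the actual tracing.py code: inspect the value returned by the wrapped call and enter it as an async context manager when it supports that protocol, otherwise iterate it directly. The helper name iter_stream_events and the two stand-in event sources are hypothetical and exist only for illustration; they are not pydantic-ai APIs.

```python
import asyncio
from contextlib import asynccontextmanager


async def iter_stream_events(result):
    """Yield events from `result`, whichever of the two shapes it has.

    Hypothetical helper: `result` stands for whatever the wrapped
    run_stream_events() call returned.
    """
    if hasattr(result, "__aenter__"):
        # v2 shape: an async context manager that yields the event stream.
        async with result as events:
            async for event in events:
                yield event
    elif hasattr(result, "__aiter__"):
        # Older shape: the return value is directly async-iterable.
        async for event in result:
            yield event
    else:
        raise TypeError(f"unsupported stream events result: {type(result)!r}")


# Stand-ins for the two API shapes (assumptions, not real pydantic-ai objects).
async def old_style_events():
    for event in ("start", "delta", "end"):
        yield event


@asynccontextmanager
async def new_style_events():
    yield old_style_events()


async def collect(result):
    # Drain the helper into a list so both shapes can be compared.
    return [event async for event in iter_stream_events(result)]
```

Checking for the context-manager protocol first means an object that happens to support both protocols is still entered and exited properly. Whether the real wrapper should stay an async generator or return its own context manager on v2 is a separate design decision that this sketch does not settle.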

2. Direct sync stream final response extraction should support .response

In v2, pydantic_ai.direct.StreamedResponseSync exposes a response property. Our _DirectStreamWrapperSync.__exit__() calls:

final_response = self.stream.get()

If .get() is unavailable or raises, the failure is silent: the exception is caught and debug-logged, so stream output logging is skipped. Add a compatibility helper that prefers .response and falls back to .get() for older versions.
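A minimal sketch of such a helper, assuming the only difference between versions is whether the stream object exposes a response attribute. The name get_final_response and the two stand-in stream classes are hypothetical; the real StreamedResponseSync is not used here.

```python
_MISSING = object()


def get_final_response(stream):
    """Return the final response from a direct sync stream (hypothetical helper).

    Prefers the `.response` attribute and falls back to `.get()` when the
    attribute is absent.
    """
    response = getattr(stream, "response", _MISSING)
    if response is not _MISSING:
        return response
    return stream.get()


# Stand-ins for the two API shapes (assumptions, not real pydantic-ai classes).
class ResponseOnlyStream:
    @property
    def response(self):
        return "final-from-response"


class GetOnlyStream:
    def get(self):
        return "final-from-get"
```

Using a sentinel rather than None as the getattr default keeps a legitimately empty response from triggering the fallback.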

3. Existing latest cassettes are stale for v2 default OpenAI behavior

Current latest cassettes contain OpenAI Chat Completions requests:

POST /v1/chat/completions

With Pydantic AI v2, the same openai:gpt-4o-mini tests now issue:

POST /v1/responses

Playback fails with VCR mismatches like:

No match for the request (<Request (POST) https://api.openai.com/v1/responses>) was found.
Found 1 similar requests ... /v1/chat/completions

Cassette refresh is therefore justified after the SDK compatibility fixes land.

4. Internal deprecation warning noise

Non-VCR latest tests pass, but emit repeated warnings from our own integration path:

DeprecationWarning: wrap_model_class() is deprecated and no longer needed for normal setup.

This comes from _wrap_model_instance() calling the public deprecated wrap_model_class() helper internally. Internal setup should use a no-warning/private helper while preserving the public deprecation warning for user calls.

Command that passed except for warning noise:

cd py
nox -s "test_pydantic_ai_integration(latest)" -- -m "not vcr"

Result: 19 passed, 41 deselected.

5. Consider improving v2 trace shape

Not a hard failure, but v2 adds useful fields that our shaping currently omits:

  • ModelRequest.state
  • ModelResponse.state
  • provider_name
  • provider_url
  • finish_reason
  • run_id
  • conversation_id

Relevant constants:

  • _MESSAGE_FIELDS
  • _RESPONSE_FIELDS

Also consider updating provider inference for v2 model class names such as OpenAIResponsesModel, GoogleModel, GoogleCloud, and XaiModel.
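To keep the shaping code compatible with older versions, the new fields can be read defensively so that objects without them are simply skipped. The sketch below assumes the fields are plain attributes. The tuple V2_OPTIONAL_FIELDS and the helper extract_optional_fields are hypothetical names, not the existing _MESSAGE_FIELDS / _RESPONSE_FIELDS constants.

```python
from types import SimpleNamespace

# Field names taken from the list above; the constant name is hypothetical.
V2_OPTIONAL_FIELDS = (
    "state",
    "provider_name",
    "provider_url",
    "finish_reason",
    "run_id",
    "conversation_id",
)


def extract_optional_fields(obj, fields=V2_OPTIONAL_FIELDS):
    """Collect whichever optional fields exist on `obj` and are not None."""
    shaped = {}
    for name in fields:
        value = getattr(obj, name, None)
        if value is not None:
            shaped[name] = value
    return shaped


# Stand-ins for a v2-style and an older-style response object (assumptions).
v2_like = SimpleNamespace(provider_name="openai", finish_reason="stop", run_id=None)
old_like = SimpleNamespace(model_name="gpt-4o-mini")
```

Here extract_optional_fields(v2_like) keeps only provider_name and finish_reason, and extract_optional_fields(old_like) returns an empty dict, so traces from older versions keep their current shape.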

Validation notes

I checked these sessions against the dependency-update branch:

cd py
nox -s "test_pydantic_ai_integration(latest)" -- -m "not vcr"
nox -s "test_pydantic_ai_integration(latest)"
nox -s "test_pydantic_ai_wrap_openai(latest)"
nox -s "test_pydantic_ai_logfire(latest)"

Observed:

  • test_pydantic_ai_integration(latest) -- -m "not vcr": passes; warning noise only.
  • test_pydantic_ai_integration(latest): fails due to stale cassettes and the run_stream_events() API change.
  • test_pydantic_ai_wrap_openai(latest): passes as-is; this path explicitly uses OpenAIChatModel, so it stays on chat-completions.
  • test_pydantic_ai_logfire(latest): fails due to stale cassette (/v1/responses vs /v1/chat/completions).

Suggested implementation checklist

  • Update _agent_run_stream_events_wrapper() to handle both directly async-iterable and async context-manager return values.
  • Update test_agent_run_stream_events or add a focused regression test for v2 context-manager behavior while preserving old-version coverage.
  • Add a helper for final stream response extraction (.response first, .get() fallback) and use it in direct stream wrappers.
  • Avoid internal calls to deprecated wrap_model_class() that emit warnings during normal setup.
  • Optionally expand _MESSAGE_FIELDS / _RESPONSE_FIELDS with v2 fields and improve model provider inference.
  • After code fixes, delete and refresh py/src/braintrust/integrations/pydantic_ai/cassettes/latest once, then run the shared-dir sessions serially:
    • nox -s "test_pydantic_ai_integration(latest)" -- --vcr-record=all
    • nox -s "test_pydantic_ai_logfire(latest)" -- --vcr-record=all
    • nox -s "test_pydantic_ai_wrap_openai(latest)" -- --vcr-record=all (only if deleting the shared directory also removes its cassettes)
  • Validate playback for all three sessions.
