OpenAI agents: rewrite to util-genai #90

Draft
lmolkova wants to merge 4 commits into open-telemetry:main from lmolkova:openai-agents-rewrite-to-utils

@lmolkova lmolkova commented May 23, 2026

Fix #86

This PR is stacked on #85 and #89

OpenAI agents instrumentation is based on OpenAI agent tracing. It does not use the genai utils and is not compliant with semconv.

This PR rewrites it to util-genai and changes which telemetry is emitted by the lib:

  1. chat, embeddings, and future speech and transcription spans belong in the openai instrumentation, not in agents. They are removed from the agents instrumentation.
  2. handoff and guardrails are not specced out yet, so they are removed for now.

This PR also drops previous configuration options unique to this instrumentation (setting system name, unique ways to enable content and metrics)

A comparison of instrumentation options is below. TracingProcessor still seems like the best option, with some minor gaps (the tool call id is not reported, although it could be obtained through some hacks).

| | (1) Pure monkey-patch (public API only) | (2) Hooks only | (3) TracingProcessor only |
|---|---|---|---|
| Public-API surfaces used | `Runner.run` / `run_sync` / `run_streamed`; `FunctionTool.on_invoke_tool` per instance (or `@function_tool` decorator); `Handoff.on_invoke_handoff` per instance | `Runner.run(..., hooks=...)` (still need a `Runner.run` wrap to inject); `_ChainedRunHooks` to coexist with user hooks | `agents.tracing.add_trace_processor()` + react to `Trace`, `AgentSpanData`, `HandoffSpanData`, `FunctionSpanData` |
| Workflow span | ✅ wrap `Runner.run*` | ⚠️ no hook for it; still need `Runner.run` wrap | `on_trace_start` / `on_trace_end` |
| Agent span | no clean boundary: agent lifetime is loop state in `AgentRunner.run`. Closest: attach `AgentHooks` per `Agent` (mutating user objects) or wrap `Agent.__init__`; both ugly | `on_agent_start` / `on_agent_end` | `AgentSpanData` start/end |
| Tool span | ⚠️ wrap `FunctionTool` instances at construction; works for `@function_tool` users; harder for direct construction; has full `ToolContext` (`tool_call_id` ✓) | `on_tool_start` / `on_tool_end` with full `ToolContext` | ⚠️ `FunctionSpanData` carries name / input / output but no `tool_call_id` → semconv non-compliant |
| Handoff | ⚠️ wrap `Handoff.on_invoke_handoff` per instance | `on_handoff(from, to)` | `HandoffSpanData` has both `from_agent` and `to_agent` |
| LLM spans | n/a: openai instrumentation owns these | same | same |
| Context propagation | ✅ all wraps execute in the call-site task; OTel attach/detach balanced; OTel parent tree nests naturally | broken: hooks fire via `asyncio.gather` (child tasks). util-genai attaches in one child, detaches in a different child → `Failed to detach context`. All our spans end up as siblings of workflow (not nested under agent/tool) | ✅ callbacks fire synchronously in the task that called `Span.start()` (run loop task or per-tool task) → context-safe, natural nesting |
| Composes with user code | minimal interference; users keep their own hooks | must chain with user-supplied `hooks=` | ✅ no interference with user hooks or runner subclassing |
| Version stability | best: `Runner` / `FunctionTool` / `Handoff` are documented public surfaces | best: `RunHooks` is stable documented API | weakest: tied to `SpanData` subclass layout in `agents.tracing` (the library has restructured these between versions) |
| Pros | Most version-stable; touches only documented API | Single integration surface; full `ToolContext` available; users can compose | Context-safe; single integration; rich semantic signal (trace, agent, handoff) |
| Cons | No way to express "agent boundary" without ugliness; per-instance wrapping of `FunctionTool` requires intercepting construction across multiple creation paths (`@function_tool`, direct, etc.) | Context-isolation breaks span nesting and floods logs with `Failed to detach context` | `tool_call_id` missing on `FunctionSpanData` → semconv-blocking; less stable extension surface |
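
The context-propagation failure in the hooks-only column can be reproduced without the agents library or OpenTelemetry. The sketch below is a minimal illustration using only `contextvars` and `asyncio`. The `current_span` variable and the `attach` / `detach` helpers are hypothetical stand-ins for a context attach/detach pair, not util-genai code.

```python
import asyncio
import contextvars

# Hypothetical stand-in for the "current span" slot a tracing library keeps.
current_span = contextvars.ContextVar("current_span", default=None)


async def attach(name):
    # gather() runs this coroutine in its own task, which gets a *copy*
    # of the caller's context, so the set() is invisible to the caller.
    return current_span.set(name)


async def detach(token):
    # A token can only be reset in the context that created it.
    try:
        current_span.reset(token)
        return "detached"
    except ValueError as exc:
        return f"failed: {exc}"


async def main():
    (token,) = await asyncio.gather(attach("tool-span"))
    (outcome,) = await asyncio.gather(detach(token))
    return outcome, current_span.get()


outcome, visible_in_parent = asyncio.run(main())
print(outcome)
print(visible_in_parent)
```

The second task cannot reset a token created in the first task, so the detach fails. The set also happened in a copied context, so nothing started later in the parent task sees the span as current, which is consistent with the sibling-span symptom described in the table.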

OpenAI agents-only view after this change

image

Copilot AI review requested due to automatic review settings May 23, 2026 19:50
@lmolkova lmolkova requested a review from a team as a code owner May 23, 2026 19:50
@lmolkova lmolkova marked this pull request as draft May 23, 2026 19:51

Copilot AI left a comment

Pull request overview

This PR rewrites the opentelemetry-instrumentation-genai-openai-agents package to emit GenAI telemetry via opentelemetry-util-genai (instead of custom span enrichment), aiming to resolve semconv conformance issues (Fix #86) and align responsibility boundaries (LLM spans owned by the OpenAI SDK instrumentation, orchestration spans owned here).

Changes:

  • Replace the previous agents tracing-to-OTel span processor with a TracingProcessor bridge that creates util-genai workflow/agent/tool invocations.
  • Add conformance support for openai-agents (scenario + cassette + tox env) and add per-scenario “expected violation” suppression support in the shared conformance runner.
  • Update policies to elevate unknown gen_ai.operation.name values and “status=OK set by instrumentation” to violations.
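
As a rough illustration of the bridge idea in the first bullet, the sketch below maps span start/end callbacks to orchestration operations and ignores everything else. It is a simplified model under stated assumptions: `AgentSpanData`, `FunctionSpanData`, and `Span` here are local stand-ins rather than the real `agents.tracing` classes, `BridgeProcessor` and `OPERATIONS` are hypothetical names, the tool name `get_weather` is made up, and tuples are recorded where the real processor would start and stop util-genai invocations.

```python
from dataclasses import dataclass, field


# Hypothetical stand-ins for the agents library's span payloads; the real
# classes live in agents.tracing and carry more fields.
@dataclass
class AgentSpanData:
    name: str


@dataclass
class FunctionSpanData:
    name: str


@dataclass
class Span:
    span_id: str
    span_data: object


# Assumed mapping from payload type to gen_ai.operation.name.
OPERATIONS = {AgentSpanData: "invoke_agent", FunctionSpanData: "execute_tool"}


@dataclass
class BridgeProcessor:
    """Tracks one open invocation per agents span id."""

    open_invocations: dict = field(default_factory=dict)
    finished: list = field(default_factory=list)

    def on_span_start(self, span):
        operation = OPERATIONS.get(type(span.span_data))
        if operation is None:
            return  # unmapped payloads (e.g. LLM spans) are skipped
        self.open_invocations[span.span_id] = (operation, span.span_data.name)

    def on_span_end(self, span):
        # pop() tolerates spans that were never opened by this processor.
        invocation = self.open_invocations.pop(span.span_id, None)
        if invocation is not None:
            self.finished.append(invocation)


processor = BridgeProcessor()
agent = Span("s1", AgentSpanData("Travel Concierge"))
tool = Span("s2", FunctionSpanData("get_weather"))
for span in (agent, tool):
    processor.on_span_start(span)
for span in (tool, agent):
    processor.on_span_end(span)
print(processor.finished)
```

Keying open invocations by span id lets the end callback find its matching start without relying on call order, and unknown payload types fall through without side effects.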

Reviewed changes

Copilot reviewed 33 out of 34 changed files in this pull request and generated 5 comments.

| File | Description |
|---|---|
| `util/opentelemetry-test-util-genai/src/opentelemetry/test_util_genai/conformance.py` | Add ExpectedViolation and enforce expected-vs-unexpected weaver violations in conformance runs. |
| `tox.ini` | Add a dedicated openai-agents conformance tox env and split normal vs conformance test runs. |
| `pyproject.toml` | Include openai-agents instrumentation in pyright scope; exclude its tests/examples from type checking. |
| `policies/genai_span_validation.rego` | Add stricter violations for unknown gen_ai.operation.name and instrumentation setting status=ok. |
| `instrumentation/opentelemetry-instrumentation-genai-openai-agents/tests/test_zz_coverage_improvements.py` | Remove old unit test suite targeting the previous custom span processor. |
| `instrumentation/opentelemetry-instrumentation-genai-openai-agents/tests/test_z_span_processor_unit.py` | Remove unit tests for the deleted span_processor.py implementation. |
| `instrumentation/opentelemetry-instrumentation-genai-openai-agents/tests/test_z_instrumentor_behaviors.py` | Remove tests tied to prior processor/config knobs. |
| `instrumentation/opentelemetry-instrumentation-genai-openai-agents/tests/test_tracer.py` | Remove tests built around local tracing stubs and the old span processor. |
| `instrumentation/opentelemetry-instrumentation-genai-openai-agents/tests/test_processor.py` | Add unit tests for the new GenAITracingProcessor bridge behavior and weakref cleanup. |
| `instrumentation/opentelemetry-instrumentation-genai-openai-agents/tests/test_instrumentor.py` | Add tests for processor registration/unregistration behavior (including replace mode). |
| `instrumentation/opentelemetry-instrumentation-genai-openai-agents/tests/test_conformance.py` | Add conformance test entry point for openai-agents scenarios. |
| `instrumentation/opentelemetry-instrumentation-genai-openai-agents/tests/stubs/wrapt.py` | Remove no-longer-needed test stub. |
| `instrumentation/opentelemetry-instrumentation-genai-openai-agents/tests/stubs/opentelemetry/instrumentation/instrumentor.py` | Remove no-longer-needed test stub. |
| `instrumentation/opentelemetry-instrumentation-genai-openai-agents/tests/stubs/agents/tracing/traces.py` | Remove agents tracing stub (tests now run against real library + VCR). |
| `instrumentation/opentelemetry-instrumentation-genai-openai-agents/tests/stubs/agents/tracing/spans.py` | Remove agents tracing stub. |
| `instrumentation/opentelemetry-instrumentation-genai-openai-agents/tests/stubs/agents/tracing/processor_interface.py` | Remove agents tracing stub. |
| `instrumentation/opentelemetry-instrumentation-genai-openai-agents/tests/stubs/agents/tracing/__init__.py` | Remove agents tracing stub package. |
| `instrumentation/opentelemetry-instrumentation-genai-openai-agents/tests/conftest.py` | Switch to shared opentelemetry.test_util_genai fixtures + VCR config and header scrubbing. |
| `instrumentation/opentelemetry-instrumentation-genai-openai-agents/tests/conformance/orchestration.py` | Add a conformance scenario covering agent handoff + tool execution with VCR playback. |
| `instrumentation/opentelemetry-instrumentation-genai-openai-agents/tests/conformance/__init__.py` | Remove stub-only comment; keep package minimal. |
| `instrumentation/opentelemetry-instrumentation-genai-openai-agents/tests/cassettes/orchestration_conformance.yaml` | Add recorded cassette for the orchestration conformance scenario. |
| `instrumentation/opentelemetry-instrumentation-genai-openai-agents/src/opentelemetry/instrumentation/genai/openai_agents/span_processor.py` | Delete the prior custom “semantic processor” implementation. |
| `instrumentation/opentelemetry-instrumentation-genai-openai-agents/src/opentelemetry/instrumentation/genai/openai_agents/processor.py` | Introduce GenAITracingProcessor that maps agents tracing callbacks to util-genai invocations. |
| `instrumentation/opentelemetry-instrumentation-genai-openai-agents/src/opentelemetry/instrumentation/genai/openai_agents/__init__.py` | Update instrumentor to register the new tracing processor and drop old custom config knobs. |
| `instrumentation/opentelemetry-instrumentation-genai-openai-agents/examples/zero-code/main.py` | Update zero-code example to focus on agent orchestration (handoff + tool). |
| `instrumentation/opentelemetry-instrumentation-genai-openai-agents/examples/zero-code/.env.example` | Expand env guidance for semconv opt-in and content capture (contains a typo). |
| `instrumentation/opentelemetry-instrumentation-genai-openai-agents/examples/manual/requirements.txt` | Ensure manual example installs both openai and openai-agents instrumentations from repo. |
| `instrumentation/opentelemetry-instrumentation-genai-openai-agents/examples/manual/main.py` | Expand manual example to configure traces/metrics/logs and instrument both OpenAI + agents. |
| `instrumentation/opentelemetry-instrumentation-genai-openai-agents/examples/manual/.env.example` | Expand env guidance for semconv opt-in and content capture (contains a typo). |
| `instrumentation/opentelemetry-instrumentation-genai-openai-agents/examples/content-capture/README.md` | Remove content-capture example docs tied to removed custom content-capture implementation. |
| `instrumentation/opentelemetry-instrumentation-genai-openai-agents/examples/content-capture/main.py` | Remove content-capture example code tied to removed custom span processor. |
| `instrumentation/opentelemetry-instrumentation-genai-openai-agents/examples/content-capture/.env.example` | Remove content-capture example env template. |
| `instrumentation/opentelemetry-instrumentation-genai-openai-agents/.changelog/86.changed` | Document the rewrite, removed surfaces, and the known gen_ai.tool.call.id gap. |
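
The expected-vs-unexpected violation check added to the conformance runner can be pictured roughly as follows. This is a sketch under assumptions, not the code in `conformance.py`: the fields on `ExpectedViolation`, the `check_violations` helper, the violation ids, and the choice to report stale expectations are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ExpectedViolation:
    # Hypothetical shape; the real class may carry different fields.
    violation_id: str
    reason: str


def check_violations(reported_ids, expected):
    """Split reported ids into unexpected ones and stale expectations."""
    expected_ids = {e.violation_id for e in expected}
    reported = set(reported_ids)
    unexpected = sorted(reported - expected_ids)  # should fail the run
    stale = sorted(expected_ids - reported)  # suppression no longer needed
    return unexpected, stale


expected = [
    ExpectedViolation("missing_tool_call_id", "FunctionSpanData has no tool_call_id")
]
unexpected, stale = check_violations(
    ["missing_tool_call_id", "unknown_operation_name"], expected
)
print(unexpected)
print(stale)
```

Tracking stale expectations alongside unexpected violations keeps a suppression list from outliving the gap it was written for, for example once `gen_ai.tool.call.id` becomes available.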

Comment on lines +100 to +114
```diff
+        if kwargs.get("disable_openai_trace_export"):
+            set_trace_processors([self._processor])
+        else:
+            add_trace_processor(self._processor)

-        tracing = _load_tracing_module()
-        provider = tracing.get_trace_provider()
-        existing = _get_registered_processors(provider)
-        provider.set_processors([*existing, processor])
-        self._processor = processor

-    def _uninstrument(self, **kwargs) -> None:
+    def _uninstrument(self, **kwargs: Any) -> None:
         if self._processor is None:
             return

-        tracing = _load_tracing_module()
-        provider = tracing.get_trace_provider()
-        current = _get_registered_processors(provider)
-        filtered = [proc for proc in current if proc is not self._processor]
-        provider.set_processors(filtered)

+        provider = get_trace_provider()
+        current = getattr(
+            getattr(provider, "_multi_processor", None), "_processors", ()
+        )
+        filtered = [p for p in current if p is not self._processor]
+        set_trace_processors(filtered)
```
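
To make the add-versus-replace behavior in the snippet above easier to reason about, here is a small model of it. `FakeProvider`, `instrument`, and `uninstrument` are hypothetical stand-ins operating on a plain list; they are not the agents library API or the instrumentor's actual methods.

```python
class FakeProvider:
    """Hypothetical stand-in for the agents trace provider's processor list."""

    def __init__(self, processors):
        self.processors = list(processors)


def instrument(provider, processor, replace=False):
    # replace=True mirrors the set_trace_processors branch; otherwise append.
    if replace:
        provider.processors = [processor]
    else:
        provider.processors.append(processor)


def uninstrument(provider, processor):
    # Identity comparison removes only the instance that was registered.
    provider.processors = [p for p in provider.processors if p is not processor]


default_exporter, ours = object(), object()

added = FakeProvider([default_exporter])
instrument(added, ours)
uninstrument(added, ours)

replaced = FakeProvider([default_exporter])
instrument(replaced, ours, replace=True)
uninstrument(replaced, ours)

print(len(added.processors), len(replaced.processors))
```

In this model, replace mode followed by uninstrument leaves the list empty because the displaced processors are not remembered. Whether the real instrumentor should restore them is a design question the sketch does not answer.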
Comment on lines +6 to +16
Registers a :class:`GenAITracingProcessor` with the agents library's
public ``add_trace_processor`` extension API. The processor reacts
synchronously to the agents library's own ``Trace`` / ``AgentSpan`` /
``FunctionSpan`` / ``HandoffSpan`` start/end callbacks and turns them
into ``invoke_workflow`` / ``invoke_agent`` / ``execute_tool`` spans via
``opentelemetry-util-genai`` (plus a raw ``handoff`` span until handoff
semconv lands).

LLM-level spans (``chat`` / ``responses`` / ``embeddings``) are produced
by ``opentelemetry-instrumentation-genai-openai`` when both packages are
installed; this instrumentation does not emit them.
Comment on lines +72 to +77
```python
class OrchestrationScenario(Scenario):
    expected_spans = (
        "invoke_agent",
        "execute_tool",
    )
    expected_metrics = ("gen_ai.client.operation.duration",)
```
# OTEL_GENAI_AGENT_NAME=Travel Concierge
# Remove to hide prompt and completion content
# Possible values (case insensitive):
# - `span_only` - record content on span attibutes

# Remove to hide prompt and completion content
# Possible values (case insensitive):
# - `span_only` - record content on span attibutes
@lmolkova lmolkova force-pushed the openai-agents-rewrite-to-utils branch from 74c3521 to 49f5405 Compare June 6, 2026 00:11

Development

Successfully merging this pull request may close these issues.

OpenAI agents: address conformance test issues
