Summary
I'd like to propose (and contribute) a native Oracle Cloud Infrastructure (OCI) Generative AI model provider under agents.extensions.models, following the same optional-extra pattern as the existing LiteLLM integration.
OCI Generative AI exposes a large hosted catalog (OpenAI GPT/o-series, Meta Llama, Cohere Command, Google Gemini, xAI Grok, Mistral, and others) behind OCI IAM. Today the only way to use it with the Agents SDK is through a generic adapter, which doesn't cover how the service actually authenticates or the transports it requires.
Why the existing adapters aren't enough
- Authentication is OCI request signing, not API keys. Every request must be signed using one of four IAM modes: API key (user principal), session token, instance principal, or resource principal. Session tokens and principal-based signers expire, so they must be rebuilt or refreshed transparently mid-run. None of this maps onto a bearer-token base_url override or a generic adapter's auth model.
- Compartment routing. Every request must carry the target compartment, either as an
opc-compartment-id header (OpenAI-compatible endpoints) or as a compartmentId body field (native SDK transport).
- The catalog spans three distinct wire transports (details below), and the right one is determined by the model ID. A single chat-completions shim can't reach the whole catalog.
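The refresh and compartment requirements above can be sketched without any OCI dependency. This is a minimal illustration, not the proposed implementation: `RefreshingSigner`, `build_signer`, `ttl_seconds`, and `attach_compartment` are hypothetical names. A real provider would construct the signer with the OCI SDK and would likely take the expiry from the token itself rather than from a fixed TTL.

```python
import time
from typing import Any, Callable, Generic, Optional, TypeVar

S = TypeVar("S")


class RefreshingSigner(Generic[S]):
    """Caches a signer and rebuilds it once it is older than ttl_seconds.

    Hypothetical helper: build_signer stands in for whatever constructs the
    real signer (session token, instance principal, resource principal).
    """

    def __init__(
        self,
        build_signer: Callable[[], S],
        ttl_seconds: float,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._build_signer = build_signer
        self._ttl_seconds = ttl_seconds
        self._clock = clock
        self._signer: Optional[S] = None
        self._built_at = 0.0

    def get(self) -> S:
        """Return a usable signer, rebuilding it if it is missing or stale."""
        now = self._clock()
        if self._signer is None or now - self._built_at >= self._ttl_seconds:
            self._signer = self._build_signer()
            self._built_at = now
        return self._signer


def attach_compartment(
    compartment_id: str,
    headers: dict[str, str],
    body: dict[str, Any],
    native_sdk: bool,
) -> tuple[dict[str, str], dict[str, Any]]:
    """Return new (headers, body) carrying the compartment where each transport expects it."""
    if native_sdk:
        # Native SDK transport: compartment travels in the request body.
        return dict(headers), {**body, "compartmentId": compartment_id}
    # OpenAI-compatible endpoints: compartment travels as a header.
    return {**headers, "opc-compartment-id": compartment_id}, dict(body)
```

Calling `get()` before every request keeps the refresh invisible to the rest of the run, and the injectable `clock` makes the expiry path testable without waiting.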
Proposed design
A new optional extra (pip install "openai-agents[oci]") providing OCIProvider / model classes in src/agents/extensions/models/, implementing the existing Model / ModelProvider interfaces. The provider routes each model ID to one of three transports:
- OpenAI-compatible chat completions: https://inference.generativeai.{region}.oci.oraclecloud.com/openai/v1/chat/completions, used by most of the catalog (OpenAI, Meta, xAI, Google, Mistral models). Implemented with the openai client plus an httpx auth hook that performs OCI request signing and injects opc-compartment-id. Streaming is standard SSE.
- OCI Python SDK (native) transport: GenerativeAiInferenceClient with ChatDetails, used for Cohere Command R-series models (which use Cohere's native message/tool format: current message + chat history, CohereTool parameter definitions) and for dedicated AI cluster endpoints (ocid1.generativeaiendpoint.* serving targets).
- Responses transport: https://inference.generativeai.{region}.oci.oraclecloud.com/openai/v1/responses, required for Responses-only reasoning models. It is server-stateful (continuation via previous_response_id), with an opt-out (store=false) for tenancies with Zero Data Retention enabled, in which case the full history is sent each turn.
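The two Responses modes differ only in what is sent each turn. Here is a rough sketch of that choice, assuming the standard Responses request fields (`model`, `input`, `store`, `previous_response_id`). The function name `build_responses_payload` and its parameters are hypothetical and not part of the proposal.

```python
from typing import Any, Optional


def build_responses_payload(
    model: str,
    history: list[dict[str, Any]],
    new_items: list[dict[str, Any]],
    previous_response_id: Optional[str],
    store: bool = True,
) -> dict[str, Any]:
    """Build a Responses request body for either continuation mode.

    history   -- items from earlier turns (hypothetical caller-side record)
    new_items -- items produced since the last response
    """
    payload: dict[str, Any] = {"model": model, "store": store}
    if store and previous_response_id is not None:
        # Server-stateful: the service already holds earlier turns,
        # so only the new items are sent.
        payload["previous_response_id"] = previous_response_id
        payload["input"] = list(new_items)
    else:
        # store=false (e.g. Zero Data Retention) or first turn:
        # resend the full history every time.
        payload["input"] = list(history) + list(new_items)
    return payload
```

With `store=False` the payload never carries `previous_response_id`, which matches the opt-out behaviour described in the list.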
Routing rules: ocid1.generativeaiendpoint.* and cohere.command-r* → native SDK transport; models that are Responses-only → Responses transport; everything else → OpenAI-compatible chat completions.
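As a minimal sketch, the routing rules reduce to a small pure function. The names `Transport`, `select_transport`, and `responses_only` are hypothetical. The issue does not say how Responses-only models are identified, so this sketch assumes the caller supplies that set.

```python
from enum import Enum
from fnmatch import fnmatchcase


class Transport(Enum):
    CHAT_COMPLETIONS = "chat_completions"
    NATIVE_SDK = "native_sdk"
    RESPONSES = "responses"


# Patterns taken from the routing rules in the proposal.
NATIVE_PATTERNS = ("ocid1.generativeaiendpoint.*", "cohere.command-r*")


def select_transport(
    model_id: str,
    responses_only: frozenset[str] = frozenset(),
) -> Transport:
    """Pick a transport for model_id, checking the rules in the stated order."""
    if any(fnmatchcase(model_id, pattern) for pattern in NATIVE_PATTERNS):
        return Transport.NATIVE_SDK
    if model_id in responses_only:
        # Assumption: membership in a caller-supplied set marks a
        # Responses-only model.
        return Transport.RESPONSES
    return Transport.CHAT_COMPLETIONS
```

Keeping the decision separate from the clients makes the routing table unit-testable without mocking any transport.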
Tool calling, structured output, and usage reporting would be normalized to the SDK's existing item types in all three cases, with the known per-vendor quirks handled inside the provider (e.g. max_completion_tokens vs max_tokens field naming, Gemini's restrictions on $ref in JSON schemas and on parallel tool call turn shape, synthesized tool-call IDs for Cohere).
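Two of these quirks lend themselves to small helpers. The sketch below shows one plausible approach, not necessarily what the provider does: expanding local `$ref` entries before sending a schema to a model that restricts them, and filling in a tool-call ID when the vendor response has none. The function names and the `call_` ID prefix are assumptions made for illustration.

```python
import copy
import uuid
from typing import Any

_DEFS_PREFIX = "#/$defs/"


def inline_local_refs(schema: dict[str, Any]) -> dict[str, Any]:
    """Return a copy of schema with '#/$defs/...' references expanded.

    Assumes references are local and acyclic; anything else raises ValueError.
    """
    defs = schema.get("$defs", {})

    def resolve(node: Any, active: tuple[str, ...]) -> Any:
        if isinstance(node, list):
            return [resolve(item, active) for item in node]
        if not isinstance(node, dict):
            return node
        ref = node.get("$ref")
        if isinstance(ref, str):
            if not ref.startswith(_DEFS_PREFIX):
                raise ValueError(f"unsupported $ref: {ref}")
            name = ref[len(_DEFS_PREFIX):]
            if name in active:
                raise ValueError(f"recursive $ref: {ref}")
            if name not in defs:
                raise ValueError(f"unknown $ref: {ref}")
            # Keywords written next to $ref override those of the target.
            siblings = {k: v for k, v in node.items() if k != "$ref"}
            merged = {**copy.deepcopy(defs[name]), **siblings}
            return resolve(merged, active + (name,))
        return {key: resolve(value, active) for key, value in node.items()}

    # Drop the top-level $defs table; it is no longer referenced.
    top_level = {k: v for k, v in schema.items() if k != "$defs"}
    return resolve(top_level, ())


def ensure_tool_call_id(tool_call: dict[str, Any]) -> dict[str, Any]:
    """Return tool_call unchanged if it has an ID, else a copy with a synthesized one."""
    if tool_call.get("id"):
        return tool_call
    return {**tool_call, "id": f"call_{uuid.uuid4().hex}"}
```

Recursive schemas cannot be inlined this way, so the sketch fails loudly on them instead of guessing.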
The dependency footprint is oci (the OCI Python SDK) as an optional extra, mirroring how litellm, redis, boto3, etc. are handled today.
Alternatives considered
- LiteLLM adapter: covers a slice of the catalog through its own OCI support, but adds a second compatibility layer, and doesn't expose session-token/instance-principal/resource-principal auth with refresh, the native Cohere transport, dedicated endpoints, or the OCI Responses transport.
- base_url override on the built-in OpenAI client: not possible, since OCI uses request signing rather than bearer tokens.
Offer
I have this working end-to-end against OCI Generative AI and would be happy to submit a PR implementing the above (provider + unit tests with mocked transports + docs). Opening this issue first to align on scope and placement before sending the code.
- Agents SDK version: main (post v0.x latest)
- Python version: 3.10+