
Native OCI Generative AI model provider (three transports: OpenAI-compatible, native SDK, Responses) #3611

@fede-kamel

Summary

I'd like to propose (and contribute) a native Oracle Cloud Infrastructure (OCI) Generative AI model provider under agents.extensions.models, following the same optional-extra pattern as the existing LiteLLM integration.

OCI Generative AI exposes a large hosted catalog (OpenAI GPT/o-series, Meta Llama, Cohere Command, Google Gemini, xAI Grok, Mistral, and others) behind OCI IAM. Today the only way to use it with the Agents SDK is through a generic adapter, which doesn't cover how the service actually authenticates or the transports it requires.

Why the existing adapters aren't enough

  • Authentication is OCI request signing, not API keys. Each request must be signed using one of four IAM modes: API key (user principal), session token, instance principal, or resource principal. Session tokens and principal-based signers expire and must be rebuilt or refreshed transparently mid-run. None of this maps onto a bearer-token base_url override or a generic adapter's auth model.
  • Compartment routing. Every request must carry the target compartment, either as an opc-compartment-id header (OpenAI-compatible endpoints) or as a compartmentId body field (native SDK transport).
  • The catalog spans three distinct wire transports (details below), and the right one is determined by the model ID. A single chat-completions shim can't reach the whole catalog.
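To make the compartment-routing point concrete, here is a minimal sketch of where the compartment would be attached for each kind of transport. The `OutgoingRequest` type and `attach_compartment` helper are hypothetical names used only for illustration; only the `opc-compartment-id` header and the `compartmentId` body field come from the description above.

```python
from dataclasses import dataclass, field
from typing import Any


@dataclass
class OutgoingRequest:
    """Hypothetical stand-in for a request before it is signed and sent."""
    headers: dict[str, str] = field(default_factory=dict)
    body: dict[str, Any] = field(default_factory=dict)


def attach_compartment(
    request: OutgoingRequest, compartment_id: str, *, native_sdk: bool
) -> OutgoingRequest:
    """Return a copy of the request carrying the target compartment."""
    if native_sdk:
        # Native SDK transport: the compartment travels in the request body.
        return OutgoingRequest(
            headers=dict(request.headers),
            body={**request.body, "compartmentId": compartment_id},
        )
    # OpenAI-compatible endpoints: the compartment travels as a header.
    return OutgoingRequest(
        headers={**request.headers, "opc-compartment-id": compartment_id},
        body=dict(request.body),
    )
```

Returning a copy rather than mutating the input keeps the helper safe to call from a retry path, which matters if a request has to be rebuilt after a signer refresh.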

Proposed design

A new optional extra (pip install "openai-agents[oci]") providing OCIProvider / model classes in src/agents/extensions/models/, implementing the existing Model / ModelProvider interfaces. The provider routes each model ID to one of three transports:

  1. OpenAI-compatible chat completions: https://inference.generativeai.{region}.oci.oraclecloud.com/openai/v1/chat/completions, used by most of the catalog (OpenAI, Meta, xAI, Google, Mistral models). Implemented with the openai client plus an httpx auth hook that performs OCI request signing and injects opc-compartment-id. Streaming is standard SSE.
  2. OCI Python SDK (native) transport: GenerativeAiInferenceClient with ChatDetails, used for Cohere Command R-series models (which use Cohere's native message/tool format: current message + chat history, CohereTool parameter definitions) and for dedicated AI cluster endpoints (ocid1.generativeaiendpoint.* serving targets).
  3. Responses transport: https://inference.generativeai.{region}.oci.oraclecloud.com/openai/v1/responses, required for Responses-only reasoning models. Server-stateful (previous_response_id continuation), with an opt-out (store=false) for tenancies with Zero Data Retention enabled, in which case the full history is sent each turn.
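The two continuation modes of the Responses transport could be sketched roughly as follows. `build_responses_payload` is a hypothetical helper, and the field names (`model`, `input`, `store`, `previous_response_id`) follow the OpenAI Responses API; that the OCI endpoint accepts exactly this shape is an assumption of the sketch.

```python
from typing import Any, Optional


def build_responses_payload(
    model: str,
    history: list[dict[str, Any]],
    new_items: list[dict[str, Any]],
    *,
    previous_response_id: Optional[str] = None,
    store: bool = True,
) -> dict[str, Any]:
    """Build a request body for one turn (hypothetical helper)."""
    payload: dict[str, Any] = {"model": model, "store": store}
    if store and previous_response_id is not None:
        # Server-stateful mode: the server holds earlier turns,
        # so only the new items are sent.
        payload["previous_response_id"] = previous_response_id
        payload["input"] = list(new_items)
    else:
        # First turn, or store=False (e.g. Zero Data Retention):
        # the full history is resent on every turn.
        payload["input"] = list(history) + list(new_items)
    return payload
```

With `store=False` the helper deliberately ignores any `previous_response_id`, since there would be no stored response on the server to continue from.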

Routing rules: ocid1.generativeaiendpoint.* and cohere.command-r* → native SDK transport; models that are Responses-only → Responses transport; everything else → OpenAI-compatible chat completions.
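As a rough illustration of these routing rules, a minimal sketch might look like the following. `Transport` and `select_transport` are hypothetical names, and because the proposal does not say how Responses-only models are identified, the sketch assumes the caller supplies that set explicitly.

```python
from enum import Enum
from fnmatch import fnmatchcase


class Transport(Enum):
    CHAT_COMPLETIONS = "chat_completions"
    NATIVE_SDK = "native_sdk"
    RESPONSES = "responses"


# Patterns taken from the routing rules above.
NATIVE_SDK_PATTERNS = ("ocid1.generativeaiendpoint.*", "cohere.command-r*")


def select_transport(
    model_id: str, responses_only: frozenset[str] = frozenset()
) -> Transport:
    """Pick a transport for a model ID (hypothetical helper)."""
    # Dedicated endpoints and Cohere Command R models use the native SDK.
    if any(fnmatchcase(model_id, p) for p in NATIVE_SDK_PATTERNS):
        return Transport.NATIVE_SDK
    # Assumption: Responses-only models are listed explicitly by the caller.
    if model_id in responses_only:
        return Transport.RESPONSES
    # Everything else goes through OpenAI-compatible chat completions.
    return Transport.CHAT_COMPLETIONS
```

The native SDK check comes first so that a dedicated endpoint OCID is never misrouted, whatever model it serves.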

Tool calling, structured output, and usage reporting would be normalized to the SDK's existing item types in all three cases, with the known per-vendor quirks handled inside the provider (e.g. max_completion_tokens vs max_tokens field naming, Gemini's restrictions on $ref in JSON schemas and on parallel tool call turn shape, synthesized tool-call IDs for Cohere).

The dependency footprint is oci (the OCI Python SDK) as an optional extra, mirroring how litellm, redis, boto3, etc. are handled today.

Alternatives considered

  • LiteLLM adapter: covers a slice of the catalog through its own OCI support, but adds a second compatibility layer, and doesn't expose session-token/instance-principal/resource-principal auth with refresh, the native Cohere transport, dedicated endpoints, or the OCI Responses transport.
  • base_url override on the built-in OpenAI client: not possible, since OCI uses request signing rather than bearer tokens.

Offer

I have this working end-to-end against OCI Generative AI and would be happy to submit a PR implementing the above (provider + unit tests with mocked transports + docs). Opening this issue first to align on scope and placement before sending the code.

  • Agents SDK version: main (post v0.x latest)
  • Python version: 3.10+
