Responses streaming structured output parses incomplete JSON before terminal incomplete status #3263

@egit-m

Description

Confirm this is an issue with the Python library and not an underlying OpenAI API

This is an issue with the Python library.

Describe the bug

When using the Responses streaming helper with structured output auto-parsing (text_format=SomePydanticModel), the SDK parses output text on the response.output_text.done event before the terminal response status is known.

If the API then emits response.incomplete with incomplete_details.reason set (for example, because max_output_tokens truncated the output), the SDK may already have raised a Pydantic JSON validation error such as Invalid JSON: EOF while parsing an object. An upstream incomplete response therefore looks like malformed business JSON, and application code has no clean way to treat status="incomplete" as the primary failure.

From the current main branch, this appears to come from ResponseStreamState.handle_event() parsing on response.output_text.done:

https://github.com/openai/openai-python/blob/main/src/openai/lib/streaming/responses/_responses.py

and parse_text() delegating directly to Pydantic JSON parsing:

https://github.com/openai/openai-python/blob/main/src/openai/lib/_parsing/_responses.py

The API response model already exposes status and incomplete_details.reason:

https://github.com/openai/openai-python/blob/main/src/openai/types/responses/response.py
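
To make the ordering problem concrete, here is a minimal simulation. It is not the SDK's code: the events are hypothetical stand-ins built with SimpleNamespace, the id resp_example is made up, and json.loads stands in for Pydantic JSON validation. It only illustrates that a handler which parses on response.output_text.done fails before it ever sees response.incomplete.

```python
import json
from types import SimpleNamespace

# Hypothetical, simplified event sequence for a truncated response.
events = [
    SimpleNamespace(type="response.output_text.done", text='{"value": "A long'),
    SimpleNamespace(
        type="response.incomplete",
        response=SimpleNamespace(
            id="resp_example",  # made-up id, for illustration only
            status="incomplete",
            incomplete_details=SimpleNamespace(reason="max_output_tokens"),
        ),
    ),
]


def eager_consume(events):
    """Parse as soon as the text is done, mimicking the ordering described above."""
    seen = []
    for event in events:
        seen.append(event.type)
        if event.type == "response.output_text.done":
            try:
                json.loads(event.text)  # stand-in for Pydantic JSON validation
            except json.JSONDecodeError as exc:
                # The parse error surfaces here; the terminal event is never reached.
                return seen, exc
    return seen, None


seen, error = eager_consume(events)
print(seen, type(error).__name__)
```

The handler stops after the first event, so the caller only sees a JSON error and never the terminal status or its reason.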

Expected behavior: structured output parsing in the streaming helper should not mask terminal response.incomplete. Either parsing should be deferred until response.completed, or the SDK should raise a specific SDK-level failure for incomplete responses that includes response.status, response.incomplete_details.reason, and the response id.
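
One possible shape for the second option is sketched below. IncompleteResponseError and ensure_completed are hypothetical names that do not exist in the openai package; the sketch only assumes a response-like object with id, status, and incomplete_details.reason, as in the response model linked above.

```python
from types import SimpleNamespace
from typing import Any, Optional


class IncompleteResponseError(Exception):
    """Hypothetical SDK-level error; not part of the openai package."""

    def __init__(self, response_id: str, status: str, reason: Optional[str]) -> None:
        self.response_id = response_id
        self.status = status
        self.reason = reason
        super().__init__(
            f"response {response_id} ended with status={status!r} (reason={reason!r})"
        )


def ensure_completed(response: Any) -> Any:
    """Return the response if it completed; otherwise raise with its terminal details."""
    if response.status == "completed":
        return response
    details = getattr(response, "incomplete_details", None)
    reason = getattr(details, "reason", None)
    raise IncompleteResponseError(response.id, response.status, reason)


# Hypothetical stand-in for a truncated response (the id is made up).
truncated = SimpleNamespace(
    id="resp_example",
    status="incomplete",
    incomplete_details=SimpleNamespace(reason="max_output_tokens"),
)

try:
    ensure_completed(truncated)
except IncompleteResponseError as exc:
    caught = exc
    print(exc)
```

Calling such a check before any schema parsing would let callers catch one specific error type and read status, reason, and response id from it, instead of inspecting a JSON validation error.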

To Reproduce

  1. Use client.responses.stream(...) with a Pydantic text_format.
  2. Force an incomplete/truncated response, for example with a very small max_output_tokens.
  3. Consume the stream.
  4. Observe that structured output auto-parsing can raise a Pydantic JSON validation error before caller code can handle the terminal response.incomplete event/status as the real upstream failure.

Code snippets

import os
from pydantic import BaseModel
from openai import OpenAI


class Payload(BaseModel):
    value: str


client = OpenAI()

with client.responses.stream(
    model=os.environ.get("OPENAI_MODEL", "gpt-5.4"),
    input="Return JSON matching the schema with value set to a long sentence.",
    text_format=Payload,
    max_output_tokens=1,
) as stream:
    for event in stream:
        print(event.type)

    # In the incomplete case, application code should be able to inspect
    # response.status and response.incomplete_details.reason before any
    # business-schema parsing is attempted.
    response = stream.get_final_response()
    print(response.status, response.incomplete_details)

A workaround is to avoid streaming auto-parse: pass the JSON schema through text.format, keep consuming stream events, and parse response.output_text only once the terminal status is known and response.status == "completed".
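
The workaround can be sketched roughly as follows. The helper names consume_until_terminal and parse_completed_output are hypothetical, the events are stand-ins built with SimpleNamespace so the sketch runs without network access, and json.loads is used where application code would likely call Payload.model_validate_json.

```python
import json
from types import SimpleNamespace
from typing import Any, Iterable, Optional

# Terminal event types of the Responses streaming API.
TERMINAL_EVENT_TYPES = {"response.completed", "response.incomplete", "response.failed"}


def consume_until_terminal(events: Iterable[Any]) -> Any:
    """Drain the stream and return the response carried by the terminal event."""
    final = None
    for event in events:
        # Deltas could be forwarded to a UI here; nothing is parsed yet.
        if event.type in TERMINAL_EVENT_TYPES:
            final = event.response
    if final is None:
        raise RuntimeError("stream ended without a terminal response event")
    return final


def parse_completed_output(response: Any) -> Optional[Any]:
    """Parse output_text only for completed responses; return None otherwise."""
    if response.status != "completed":
        return None
    return json.loads(response.output_text)  # or Payload.model_validate_json(...)


# Hypothetical stand-in events for a truncated response.
truncated_stream = [
    SimpleNamespace(type="response.output_text.delta", delta='{"value": "A lo'),
    SimpleNamespace(
        type="response.incomplete",
        response=SimpleNamespace(
            status="incomplete",
            output_text='{"value": "A lo',
            incomplete_details=SimpleNamespace(reason="max_output_tokens"),
        ),
    ),
]

final = consume_until_terminal(truncated_stream)
payload = parse_completed_output(final)
if payload is None:
    # The incomplete status is handled as the primary failure; no parse was attempted.
    print(final.status, final.incomplete_details.reason)
```

With the real client, the events would presumably come from client.responses.create(..., stream=True) with the schema supplied through text.format rather than text_format, so that no auto-parsing runs during the stream.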

OS

Linux

Python version

Python 3.12.3

Library version

Observed on openai==2.9.0. I also checked the current main branch source on 2026-05-18, where pyproject.toml reports 2.37.0, and the same early parse behavior appears to still be present.
