Confirm this is an issue with the Python library and not an underlying OpenAI API
This is an issue with the Python library.
Describe the bug
When using the Responses streaming helper with structured output auto-parsing (text_format=SomePydanticModel), the SDK parses output text on the response.output_text.done event before the terminal response status is known.
If the API later emits response.incomplete with incomplete_details.reason (for example because output was truncated by max_output_tokens), the SDK can raise a Pydantic JSON validation error such as Invalid JSON: EOF while parsing an object. That makes an upstream incomplete response look like malformed business JSON, and application code never gets a clean way to treat status="incomplete" as the primary failure.
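To illustrate the masking effect, here is a minimal sketch that uses the standard-library json module as a stand-in for Pydantic's JSON parser (the exact exception type and message differ in the SDK). The truncated string is a made-up example of output cut off by a token limit.

```python
import json

# Hypothetical output text cut off mid-string by a token limit.
truncated_output = '{"value": "a long sen'

try:
    json.loads(truncated_output)
    parse_error = None
except json.JSONDecodeError as exc:
    parse_error = exc

# The exception only describes a JSON syntax problem. Nothing in it says
# that the upstream response ended with status "incomplete".
print(type(parse_error).__name__, parse_error)
```

The point of the sketch is that the parse error alone cannot be distinguished from a model that genuinely produced malformed JSON.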
From the current main branch, this appears to come from ResponseStreamState.handle_event() parsing on response.output_text.done:
https://github.com/openai/openai-python/blob/main/src/openai/lib/streaming/responses/_responses.py
and parse_text() delegating directly to Pydantic JSON parsing:
https://github.com/openai/openai-python/blob/main/src/openai/lib/_parsing/_responses.py
The API response model already exposes status and incomplete_details.reason:
https://github.com/openai/openai-python/blob/main/src/openai/types/responses/response.py
Expected behavior: structured output parsing in the streaming helper should not mask terminal response.incomplete. Either parsing should be deferred until response.completed, or the SDK should raise a specific SDK-level failure for incomplete responses that includes response.status, response.incomplete_details.reason, and the response id.
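A rough sketch of the second option, checking terminal status before any schema parsing. IncompleteResponseError and ensure_completed are hypothetical names used only for illustration; neither exists in openai-python today. The only assumption about the response object is that it exposes id, status, and incomplete_details.reason, as the response model linked above does.

```python
from typing import Any, Optional


class IncompleteResponseError(Exception):
    """Hypothetical SDK-level error for responses that did not complete."""

    def __init__(self, response_id: str, status: str, reason: Optional[str]) -> None:
        super().__init__(
            f"response {response_id} ended with status={status!r} (reason={reason!r})"
        )
        self.response_id = response_id
        self.status = status
        self.reason = reason


def ensure_completed(response: Any) -> Any:
    """Return the response if it completed; otherwise raise before parsing."""
    if response.status == "completed":
        return response
    # incomplete_details may be None for statuses other than "incomplete".
    details = getattr(response, "incomplete_details", None)
    reason = getattr(details, "reason", None)
    raise IncompleteResponseError(response.id, response.status, reason)
```

With a guard like this in front of the schema parser, callers could catch one exception type and read the status, reason, and response id from it instead of inspecting a JSON validation error.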
To Reproduce
- Use client.responses.stream(...) with a Pydantic text_format.
- Force an incomplete/truncated response, for example with a very small max_output_tokens.
- Consume the stream.
- Observe that structured output auto-parsing can raise a Pydantic JSON validation error before caller code can handle the terminal response.incomplete event/status as the real upstream failure.
Code snippets
import os

from pydantic import BaseModel
from openai import OpenAI


class Payload(BaseModel):
    value: str


client = OpenAI()

with client.responses.stream(
    model=os.environ.get("OPENAI_MODEL", "gpt-5.4"),
    input="Return JSON matching the schema with value set to a long sentence.",
    text_format=Payload,
    max_output_tokens=1,
) as stream:
    for event in stream:
        print(event.type)

    # In the incomplete case, application code should be able to inspect
    # response.status and response.incomplete_details.reason before any
    # business-schema parsing is attempted.
    response = stream.get_final_response()
    print(response.status, response.incomplete_details)
A workaround is to avoid streaming auto-parse: pass the JSON schema through text.format, keep consuming stream updates, and parse response.output_text only once the terminal status is known and response.status == "completed".
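A minimal sketch of that workaround follows. The helper names, the hand-written PAYLOAD_SCHEMA, and the set of terminal event types are assumptions made for illustration. The client is passed in as an argument and the final response is taken from the terminal event itself, so the sketch does not depend on how the streaming helper finalizes responses.

```python
import json
from typing import Any, Optional, Tuple

# Hand-written JSON schema for the Payload model, sent through text.format
# instead of text_format=Payload so the SDK does not auto-parse.
PAYLOAD_SCHEMA = {
    "type": "object",
    "properties": {"value": {"type": "string"}},
    "required": ["value"],
    "additionalProperties": False,
}

# Event types assumed to carry the final response object.
TERMINAL_EVENTS = {"response.completed", "response.incomplete", "response.failed"}


def parse_if_completed(response: Any) -> Optional[dict]:
    """Parse output_text only when the terminal status is "completed"."""
    if response is None or response.status != "completed":
        return None
    return json.loads(response.output_text)


def stream_without_auto_parse(
    client: Any, model: str, prompt: str, max_output_tokens: int
) -> Tuple[Any, Optional[dict]]:
    """Consume the stream, then return (final_response, parsed_or_None)."""
    final = None
    with client.responses.stream(
        model=model,
        input=prompt,
        text={
            "format": {
                "type": "json_schema",
                "name": "payload",
                "schema": PAYLOAD_SCHEMA,
                "strict": True,
            }
        },
        max_output_tokens=max_output_tokens,
    ) as stream:
        for event in stream:
            # Keep consuming updates; remember the terminal response.
            if event.type in TERMINAL_EVENTS:
                final = event.response
    return final, parse_if_completed(final)
```

Called as stream_without_auto_parse(OpenAI(), model, prompt, max_output_tokens), a None second element means the caller should look at the first element's status and incomplete_details.reason rather than at a parse error.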
OS
Linux
Python version
Python 3.12.3
Library version
Observed on openai==2.9.0. I also checked the current main branch source on 2026-05-18, where pyproject.toml reports 2.37.0, and the same early parse behavior appears to still be present.