Add reasoning content normalization middleware #896

snimu · 2026-02-11T06:15:13Z

Description

Quick fix for missing reasoning content in GLM 4.7-Flash.

Normalize reasoning from provider-specific fields (reasoning_content, reasoning) into tags in content, and strip them on outbound requests to providers that use separate reasoning fields.

New verifiers/utils/reasoning_utils.py with pure utility functions
Inbound: extract reasoning from responses, prepend as tags
Outbound: strip tags from assistant messages for providers that use reasoning_content/reasoning fields
Auto-detect reasoning format from first response
Accept reasoning-only responses in validation (no EmptyModelResponseError)

Type of Change

Bug fix (non-breaking change which fixes an issue)
New feature (non-breaking change which adds functionality)
Breaking change (fix or feature that would cause existing functionality to not work as expected)
Documentation update
Test improvement

Testing

All existing tests pass when running uv run pytest locally.
New tests have been added to cover the changes

Checklist

My code follows the style guidelines of this project as outlined in AGENTS.md
I have performed a self-review of my own code
I have commented my code, particularly in hard-to-understand areas
I have made corresponding changes to the documentation
My changes generate no new warnings
Any dependent changes have been merged and published

Note

Medium Risk
Touches core request/response handling and validation for model calls; behavior changes depend on provider response shape and could affect prompts or error handling if detection/stripping misfires.

Overview
Normalizes provider-specific reasoning fields into a consistent <think>...</think> prefix on inbound chat responses, so models that return reasoning_content/reasoning no longer drop reasoning.

Adds reasoning_format handling to Environment: strips <think> blocks from outbound assistant messages for providers that expect separate reasoning fields, auto-detects the format from the first response, and relaxes response validation to accept reasoning-only replies. Includes a new reasoning_utils module and comprehensive unit/integration tests covering extraction, normalization, stripping, and provider message prep.

^{Written by Cursor Bugbot for commit fcf95fa. This will update automatically on new commits. Configure here.}

Normalize reasoning from provider-specific fields (reasoning_content, reasoning) into <think> tags in content, and strip them on outbound requests to providers that use separate reasoning fields. - New verifiers/utils/reasoning_utils.py with pure utility functions - Inbound: extract reasoning from responses, prepend as <think> tags - Outbound: strip <think> tags from assistant messages for providers that use reasoning_content/reasoning fields - Auto-detect reasoning format from first response - Accept reasoning-only responses in validation (no EmptyModelResponseError) Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

cursor

Cursor Bugbot has reviewed your changes and found 1 potential issue.

^{Bugbot Autofix is OFF. To automatically fix reported issues with Cloud Agents, enable Autofix in the Cursor dashboard.}

cursor · 2026-02-11T06:24:25Z

verifiers/envs/environment.py

                f"Model returned {len(response.choices)} choices, expected 1"
            )
        if isinstance(response.choices[0], Choice):
-            if not (


Think tags not stripped for interleaved rollout path

Medium Severity

prepare_messages_for_provider is called in get_model_response_with_messages to strip <think> tags from assistant messages before sending to providers that use separate reasoning fields, but get_model_response_with_tokens (used when interleaved_rollouts=True and trajectory is non-empty) sends prompt directly as messages without any such stripping. In multi-turn interleaved rollouts with providers using reasoning_content/reasoning, assistant messages will still contain <think> tags the provider doesn't expect.

Additional Locations (1)

verifiers/envs/environment.py#L693-L704

cursor bot reviewed Feb 11, 2026

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Add reasoning content normalization middleware #896

Add reasoning content normalization middleware #896

snimu commented Feb 11, 2026 •

edited by cursor bot

Loading

Uh oh!

cursor bot left a comment

Uh oh!

cursor bot Feb 11, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Add reasoning content normalization middleware #896

Are you sure you want to change the base?

Add reasoning content normalization middleware #896

Conversation

snimu commented Feb 11, 2026 • edited by cursor bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Description

Type of Change

Testing

Checklist

Uh oh!

cursor bot left a comment

Choose a reason for hiding this comment

Uh oh!

cursor bot Feb 11, 2026

Choose a reason for hiding this comment

Think tags not stripped for interleaved rollout path

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

snimu commented Feb 11, 2026 •

edited by cursor bot

Loading