Skip to content

fix(reasoning): detect short-form "READY" when concluding a plan#6206

Open
i-anubhav-anand wants to merge 1 commit into
crewAIInc:mainfrom
i-anubhav-anand:fix/reasoning-plan-ready-detection
Open

fix(reasoning): detect short-form "READY" when concluding a plan#6206
i-anubhav-anand wants to merge 1 commit into
crewAIInc:mainfrom
i-anubhav-anand:fix/reasoning-plan-ready-detection

Conversation

@i-anubhav-anand

@i-anubhav-anand i-anubhav-anand commented Jun 17, 2026

Copy link
Copy Markdown

Summary

Fixes #6204. When an agent uses reasoning/planning, the executor asks the model to conclude its plan with a readiness signal and re-plans while "not ready". The model frequently replies with a bare READY, but the parser only recognised the full sentence READY: I am ready to execute the task. — so a ready model was read as not ready and the agent kept refining the plan (indefinitely when max_attempts is None).

The mismatch is between the prompts and the parser:

Prompt (translations/en.json) What it asks the model to output
reasoning.create_plan_prompt "READY: I am ready to execute the task." (full sentence)
reasoning.refine_plan_prompt Conclude with READY or NOT READY. (bare token)
planning.create_plan_prompt After your plan, state READY or NOT READY. (bare token)
planning.refine_plan_prompt Conclude with READY or NOT READY as before. (bare token)

The readiness check only matched the full sentence:

# reasoning_handler.py — before
ready = "READY: I am ready to execute the task." in response

So three of the four prompts could never produce a "ready" verdict. In _refine_plan_if_needed, while not ready and (max_attempts is None or attempt < max_attempts) then loops — wasted refine calls, or no termination at all with the default max_attempts=None.

Fix

Add AgentReasoning._check_ready, used by _parse_planning_response and the two _call_with_function fallback paths. It accepts both forms and inspects only the last meaningful line for the short form, so a plan body that merely mentions the word "ready" isn't misread, and NOT READY is never matched:

@staticmethod
def _check_ready(response: str) -> bool:
    if "READY: I am ready to execute the task." in response:
        return True
    for line in reversed(response.splitlines()):
        conclusion = line.strip().strip("*`#_-.!:> ").upper()
        if not conclusion:
            continue
        return conclusion == "READY"
    return False

Surrounding markdown emphasis/punctuation (**READY**, ## READY, READY.) is stripped and pure-separator lines (---) are skipped, since real model output (the report used an Ollama GLM model) is rarely a clean bare token.

Test plan

New TestCheckReady (15 cases) in tests/utilities/test_structured_planning.py covering full-sentence and short forms, markdown/heading/separator variants, and the critical negatives (NOT READY, ALREADY DONE, body-only "ready").

Fail-before / pass-after (source stashed, tests kept):

  • Before (origin/main): 13 failed, 2 passed — e.g. test_parse_planning_response_short_ready fails assert False is True.
  • After: 15 passed.

Full file: 38 passed; the 3 remaining failures are pre-existing optional-provider import errors (azure/anthropic/gemini native providers not installed) unrelated to this change. ruff check clean.

Summary by CodeRabbit

  • Bug Fixes
    • Improved agent readiness detection to reliably interpret multiple “READY/NOT READY” response formats (including markdown, headings, and punctuation) while reducing false positives from unrelated text.
    • Updated structured planning parsing to use the same readiness detection logic for consistent ready behavior.
  • Tests
    • Added unit tests covering common and edge-case planning response formats, including empty responses and “ALREADY DONE” negative scenarios.

@coderabbitai

coderabbitai Bot commented Jun 17, 2026

Copy link
Copy Markdown

Review Change Stack

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro Plus

Run ID: ff00e917-6424-494d-804b-2e2460e27c04

📥 Commits

Reviewing files that changed from the base of the PR and between 4f33415 and 119d246.

📒 Files selected for processing (2)
  • lib/crewai/src/crewai/utilities/reasoning_handler.py
  • lib/crewai/tests/utilities/test_structured_planning.py
🚧 Files skipped from review as they are similar to previous changes (2)
  • lib/crewai/tests/utilities/test_structured_planning.py
  • lib/crewai/src/crewai/utilities/reasoning_handler.py

📝 Walkthrough

Walkthrough

Replaces hardcoded substring checks for agent readiness with a new AgentReasoning._check_ready() static method that handles full-sentence and short-form READY/NOT READY outputs, normalizes Markdown formatting and punctuation, and inspects only the last meaningful line. All three prior call sites and _parse_planning_response are updated. A new TestCheckReady test suite is added.

Changes

Readiness Detection Refactor

Layer / File(s) Summary
_check_ready implementation and call site updates
lib/crewai/src/crewai/utilities/reasoning_handler.py
Adds _check_ready(response: str) -> bool static method with robust short-form and full-sentence detection, normalizing Markdown bold, headings, punctuation, and empty/separator lines. Updates the function-calling path (line 412), fallback path (line 436), and _parse_planning_response (line 619) to call _check_ready() instead of inline substring checks.
TestCheckReady unit tests
lib/crewai/tests/utilities/test_structured_planning.py
Adds TestCheckReady covering _check_ready across full-sentence, short-form, trailing punctuation, **READY**/**NOT READY**, ## READY headings, separator lines, ALREADY DONE false-positive guard, and empty input. Also tests _parse_planning_response for short-form READY/NOT READY and empty response defaults.

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~10 minutes

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title accurately describes the main change—improving detection of short-form 'READY' responses in planning conclusions.
Linked Issues check ✅ Passed The PR fully addresses issue #6204 by implementing detection of bare 'READY' tokens and other formats that models output in practice.
Out of Scope Changes check ✅ Passed All changes directly support the fix: new _check_ready() method handles improved readiness detection, and test suite validates the implementation.
Docstring Coverage ✅ Passed Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@lib/crewai/src/crewai/utilities/reasoning_handler.py`:
- Around line 596-602: The full-sentence READY match check currently scans the
entire response string, allowing the text "READY: I am ready to execute the
task." appearing anywhere in the response body or examples to immediately return
True and bypass the conclusion line evaluation. Move the full-sentence READY
match check to only apply to the conclusion variable (the final stripped line
being evaluated in the reversed loop), not to the entire response, so that
intermediate text in the response does not override the actual final conclusion
determination. This ensures that if the final conclusion line is "NOT READY",
the function will correctly return False instead of being short-circuited by an
earlier mention of the full sentence elsewhere in the response.

In `@lib/crewai/tests/utilities/test_structured_planning.py`:
- Around line 732-735: The test suite for AgentReasoning._check_ready is missing
a regression test for the case where a full READY sentence appears in the body
or middle of the response, but the final conclusion should still be NOT READY.
Add a new test method (similar to test_full_sentence_not_ready) that creates a
response containing the full READY sentence in the body followed by a NOT READY
conclusion, then verify that AgentReasoning._check_ready correctly returns False
for this case.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro Plus

Run ID: af060d2d-276f-44d6-aff9-9e5ffae393af

📥 Commits

Reviewing files that changed from the base of the PR and between 431100d and 4f33415.

📒 Files selected for processing (2)
  • lib/crewai/src/crewai/utilities/reasoning_handler.py
  • lib/crewai/tests/utilities/test_structured_planning.py

Comment on lines +596 to +602
if "READY: I am ready to execute the task." in response:
return True
for line in reversed(response.splitlines()):
conclusion = line.strip().strip("*`#_-.!:> ").upper()
if not conclusion:
continue
return conclusion == "READY"

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Restrict the full-sentence READY match to the conclusion line.

Line 596 scans the entire response, so a body/example line containing READY: I am ready to execute the task. overrides a final NOT READY conclusion and stops refinement too early.

🐛 Proposed fix
-        if "READY: I am ready to execute the task." in response:
-            return True
         for line in reversed(response.splitlines()):
             conclusion = line.strip().strip("*`#_-.!:> ").upper()
             if not conclusion:
                 continue
-            return conclusion == "READY"
+            return conclusion in {
+                "READY",
+                "READY: I AM READY TO EXECUTE THE TASK",
+            }
         return False
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@lib/crewai/src/crewai/utilities/reasoning_handler.py` around lines 596 - 602,
The full-sentence READY match check currently scans the entire response string,
allowing the text "READY: I am ready to execute the task." appearing anywhere in
the response body or examples to immediately return True and bypass the
conclusion line evaluation. Move the full-sentence READY match check to only
apply to the conclusion variable (the final stripped line being evaluated in the
reversed loop), not to the entire response, so that intermediate text in the
response does not override the actual final conclusion determination. This
ensures that if the final conclusion line is "NOT READY", the function will
correctly return False instead of being short-circuited by an earlier mention of
the full sentence elsewhere in the response.

Comment on lines +732 to +735
def test_full_sentence_not_ready(self):
"""Full-sentence NOT READY is correctly treated as not ready."""
response = "My plan.\n\nNOT READY: I need to refine because the tools are unclear."
assert AgentReasoning._check_ready(response) is False

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Add a regression for full-sentence READY in the body.

This suite misses the case that currently fails: a body/example line contains the full READY sentence, but the final conclusion is NOT READY.

🧪 Proposed test
     def test_full_sentence_not_ready(self):
         """Full-sentence NOT READY is correctly treated as not ready."""
         response = "My plan.\n\nNOT READY: I need to refine because the tools are unclear."
         assert AgentReasoning._check_ready(response) is False
+
+    def test_full_sentence_ready_in_body_with_not_ready_conclusion(self):
+        """Full-sentence READY in the body must not override final NOT READY."""
+        response = (
+            "The prompt says to answer READY: I am ready to execute the task.\n\n"
+            "NOT READY"
+        )
+        assert AgentReasoning._check_ready(response) is False
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
def test_full_sentence_not_ready(self):
"""Full-sentence NOT READY is correctly treated as not ready."""
response = "My plan.\n\nNOT READY: I need to refine because the tools are unclear."
assert AgentReasoning._check_ready(response) is False
def test_full_sentence_not_ready(self):
"""Full-sentence NOT READY is correctly treated as not ready."""
response = "My plan.\n\nNOT READY: I need to refine because the tools are unclear."
assert AgentReasoning._check_ready(response) is False
def test_full_sentence_ready_in_body_with_not_ready_conclusion(self):
"""Full-sentence READY in the body must not override final NOT READY."""
response = (
"The prompt says to answer READY: I am ready to execute the task.\n\n"
"NOT READY"
)
assert AgentReasoning._check_ready(response) is False
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@lib/crewai/tests/utilities/test_structured_planning.py` around lines 732 -
735, The test suite for AgentReasoning._check_ready is missing a regression test
for the case where a full READY sentence appears in the body or middle of the
response, but the final conclusion should still be NOT READY. Add a new test
method (similar to test_full_sentence_not_ready) that creates a response
containing the full READY sentence in the body followed by a NOT READY
conclusion, then verify that AgentReasoning._check_ready correctly returns False
for this case.

The refine-plan and planning prompts instruct the model to conclude with
a bare "READY" or "NOT READY", but the readiness parser only recognised the
full sentence "READY: I am ready to execute the task." (emitted by the
initial-plan prompt). A model replying with the short form was therefore read
as "not ready", so the agent kept re-planning — indefinitely when
max_attempts is None (crewAIInc#6204).

Add an AgentReasoning._check_ready helper that accepts both the full-sentence
form and the short "READY"/"NOT READY" form on the last meaningful line,
stripping markdown emphasis/punctuation and skipping separator lines, while
never matching "NOT READY". Used by _parse_planning_response and the two
_call_with_function fallback paths.
@i-anubhav-anand i-anubhav-anand force-pushed the fix/reasoning-plan-ready-detection branch from 4f33415 to 119d246 Compare June 17, 2026 20:21
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[BUG] Reasoning plan always detects "NOT READY" even though the model indicates "READY"

1 participant