
feat(agent): add token-by-token streaming to tool_calling_agent#1595

Merged
rapids-bot[bot] merged 2 commits into NVIDIA:develop from MylesShannon:feature/tool-calling-agent-streaming
Feb 18, 2026

Conversation


@MylesShannon MylesShannon commented Feb 12, 2026

Description

This PR enables token-by-token streaming for the Tool Calling Agent when stream=true is set on chat completion requests. Previously, streaming requests returned the full response in a single Server-Sent Event; responses are now streamed incrementally as LLM tokens are produced.
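From the client's side, the change means a `stream=true` request now yields many `data:` events rather than one. A minimal sketch of consuming such a stream (the chunk field names follow the OpenAI chat-completions format referenced by this PR; the exact payload shape emitted by the server is an assumption here):

```python
import json

def parse_sse_events(raw_lines):
    """Extract content deltas from OpenAI-style SSE chat-completion chunks.

    Yields the incremental token text from each `data:` event, stopping
    at the `[DONE]` sentinel. Non-data lines (comments, keep-alives) are
    ignored.
    """
    for line in raw_lines:
        line = line.strip()
        if not line.startswith("data:"):
            continue
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            return
        chunk = json.loads(payload)
        delta = chunk["choices"][0]["delta"]
        if delta.get("content"):
            yield delta["content"]

# Example: two token chunks, a final stop chunk, then the sentinel.
stream = [
    'data: {"choices": [{"delta": {"content": "Hel"}}]}',
    'data: {"choices": [{"delta": {"content": "lo"}}]}',
    'data: {"choices": [{"delta": {}, "finish_reason": "stop"}]}',
    "data: [DONE]",
]
print("".join(parse_sse_events(stream)))  # -> Hello
```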

Changes:

  • register.py: Added a streaming entry point _stream_fn that uses LangGraph astream(stream_mode="messages") to consume message chunks from the graph and yield only AIMessageChunk content from the agent node (excluding tool-call chunks). The Tool Calling Agent is now registered with FunctionInfo.create(single_fn=_response_fn, stream_fn=_stream_fn, ...) so both single-call and streaming are supported.
  • agent.py: The agent node now accepts RunnableConfig, forwards it to the underlying LLM, and uses self.agent.astream() instead of ainvoke() so the LLM streams. Chunks are accumulated into the final AIMessage for non-streaming paths, and config is merged so streaming callbacks propagate correctly.
  • Tests: test_tool_calling.py updated so agent_node is called with a mock config argument to match the new signature.
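The filtering described for `_stream_fn` can be sketched as follows. The stand-in dataclasses and the fake stream below are illustrative only; the real function consumes actual LangChain message objects from the compiled graph's `astream(stream_mode="messages")`:

```python
import asyncio
from dataclasses import dataclass, field

# Stand-ins for LangChain message types (assumption: real code uses the
# actual AIMessageChunk / ToolMessage classes).
@dataclass
class AIMessageChunk:
    content: str
    tool_call_chunks: list = field(default_factory=list)

@dataclass
class ToolMessage:
    content: str

async def fake_astream():
    # Mimics graph.astream(..., stream_mode="messages"), which yields
    # (message, metadata) pairs from the various graph nodes.
    events = [
        (AIMessageChunk("The "), {"langgraph_node": "agent"}),
        (AIMessageChunk("", tool_call_chunks=[{"name": "search"}]),
         {"langgraph_node": "agent"}),
        (ToolMessage("tool output"), {"langgraph_node": "tools"}),
        (AIMessageChunk("answer"), {"langgraph_node": "agent"}),
    ]
    for event in events:
        yield event

async def stream_tokens(astream):
    # Yield only token content from the agent node; skip tool-call chunks
    # and anything produced by other nodes.
    async for message, metadata in astream:
        if metadata.get("langgraph_node") != "agent":
            continue
        if not isinstance(message, AIMessageChunk) or message.tool_call_chunks:
            continue
        if message.content:
            yield message.content

async def collect():
    return [tok async for tok in stream_tokens(fake_astream())]

print(asyncio.run(collect()))  # -> ['The ', 'answer']
```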

No new user-facing documentation was added; existing docstrings and inline comments were updated to describe the streaming behavior.

By submitting this PR I confirm:

  • I am familiar with the Contributing Guidelines.
  • We require that all contributors "sign-off" on their commits. This certifies that the contribution is your original work, or you have rights to submit it under the same license, or a compatible license.
    • Any contribution which contains commits that are not Signed-Off will not be accepted.
  • When the PR is ready for review, new or existing tests cover these changes.
  • When the PR is ready for review, the documentation is up to date with these changes.

Summary by CodeRabbit

  • New Features

    • Real-time streaming for the tool-calling workflow, delivering incremental token content via a streaming entry.
  • Refactor

    • Agent interfaces now accept an explicit runtime/config parameter to control execution and propagate runtime settings.
  • Bug Fixes

    • Streamed chunks are concatenated into a single response, appended to message state, and an error is raised if no output is produced.
  • Tests

    • Tests updated to cover config-aware calls and streaming behavior.

@MylesShannon MylesShannon requested a review from a team as a code owner February 12, 2026 16:18

copy-pr-bot bot commented Feb 12, 2026

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.


coderabbitai bot commented Feb 12, 2026

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.


Walkthrough

agent_node signatures extended to accept a RunnableConfig; tool-calling now streams via graph.astream, accumulates AIMessageChunk content into a single response appended to state.messages. Registration gained an async streaming function path and tests updated to pass the new config parameter.

Changes

Cohort / File(s) Summary
Core Tool-Calling Agent
packages/nvidia_nat_langchain/src/nat/plugins/langchain/agent/tool_calling_agent/agent.py
agent_node signature now accepts config: RunnableConfig. Merges/augments incoming config (injects __pregel_runtime), uses agent.astream(..., config=run_config, stream_mode="messages") to stream AIMessageChunks, accumulates chunks into a single response, validates non-empty output, logs when enabled, appends final response to state.messages.
Registration & Streaming Path
packages/nvidia_nat_langchain/src/nat/plugins/langchain/agent/tool_calling_agent/register.py
Added AsyncGenerator typing and a _stream_fn that yields token/content chunks from graph.astream (filters AIMessageChunk with agent-directed metadata, logs and re-raises on errors). Replaced FunctionInfo.from_fn(...) with FunctionInfo.create(single_fn=..., stream_fn=..., description=...).
Agent Interface Propagation
packages/nvidia_nat_langchain/src/nat/plugins/langchain/agent/dual_node.py, packages/nvidia_nat_langchain/src/nat/plugins/langchain/agent/react_agent/agent.py
Abstract and concrete agent_node method signatures updated to accept config: RunnableConfig, and internal calls now build/propagate a consolidated run_config (injecting __pregel_runtime).
Tests — Tool Calling
packages/nvidia_nat_langchain/tests/agent/test_tool_calling.py
Test call sites updated to pass the new config argument (e.g., config={"configurable": {}}) into ToolCallAgentGraph.agent_node.
Tests — ReAct Agent
packages/nvidia_nat_langchain/tests/agent/test_react.py
Updated test invocations to pass the new config argument into ReActAgentGraph.agent_node and related call paths.

Sequence Diagram

sequenceDiagram
    participant Caller
    participant Agent as ToolCallAgentGraph.agent_node
    participant Graph as Graph.astream
    participant State as state/messages

    Caller->>Agent: call agent_node(state, config)
    Agent->>Agent: merge/augment config (inject __pregel_runtime)
    Agent->>Graph: call astream(..., config=run_config, stream_mode="messages")
    loop per streamed chunk
        Graph-->>Agent: AIMessageChunk (agent-directed) content
        Agent->>Agent: accumulate chunk into response
    end
    Agent->>Agent: validate response non-empty
    Agent->>State: append final response to state.messages
    Agent->>Caller: return updated state

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

🚥 Pre-merge checks | ✅ 2 | ❌ 2
❌ Failed checks (2 warnings)
Check name | Status | Explanation | Resolution
Docstring Coverage | ⚠️ Warning | Docstring coverage is 34.38%, below the required threshold of 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold.
Merge Conflict Detection ⚠️ Warning ❌ Merge conflicts detected (154 files):

⚔️ .github/workflows/ci_pipe.yml (content)
⚔️ .gitlab-ci.yml (content)
⚔️ CHANGELOG.md (content)
⚔️ ci/scripts/common.sh (content)
⚔️ ci/scripts/github/build_wheel.sh (content)
⚔️ ci/scripts/gitlab/build_wheel.sh (content)
⚔️ ci/scripts/license_diff.py (content)
⚔️ ci/scripts/sbom_list.py (content)
⚔️ docs/source/build-workflows/a2a-client.md (content)
⚔️ docs/source/build-workflows/mcp-client.md (content)
⚔️ docs/source/build-workflows/workflow-configuration.md (content)
⚔️ docs/source/components/auth/mcp-auth/index.md (content)
⚔️ docs/source/components/integrations/a2a.md (content)
⚔️ docs/source/extend/custom-components/custom-functions/per-user-functions.md (content)
⚔️ docs/source/extend/custom-components/index.md (content)
⚔️ docs/source/extend/custom-components/mcp-server.md (content)
⚔️ docs/source/extend/plugins.md (content)
⚔️ docs/source/improve-workflows/evaluate.md (content)
⚔️ docs/source/improve-workflows/optimizer.md (content)
⚔️ docs/source/index.md (content)
⚔️ docs/source/reference/cli.md (content)
⚔️ docs/source/reference/rest-api/api-server-endpoints.md (content)
⚔️ docs/source/reference/rest-api/websockets.md (content)
⚔️ docs/source/release-notes.md (content)
⚔️ docs/source/run-workflows/a2a-server.md (content)
⚔️ docs/source/run-workflows/about-running-workflows.md (content)
⚔️ docs/source/run-workflows/mcp-server.md (content)
⚔️ docs/source/run-workflows/observe/observe-workflow-with-catalyst.md (content)
⚔️ docs/source/run-workflows/observe/observe.md (content)
⚔️ examples/HITL/por_to_jiratickets/src/nat_por_to_jiratickets/hitl_approval_tool.py (content)
⚔️ examples/MCP/simple_auth_mcp/configs/config-mcp-auth-jira-per-user.yml (content)
⚔️ examples/MCP/simple_calculator_mcp/README.md (content)
⚔️ examples/MCP/simple_calculator_mcp_protected/README.md (content)
⚔️ examples/README.md (content)
⚔️ examples/documentation_guides/tests/test_text_file_ingest.py (content)
⚔️ examples/dynamo_integration/README.md (content)
⚔️ examples/dynamo_integration/react_benchmark_agent/README.md (content)
⚔️ examples/observability/simple_calculator_observability/README.md (content)
⚔️ examples/safety_and_security/retail_agent/README.md (content)
⚔️ examples/safety_and_security/retail_agent/src/nat_retail_agent/configs/red-teaming-with-defenses.yml (content)
⚔️ examples/safety_and_security/retail_agent/src/nat_retail_agent/configs/red-teaming.yml (content)
⚔️ external/dynamo/README.md (content)
⚔️ external/nat-ui (content)
⚔️ packages/nvidia_nat_a2a/uv.lock (content)
⚔️ packages/nvidia_nat_adk/src/nat/plugins/adk/llm.py (content)
⚔️ packages/nvidia_nat_adk/tests/test_adk_llm.py (content)
⚔️ packages/nvidia_nat_adk/uv.lock (content)
⚔️ packages/nvidia_nat_agno/src/nat/plugins/agno/llm.py (content)
⚔️ packages/nvidia_nat_agno/uv.lock (content)
⚔️ packages/nvidia_nat_autogen/src/nat/plugins/autogen/llm.py (content)
⚔️ packages/nvidia_nat_autogen/uv.lock (content)
⚔️ packages/nvidia_nat_core/pyproject.toml (content)
⚔️ packages/nvidia_nat_core/src/nat/authentication/oauth2/oauth2_resource_server_config.py (content)
⚔️ packages/nvidia_nat_core/src/nat/builder/builder.py (content)
⚔️ packages/nvidia_nat_core/src/nat/builder/context.py (content)
⚔️ packages/nvidia_nat_core/src/nat/builder/eval_builder.py (content)
⚔️ packages/nvidia_nat_core/src/nat/builder/function.py (content)
⚔️ packages/nvidia_nat_core/src/nat/cli/register_workflow.py (content)
⚔️ packages/nvidia_nat_core/src/nat/cli/type_registry.py (content)
⚔️ packages/nvidia_nat_core/src/nat/data_models/api_server.py (content)
⚔️ packages/nvidia_nat_core/src/nat/data_models/component.py (content)
⚔️ packages/nvidia_nat_core/src/nat/data_models/dataset_handler.py (content)
⚔️ packages/nvidia_nat_core/src/nat/data_models/evaluate.py (content)
⚔️ packages/nvidia_nat_core/src/nat/data_models/evaluator.py (content)
⚔️ packages/nvidia_nat_core/src/nat/data_models/interactive.py (content)
⚔️ packages/nvidia_nat_core/src/nat/data_models/retry_mixin.py (content)
⚔️ packages/nvidia_nat_core/src/nat/eval/config.py (content)
⚔️ packages/nvidia_nat_core/src/nat/eval/dataset_handler/dataset_downloader.py (content)
⚔️ packages/nvidia_nat_core/src/nat/eval/dataset_handler/dataset_handler.py (content)
⚔️ packages/nvidia_nat_core/src/nat/eval/evaluator/evaluator_model.py (content)
⚔️ packages/nvidia_nat_core/src/nat/eval/rag_evaluator/register.py (content)
⚔️ packages/nvidia_nat_core/src/nat/eval/red_teaming_evaluator/evaluate.py (content)
⚔️ packages/nvidia_nat_core/src/nat/eval/red_teaming_evaluator/register.py (content)
⚔️ packages/nvidia_nat_core/src/nat/eval/register.py (content)
⚔️ packages/nvidia_nat_core/src/nat/eval/remote_workflow.py (content)
⚔️ packages/nvidia_nat_core/src/nat/eval/runners/red_teaming_runner/config.py (content)
⚔️ packages/nvidia_nat_core/src/nat/eval/trajectory_evaluator/evaluate.py (content)
⚔️ packages/nvidia_nat_core/src/nat/eval/trajectory_evaluator/register.py (content)
⚔️ packages/nvidia_nat_core/src/nat/front_ends/console/console_front_end_plugin.py (content)
⚔️ packages/nvidia_nat_core/src/nat/front_ends/fastapi/fastapi_front_end_config.py (content)
⚔️ packages/nvidia_nat_core/src/nat/front_ends/fastapi/fastapi_front_end_plugin_worker.py (content)
⚔️ packages/nvidia_nat_core/src/nat/front_ends/fastapi/message_handler.py (content)
⚔️ packages/nvidia_nat_core/src/nat/front_ends/fastapi/message_validator.py (content)
⚔️ packages/nvidia_nat_core/src/nat/front_ends/fastapi/response_helpers.py (content)
⚔️ packages/nvidia_nat_core/src/nat/llm/dynamo_llm.py (content)
⚔️ packages/nvidia_nat_core/src/nat/llm/utils/hooks.py (content)
⚔️ packages/nvidia_nat_core/src/nat/observability/exporter/span_exporter.py (content)
⚔️ packages/nvidia_nat_core/src/nat/retriever/milvus/register.py (content)
⚔️ packages/nvidia_nat_core/src/nat/retriever/milvus/retriever.py (content)
⚔️ packages/nvidia_nat_core/src/nat/runtime/runner.py (content)
⚔️ packages/nvidia_nat_core/src/nat/runtime/session.py (content)
⚔️ packages/nvidia_nat_core/src/nat/runtime/user_metadata.py (content)
⚔️ packages/nvidia_nat_core/src/nat/utils/io/yaml_tools.py (content)
⚔️ packages/nvidia_nat_core/tests/nat/builder/test_function.py (content)
⚔️ packages/nvidia_nat_core/tests/nat/builder/test_interactive.py (content)
⚔️ packages/nvidia_nat_core/tests/nat/eval/red_teaming_evaluator/test_evaluate.py (content)
⚔️ packages/nvidia_nat_core/tests/nat/eval/runners/red_teaming_runner/test_red_teaming_config.py (content)
⚔️ packages/nvidia_nat_core/tests/nat/eval/runners/red_teaming_runner/test_red_teaming_runner.py (content)
⚔️ packages/nvidia_nat_core/tests/nat/eval/test_evaluate.py (content)
⚔️ packages/nvidia_nat_core/tests/nat/eval/test_remote_evaluate.py (content)
⚔️ packages/nvidia_nat_core/tests/nat/eval/trajectory_evaluator/test_trajectory_evaluate.py (content)
⚔️ packages/nvidia_nat_core/tests/nat/llm/test_dynamic_prediction_hook.py (content)
⚔️ packages/nvidia_nat_core/tests/nat/llm/test_dynamo_llm.py (content)
⚔️ packages/nvidia_nat_core/tests/nat/llm/test_runtime_prediction_e2e.py (content)
⚔️ packages/nvidia_nat_core/tests/nat/llm/utils/test_thinking.py (content)
⚔️ packages/nvidia_nat_core/tests/nat/observability/exporter/test_span_exporter.py (content)
⚔️ packages/nvidia_nat_core/tests/nat/registry_handlers/test_local_handler.py (content)
⚔️ packages/nvidia_nat_core/tests/nat/retriever/test_retrievers.py (content)
⚔️ packages/nvidia_nat_core/tests/nat/runtime/test_runner_trace_ids.py (content)
⚔️ packages/nvidia_nat_core/tests/nat/runtime/test_session_traceparent.py (content)
⚔️ packages/nvidia_nat_core/tests/nat/runtime/test_user_metadata.py (content)
⚔️ packages/nvidia_nat_core/tests/nat/server/test_unified_api_server.py (content)
⚔️ packages/nvidia_nat_core/tests/nat/utils/test_yaml_tools.py (content)
⚔️ packages/nvidia_nat_core/uv.lock (content)
⚔️ packages/nvidia_nat_crewai/src/nat/plugins/crewai/llm.py (content)
⚔️ packages/nvidia_nat_crewai/uv.lock (content)
⚔️ packages/nvidia_nat_data_flywheel/uv.lock (content)
⚔️ packages/nvidia_nat_langchain/src/nat/plugins/langchain/agent/dual_node.py (content)
⚔️ packages/nvidia_nat_langchain/src/nat/plugins/langchain/agent/react_agent/agent.py (content)
⚔️ packages/nvidia_nat_langchain/src/nat/plugins/langchain/agent/tool_calling_agent/agent.py (content)
⚔️ packages/nvidia_nat_langchain/src/nat/plugins/langchain/agent/tool_calling_agent/register.py (content)
⚔️ packages/nvidia_nat_langchain/src/nat/plugins/langchain/llm.py (content)
⚔️ packages/nvidia_nat_langchain/src/nat/plugins/langchain/register.py (content)
⚔️ packages/nvidia_nat_langchain/tests/agent/test_react.py (content)
⚔️ packages/nvidia_nat_langchain/tests/agent/test_tool_calling.py (content)
⚔️ packages/nvidia_nat_langchain/tests/test_llm_langchain.py (content)
⚔️ packages/nvidia_nat_langchain/uv.lock (content)
⚔️ packages/nvidia_nat_llama_index/src/nat/plugins/llama_index/llm.py (content)
⚔️ packages/nvidia_nat_llama_index/uv.lock (content)
⚔️ packages/nvidia_nat_mcp/src/nat/plugins/mcp/utils.py (content)
⚔️ packages/nvidia_nat_mcp/tests/server/test_mcp_client_endpoint.py (content)
⚔️ packages/nvidia_nat_mcp/uv.lock (content)
⚔️ packages/nvidia_nat_mem0ai/uv.lock (content)
⚔️ packages/nvidia_nat_mysql/uv.lock (content)
⚔️ packages/nvidia_nat_nemo_customizer/uv.lock (content)
⚔️ packages/nvidia_nat_openpipe_art/uv.lock (content)
⚔️ packages/nvidia_nat_opentelemetry/uv.lock (content)
⚔️ packages/nvidia_nat_phoenix/uv.lock (content)
⚔️ packages/nvidia_nat_ragaai/uv.lock (content)
⚔️ packages/nvidia_nat_redis/uv.lock (content)
⚔️ packages/nvidia_nat_s3/uv.lock (content)
⚔️ packages/nvidia_nat_semantic_kernel/src/nat/plugins/semantic_kernel/llm.py (content)
⚔️ packages/nvidia_nat_semantic_kernel/tests/test_llm_sk.py (content)
⚔️ packages/nvidia_nat_semantic_kernel/uv.lock (content)
⚔️ packages/nvidia_nat_strands/src/nat/plugins/strands/llm.py (content)
⚔️ packages/nvidia_nat_strands/uv.lock (content)
⚔️ packages/nvidia_nat_test/src/nat/test/plugin.py (content)
⚔️ packages/nvidia_nat_test/uv.lock (content)
⚔️ packages/nvidia_nat_vanna/uv.lock (content)
⚔️ packages/nvidia_nat_weave/src/nat/plugins/weave/fastapi_plugin_worker.py (content)
⚔️ packages/nvidia_nat_weave/uv.lock (content)
⚔️ packages/nvidia_nat_zep_cloud/uv.lock (content)
⚔️ pyproject.toml (content)
⚔️ uv.lock (content)

These conflicts must be resolved before merging into develop.
Resolve conflicts locally and push changes to this branch.
✅ Passed checks (2 passed)
Check name | Status | Explanation
Description Check | ✅ Passed | Check skipped - CodeRabbit's high-level summary is enabled.
Title check | ✅ Passed | The title clearly and concisely describes the main feature being added: token-by-token streaming for the tool_calling_agent, using appropriate imperative mood and staying well within the 72-character limit.


@willkill07 willkill07 added improvement Improvement to existing functionality non-breaking Non-breaking change labels Feb 12, 2026
@willkill07 willkill07 self-assigned this Feb 12, 2026
@MylesShannon MylesShannon force-pushed the feature/tool-calling-agent-streaming branch from 9f827f9 to 973f223 Compare February 12, 2026 16:23

@coderabbitai coderabbitai bot left a comment


Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
packages/nvidia_nat_langchain/tests/agent/test_tool_calling.py (1)

126-166: ⚠️ Potential issue | 🟠 Major

Add tests for the _stream_fn streaming path.

The _stream_fn function in register.py (lines 107–136) implements token-by-token streaming but has no test coverage. Add tests to verify:

  • Yields only AIMessageChunk content from the "agent" node
  • Skips tool-call chunks (msg.tool_call_chunks)
  • Error handling and exception propagation
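A hedged sketch of what such coverage could look like, using a stubbed stream in place of the real graph. The helper `_iter_agent_tokens` and the `FakeChunk` class are hypothetical; an actual test would import `_stream_fn` from register.py and stub the compiled graph:

```python
import asyncio

class FakeChunk:
    """Minimal stand-in for AIMessageChunk in a unit test (hypothetical)."""
    def __init__(self, content, tool_call_chunks=None):
        self.content = content
        self.tool_call_chunks = tool_call_chunks or []

async def _iter_agent_tokens(events):
    # Hypothetical helper mirroring _stream_fn's chunk filtering.
    for message, metadata in events:
        if metadata.get("langgraph_node") != "agent":
            continue
        if message.tool_call_chunks:
            continue
        if message.content:
            yield message.content

def test_stream_skips_tool_call_chunks():
    events = [
        (FakeChunk("Hi"), {"langgraph_node": "agent"}),
        (FakeChunk("", tool_call_chunks=[{"name": "calc"}]),
         {"langgraph_node": "agent"}),
        (FakeChunk("!"), {"langgraph_node": "agent"}),
    ]

    async def collect():
        return [tok async for tok in _iter_agent_tokens(events)]

    # Tool-call chunks are dropped; only token content is yielded.
    assert asyncio.run(collect()) == ["Hi", "!"]

test_stream_skips_tool_call_chunks()
```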
🧹 Nitpick comments (2)
packages/nvidia_nat_langchain/src/nat/plugins/langchain/agent/tool_calling_agent/agent.py (1)

103-103: In-place mutation of the caller's config dict.

config["configurable"] = ... mutates the dict passed by the caller. In the LangGraph runtime this is typically safe since configs are created per invocation, but direct callers (including tests) could be surprised if they reuse the same dict. Consider working with a shallow copy to be defensive.

♻️ Defensive copy
-            config["configurable"] = {**(config.get("configurable") or {}), "__pregel_runtime": DEFAULT_RUNTIME}
+            config = {**config, "configurable": {**(config.get("configurable") or {}), "__pregel_runtime": DEFAULT_RUNTIME}}
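The hazard is easy to demonstrate with plain dicts, independent of LangGraph (a minimal illustration; the function names below are for demonstration only):

```python
def augment_in_place(config, runtime):
    # Mutates the caller's dict: the injected key leaks back out.
    config["configurable"] = {**(config.get("configurable") or {}),
                              "__pregel_runtime": runtime}
    return config

def augment_copied(config, runtime):
    # Builds a new dict; the caller's object is left untouched.
    return {**config,
            "configurable": {**(config.get("configurable") or {}),
                             "__pregel_runtime": runtime}}

caller_config = {"configurable": {"thread_id": "t1"}}
augment_in_place(caller_config, "rt")
print("__pregel_runtime" in caller_config["configurable"])  # -> True (leaked)

fresh = {"configurable": {"thread_id": "t2"}}
merged = augment_copied(fresh, "rt")
print("__pregel_runtime" in fresh["configurable"])   # -> False
print("__pregel_runtime" in merged["configurable"])  # -> True
```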
packages/nvidia_nat_langchain/src/nat/plugins/langchain/agent/tool_calling_agent/register.py (1)

126-162: Duplicated state-initialization logic with _response_fn.

Lines 138-148 are nearly identical to lines 100-110 in _response_fn (convert input, trim_messages, build ToolCallAgentGraphState). Extract a shared helper to reduce drift risk between the two paths.

♻️ Extract shared helper
+    def _build_initial_state(chat_request_or_message: ChatRequestOrMessage) -> ToolCallAgentGraphState:
+        chat_request = GlobalTypeConverter.get().convert(chat_request_or_message, to_type=ChatRequest)
+        messages: list[BaseMessage] = trim_messages(
+            messages=[m.model_dump() for m in chat_request.messages],
+            max_tokens=config.max_history,
+            strategy="last",
+            token_counter=len,
+            start_on="human",
+            include_system=True,
+        )
+        return ToolCallAgentGraphState(messages=messages)
+
     async def _response_fn(chat_request_or_message: ChatRequestOrMessage) -> str:
         ...
         try:
-            message = GlobalTypeConverter.get().convert(chat_request_or_message, to_type=ChatRequest)
-
-            # initialize the starting state with the user query
-            messages: list[BaseMessage] = trim_messages(messages=[m.model_dump() for m in message.messages],
-                                                        max_tokens=config.max_history,
-                                                        strategy="last",
-                                                        token_counter=len,
-                                                        start_on="human",
-                                                        include_system=True)
-            state = ToolCallAgentGraphState(messages=messages)
+            state = _build_initial_state(chat_request_or_message)
             ...

Apply the same replacement inside _stream_fn.


@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 2

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
packages/nvidia_nat_langchain/src/nat/plugins/langchain/agent/tool_calling_agent/register.py (1)

170-170: ⚠️ Potential issue | 🟡 Minor

Stale log message references "react_agent" instead of "tool_calling_agent".

This appears to be a copy-paste artifact. The cleanup log should reference the correct workflow name.

Proposed fix
-        logger.debug("%s Cleaning up react_agent workflow.", AGENT_LOG_PREFIX)
+        logger.debug("%s Cleaning up tool_calling_agent workflow.", AGENT_LOG_PREFIX)
🤖 Fix all issues with AI agents
In
`@packages/nvidia_nat_langchain/src/nat/plugins/langchain/agent/tool_calling_agent/agent.py`:
- Line 98: The public async method agent_node on the ToolCallAgentGraphState
flow is missing a return type annotation; open the async def agent_node(self,
state: ToolCallAgentGraphState, config: RunnableConfig) implementation,
determine the actual value it returns (or that it returns nothing), and add an
explicit return type hint (e.g., -> None if it returns nothing, or ->
Awaitable[ActualType] / -> Coroutine[Any, Any, ActualType] / -> ActualType if
synchronous) using typing imports as needed; ensure the annotation references
the real return type rather than omitting typing (update imports for
Any/Awaitable/Coroutine if used).
- Line 103: The code currently mutates the caller's config dict in place by
assigning config["configurable"] = ... which can leak state; instead create a
shallow copy of the config (or the configurable sub-dict) and set the
"__pregel_runtime" there, or call LangGraph's merge_configs / patch_config
helper with DEFAULT_RUNTIME to produce a new config object; specifically, avoid
modifying the passed-in config variable and update the call site to use the
new_config (or merged result) so the original config, DEFAULT_RUNTIME, and the
"__pregel_runtime" key are handled without in-place mutation.
🧹 Nitpick comments (1)
packages/nvidia_nat_langchain/src/nat/plugins/langchain/agent/tool_calling_agent/register.py (1)

126-162: Duplicated message-conversion and state-initialization logic.

Lines 138–148 are nearly identical to lines 100–110 in _response_fn. Consider extracting a small helper (e.g., _prepare_state(chat_request_or_message)) to DRY this up, which also ensures any future changes to trimming or conversion are applied consistently.


@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🤖 Fix all issues with AI agents
In
`@packages/nvidia_nat_langchain/src/nat/plugins/langchain/agent/react_agent/agent.py`:
- Line 158: The agent_node method currently accepts a config: RunnableConfig but
never uses it; fix by merging/forwarding the incoming config into the
RunnableConfig instances you build inside agent_node (instead of constructing
fresh ones), so runtime callbacks (e.g., streaming callbacks from LangGraph)
propagate into the ReAct path. Locate the two spots where you create new
RunnableConfig objects inside agent_node and replace those with merged configs
(e.g., new_config = existing_runnable_config.merge(config) or use the
appropriate RunnableConfig.merge/with_* helper used by the tool-calling agent)
so that callbacks and other runtime options from the incoming config are
preserved.

In `@packages/nvidia_nat_langchain/tests/agent/test_react.py`:
- Line 110: The test line calling mock_react_agent_no_raise.agent_node is longer
than the 120-char yapf limit; split the call across multiple lines to respect
column_limit by e.g. creating the ReActGraphState and/or config as separate
variables or formatting the await call with each argument on its own line so
that mock_react_agent_no_raise.agent_node(...),
ReActGraphState(messages=[HumanMessage('hi')]) and config={"configurable": {}}
are wrapped across lines; update the lines around the call to agent_node (the
test using mock_react_agent_no_raise.agent_node and
ReActGraphState/HumanMessage) similarly for the other long lines (1066, 1087,
1097).
🧹 Nitpick comments (1)
packages/nvidia_nat_langchain/tests/agent/test_react.py (1)

105-105: Consider extracting the repeated config dict into a fixture or module-level constant.

The literal config={"configurable": {}} is repeated across ~20 test call sites. A shared constant or fixture would reduce duplication and make it easier to update if the default config shape changes.

Example
# At module level or as a fixture:
EMPTY_CONFIG = {"configurable": {}}

# Then in tests:
await mock_react_agent.agent_node(state, config=EMPTY_CONFIG)

Also applies to: 110-110, 121-121

@MylesShannon MylesShannon force-pushed the feature/tool-calling-agent-streaming branch 3 times, most recently from 7d42c02 to 38bb137 Compare February 18, 2026 17:39
Enable real-time SSE streaming for the tool_calling_agent via the
OpenAI-compatible /v1/chat/completions endpoint.

Changes:
- Add _stream_fn using graph.astream(stream_mode="messages") to yield
  individual LLM tokens, registered via FunctionInfo.create()
- Switch agent_node from ainvoke to astream so LangGraph's message
  streaming hooks can observe individual tokens from the LLM
- Fix ChatResponseChunk.from_string() to use finish_reason=None
  (OpenAI spec: only the final chunk should have finish_reason="stop")
- Add final stop chunk and data: [DONE] sentinel to SSE stream
- Add test for graph.astream message streaming path

Signed-off-by: Myles Shannon <mshannon@nvidia.com>
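The finish_reason fix and the `data: [DONE]` sentinel described in the commit message can be sketched as the following SSE framing (the chunk serialization follows the OpenAI chat-completions format; the helper name and exact fields are assumptions, not the PR's actual implementation):

```python
import json

# OpenAI-spec finish reasons; per the fix, intermediate chunks carry None
# and only the final chunk carries the real reason.
FINISH_REASONS = {"stop", "length", "tool_calls", "content_filter"}

def sse_chunks(tokens, final_reason="stop"):
    """Frame tokens as SSE lines: content chunks with finish_reason=None,
    one final chunk with the finish reason, then the [DONE] sentinel."""
    assert final_reason in FINISH_REASONS
    lines = []
    for tok in tokens:
        lines.append("data: " + json.dumps(
            {"choices": [{"delta": {"content": tok}, "finish_reason": None}]}))
    lines.append("data: " + json.dumps(
        {"choices": [{"delta": {}, "finish_reason": final_reason}]}))
    lines.append("data: [DONE]")
    return lines

frames = sse_chunks(["Hel", "lo"])
print(frames[-1])  # -> data: [DONE]
```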
@MylesShannon MylesShannon force-pushed the feature/tool-calling-agent-streaming branch from 38bb137 to 154b826 Compare February 18, 2026 17:43
@willkill07
Member

/ok to test 154b826


@willkill07 willkill07 left a comment


Minor changes, but otherwise LGTM!

- Add optional finish_reason param to ChatResponseChunk.from_string()
  with FINISH_REASONS validation
- Move AIMessageChunk import to top of tool_calling_agent_workflow
- Remove unnecessary try/except/finally around FunctionInfo.create yield

Signed-off-by: Myles Shannon <mshannon@nvidia.com>
@willkill07
Member

/ok to test 8869d68

@willkill07
Member

/merge

@rapids-bot rapids-bot bot merged commit c044089 into NVIDIA:develop Feb 18, 2026
17 checks passed
