feat(agent): add token-by-token streaming to tool_calling_agent #1595
Conversation
Note: Reviews paused. It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review.
Walkthrough: agent_node signatures were extended to accept a RunnableConfig; tool calling now streams via astream(stream_mode="messages").
Sequence Diagram
sequenceDiagram
participant Caller
participant Agent as ToolCallAgentGraph.agent_node
participant Graph as Graph.astream
participant State as state/messages
Caller->>Agent: call agent_node(state, config)
Agent->>Agent: merge/augment config (inject __pregel_runtime)
Agent->>Graph: call astream(..., config=run_config, stream_mode="messages")
loop per streamed chunk
Graph-->>Agent: AIMessageChunk (agent-directed) content
Agent->>Agent: accumulate chunk into response
end
Agent->>Agent: validate response non-empty
Agent->>State: append final response to state.messages
Agent->>Caller: return updated state
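For orientation, here is a minimal sketch of what the streamed agent_node in this diagram could look like. It assumes the names from this PR (ToolCallAgentGraphState, DEFAULT_RUNTIME, the "__pregel_runtime" key) and uses simplified stand-ins for them; it is not the merged implementation in agent.py.

```python
# Illustrative sketch only: ToolCallAgentGraph, ToolCallAgentGraphState and DEFAULT_RUNTIME
# are simplified stand-ins for the real classes/objects referenced in this PR.
from dataclasses import dataclass, field

from langchain_core.messages import AIMessageChunk, BaseMessage
from langchain_core.runnables import RunnableConfig

DEFAULT_RUNTIME = object()  # placeholder for the runtime object LangGraph expects


@dataclass
class ToolCallAgentGraphState:  # simplified stand-in for the real state model
    messages: list[BaseMessage] = field(default_factory=list)


class ToolCallAgentGraph:  # simplified stand-in for the real agent graph class
    def __init__(self, agent):
        self.agent = agent  # a tool-bound chat model (Runnable)

    async def agent_node(self, state: ToolCallAgentGraphState,
                         config: RunnableConfig) -> ToolCallAgentGraphState:
        # Copy the config before injecting the runtime key so the caller's dict is untouched.
        run_config: RunnableConfig = {
            **config,
            "configurable": {**(config.get("configurable") or {}),
                             "__pregel_runtime": DEFAULT_RUNTIME},
        }

        # Stream the LLM so LangGraph's "messages" stream mode can observe individual
        # tokens, while still accumulating the chunks into one final message.
        response: AIMessageChunk | None = None
        async for chunk in self.agent.astream(state.messages, config=run_config):
            response = chunk if response is None else response + chunk

        if response is None or not response.content:
            raise RuntimeError("The agent LLM returned an empty response")

        state.messages.append(response)  # AIMessageChunk is a subclass of AIMessage
        return state
```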
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~45 minutes
🚥 Pre-merge checks: ✅ 2 | ❌ 2
❌ Failed checks (2 warnings)
✅ Passed checks (2 passed)
Branch force-pushed from 9f827f9 to 973f223
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
packages/nvidia_nat_langchain/tests/agent/test_tool_calling.py (1)
126-166: ⚠️ Potential issue | 🟠 Major
Add tests for the `_stream_fn` streaming path.
The `_stream_fn` function in `register.py` (lines 107–136) implements token-by-token streaming but has no test coverage. Add tests to verify:
- Yields only `AIMessageChunk` content from the "agent" node
- Skips tool-call chunks (`msg.tool_call_chunks`)
- Error handling and exception propagation
A sketch of such a test follows below.
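A self-contained sketch of what such a test could look like. collect_agent_tokens is a hypothetical stand-in written from this PR's description of _stream_fn, and the fake (message, metadata) stream mimics graph.astream(stream_mode="messages"); the real test would drive register.py's _stream_fn through the project's workflow fixtures instead.

```python
# Hypothetical test sketch; collect_agent_tokens and fake_message_stream are illustrative
# stand-ins, not code from register.py or test_tool_calling.py.
import asyncio

from langchain_core.messages import AIMessageChunk, ToolMessage


async def collect_agent_tokens(message_stream):
    """Mimic the filter _stream_fn applies: agent-node AIMessageChunks only, no tool calls."""
    tokens = []
    async for msg, metadata in message_stream:
        if metadata.get("langgraph_node") != "agent":
            continue
        if isinstance(msg, AIMessageChunk) and msg.content and not msg.tool_call_chunks:
            tokens.append(msg.content)
    return tokens


async def fake_message_stream():
    """Shaped like graph.astream(..., stream_mode="messages"): (message, metadata) pairs."""
    yield AIMessageChunk(content="Hel"), {"langgraph_node": "agent"}
    yield AIMessageChunk(content="lo"), {"langgraph_node": "agent"}
    # A tool-call chunk from the agent node and a tool-node message should both be skipped.
    yield (AIMessageChunk(content="",
                          tool_call_chunks=[{"name": "search", "args": "", "id": "call-1",
                                             "index": 0, "type": "tool_call_chunk"}]),
           {"langgraph_node": "agent"})
    yield ToolMessage(content="tool output", tool_call_id="call-1"), {"langgraph_node": "tool"}


async def test_stream_yields_only_agent_token_chunks():
    assert await collect_agent_tokens(fake_message_stream()) == ["Hel", "lo"]


if __name__ == "__main__":
    asyncio.run(test_stream_yields_only_agent_token_chunks())
```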
🧹 Nitpick comments (2)
packages/nvidia_nat_langchain/src/nat/plugins/langchain/agent/tool_calling_agent/agent.py (1)
103-103: In-place mutation of the caller's `config` dict.
`config["configurable"] = ...` mutates the dict passed by the caller. In the LangGraph runtime this is typically safe since configs are created per invocation, but direct callers (including tests) could be surprised if they reuse the same dict. Consider working with a shallow copy to be defensive.
♻️ Defensive copy
- config["configurable"] = {**(config.get("configurable") or {}), "__pregel_runtime": DEFAULT_RUNTIME} + config = {**config, "configurable": {**(config.get("configurable") or {}), "__pregel_runtime": DEFAULT_RUNTIME}}packages/nvidia_nat_langchain/src/nat/plugins/langchain/agent/tool_calling_agent/register.py (1)
126-162: Duplicated state-initialization logic with `_response_fn`.
Lines 138–148 are nearly identical to lines 100–110 in `_response_fn` (convert input, `trim_messages`, build `ToolCallAgentGraphState`). Extract a shared helper to reduce drift risk between the two paths.
♻️ Extract shared helper
+ def _build_initial_state(chat_request_or_message: ChatRequestOrMessage) -> ToolCallAgentGraphState:
+     chat_request = GlobalTypeConverter.get().convert(chat_request_or_message, to_type=ChatRequest)
+     messages: list[BaseMessage] = trim_messages(
+         messages=[m.model_dump() for m in chat_request.messages],
+         max_tokens=config.max_history,
+         strategy="last",
+         token_counter=len,
+         start_on="human",
+         include_system=True,
+     )
+     return ToolCallAgentGraphState(messages=messages)
+
  async def _response_fn(chat_request_or_message: ChatRequestOrMessage) -> str:
      ...
      try:
-         message = GlobalTypeConverter.get().convert(chat_request_or_message, to_type=ChatRequest)
-
-         # initialize the starting state with the user query
-         messages: list[BaseMessage] = trim_messages(messages=[m.model_dump() for m in message.messages],
-                                                     max_tokens=config.max_history,
-                                                     strategy="last",
-                                                     token_counter=len,
-                                                     start_on="human",
-                                                     include_system=True)
-         state = ToolCallAgentGraphState(messages=messages)
+         state = _build_initial_state(chat_request_or_message)
          ...

Apply the same replacement inside `_stream_fn`.
Actionable comments posted: 2
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
packages/nvidia_nat_langchain/src/nat/plugins/langchain/agent/tool_calling_agent/register.py (1)
170-170: ⚠️ Potential issue | 🟡 Minor
Stale log message references "react_agent" instead of "tool_calling_agent".
This appears to be a copy-paste artifact. The cleanup log should reference the correct workflow name.
Proposed fix
- logger.debug("%s Cleaning up react_agent workflow.", AGENT_LOG_PREFIX)
+ logger.debug("%s Cleaning up tool_calling_agent workflow.", AGENT_LOG_PREFIX)
🤖 Fix all issues with AI agents
In `@packages/nvidia_nat_langchain/src/nat/plugins/langchain/agent/tool_calling_agent/agent.py`:
- Line 98: The public async method agent_node is missing a return type annotation. Open the async def agent_node(self, state: ToolCallAgentGraphState, config: RunnableConfig) implementation, determine the actual value it returns (or confirm that it returns nothing), and add an explicit return type hint. For an async def the annotation names the awaited value, e.g. -> None if it returns nothing or -> ToolCallAgentGraphState if it returns the updated state; update typing imports as needed and make sure the annotation references the real return type rather than omitting it. See the sketch after this list.
- Line 103: The code currently mutates the caller's config dict in place by assigning config["configurable"] = ..., which can leak state between calls. Instead, create a shallow copy of the config (or of the configurable sub-dict) and set the "__pregel_runtime" key there, or call a merge_configs / patch_config helper with DEFAULT_RUNTIME to produce a new config object. Avoid modifying the passed-in config variable, and update the call site to use the new (merged) config so the original config, DEFAULT_RUNTIME, and the "__pregel_runtime" key are handled without in-place mutation. See the sketch after this list.
🧹 Nitpick comments (1)
packages/nvidia_nat_langchain/src/nat/plugins/langchain/agent/tool_calling_agent/register.py (1)
126-162: Duplicated message-conversion and state-initialization logic.
Lines 138–148 are nearly identical to lines 100–110 in `_response_fn`. Consider extracting a small helper (e.g., `_prepare_state(chat_request_or_message)`) to DRY this up, which also ensures any future changes to trimming or conversion are applied consistently.
Actionable comments posted: 2
🤖 Fix all issues with AI agents
In `@packages/nvidia_nat_langchain/src/nat/plugins/langchain/agent/react_agent/agent.py`:
- Line 158: The agent_node method currently accepts a config: RunnableConfig but never uses it. Fix this by merging/forwarding the incoming config into the RunnableConfig instances built inside agent_node (instead of constructing fresh ones), so runtime callbacks (e.g., streaming callbacks from LangGraph) propagate into the ReAct path. Locate the two spots where new RunnableConfig objects are created inside agent_node and replace them with merged configs (e.g., new_config = merge_configs(existing_runnable_config, config), or whichever merge helper the tool-calling agent uses), so that callbacks and other runtime options from the incoming config are preserved. See the sketch after this list.
In `@packages/nvidia_nat_langchain/tests/agent/test_react.py`:
- Line 110: The test line calling mock_react_agent_no_raise.agent_node is longer than the 120-character yapf limit. Split the call across multiple lines to respect column_limit, e.g. by creating the ReActGraphState and/or config as separate variables, or by formatting the await call with each argument on its own line, so that mock_react_agent_no_raise.agent_node(...), ReActGraphState(messages=[HumanMessage('hi')]) and config={"configurable": {}} are wrapped across lines. Apply the same treatment to the other long lines around the calls to agent_node (lines 1066, 1087, 1097). See the sketch after this list.
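A sketch covering both items, assuming the project's async pytest setup; the import path for ReActGraphState is inferred from the file layout shown above, and the test name and assertion are illustrative only.

```python
# Hypothetical sketch; adapt the import path, fixture usage and assertion to the real
# test module. mock_react_agent_no_raise is the fixture named in the comments above.
import pytest
from langchain_core.messages import HumanMessage

from nat.plugins.langchain.agent.react_agent.agent import ReActGraphState  # assumed path


@pytest.mark.asyncio
async def test_agent_node_wrapped_call(mock_react_agent_no_raise):
    # Each argument on its own line keeps the call under yapf's 120-column limit.
    state = ReActGraphState(messages=[HumanMessage("hi")])
    response = await mock_react_agent_no_raise.agent_node(
        state,
        config={"configurable": {}},
    )
    assert response is not None


# Inside react_agent's agent_node, the incoming config would be merged into the locally
# built RunnableConfig (e.g. with langchain_core's merge_configs) rather than replaced,
# so LangGraph's streaming callbacks propagate through the ReAct path as well.
```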
🧹 Nitpick comments (1)
packages/nvidia_nat_langchain/tests/agent/test_react.py (1)
105-105: Consider extracting the repeated `config` dict into a fixture or module-level constant.
The literal `config={"configurable": {}}` is repeated across ~20 test call sites. A shared constant or fixture would reduce duplication and make it easier to update if the default config shape changes.
Example
# At module level or as a fixture:
EMPTY_CONFIG = {"configurable": {}}

# Then in tests:
await mock_react_agent.agent_node(state, config=EMPTY_CONFIG)

Also applies to: 110-110, 121-121
Branch force-pushed from 7d42c02 to 38bb137
Enable real-time SSE streaming for the tool_calling_agent via the OpenAI-compatible /v1/chat/completions endpoint.

Changes:
- Add _stream_fn using graph.astream(stream_mode="messages") to yield individual LLM tokens, registered via FunctionInfo.create()
- Switch agent_node from ainvoke to astream so LangGraph's message streaming hooks can observe individual tokens from the LLM
- Fix ChatResponseChunk.from_string() to use finish_reason=None (OpenAI spec: only the final chunk should have finish_reason="stop")
- Add final stop chunk and data: [DONE] sentinel to SSE stream
- Add test for graph.astream message streaming path

Signed-off-by: Myles Shannon <mshannon@nvidia.com>
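For context, this is roughly the OpenAI-compatible SSE framing the commit message refers to. The helper below only illustrates the convention (finish_reason=None on token chunks, "stop" on the final chunk, then a [DONE] sentinel); it is not NAT's ChatResponseChunk API, and to_sse is a hypothetical name.

```python
# Illustrative only: shows the OpenAI chat-completion streaming convention, not NAT's API.
import json
from collections.abc import AsyncIterator


async def to_sse(token_stream: AsyncIterator[str],
                 model: str = "example-model") -> AsyncIterator[str]:
    async for token in token_stream:
        # Intermediate token chunks carry the delta content and finish_reason=None.
        chunk = {"object": "chat.completion.chunk",
                 "model": model,
                 "choices": [{"index": 0, "delta": {"content": token}, "finish_reason": None}]}
        yield f"data: {json.dumps(chunk)}\n\n"

    # Only the final chunk carries finish_reason="stop", followed by the [DONE] sentinel.
    final = {"object": "chat.completion.chunk",
             "model": model,
             "choices": [{"index": 0, "delta": {}, "finish_reason": "stop"}]}
    yield f"data: {json.dumps(final)}\n\n"
    yield "data: [DONE]\n\n"
```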
Branch force-pushed from 38bb137 to 154b826
/ok to test 154b826
willkill07 left a comment
Minor changes, but otherwise LGTM!
- Add optional finish_reason param to ChatResponseChunk.from_string() with FINISH_REASONS validation
- Move AIMessageChunk import to top of tool_calling_agent_workflow
- Remove unnecessary try/except/finally around FunctionInfo.create yield

Signed-off-by: Myles Shannon <mshannon@nvidia.com>
/ok to test 8869d68

/merge
Description
This PR enables token-by-token streaming for the Tool Calling Agent when `stream=true` is set on chat completion requests. Previously, streaming requests returned the full response in a single Server-Sent Event; responses are now streamed incrementally as LLM tokens are produced.

Changes:
- `register.py`: Added a streaming entry point `_stream_fn` that uses LangGraph `astream(stream_mode="messages")` to consume message chunks from the graph and yield only `AIMessageChunk` content from the agent node (excluding tool-call chunks). The Tool Calling Agent is now registered with `FunctionInfo.create(single_fn=_response_fn, stream_fn=_stream_fn, ...)` so both single-call and streaming are supported.
- `agent.py`: The agent node now accepts `RunnableConfig`, forwards it to the underlying LLM, and uses `self.agent.astream()` instead of `ainvoke()` so the LLM streams. Chunks are accumulated into the final `AIMessage` for non-streaming paths, and config is merged so streaming callbacks propagate correctly.
- `test_tool_calling.py`: Updated so `agent_node` is called with a mock `config` argument to match the new signature.

No new user-facing documentation was added; existing docstrings and inline comments were updated to describe the streaming behavior.
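A condensed sketch of the streaming path described above, with the compiled graph, initial state and config passed in as parameters for illustration; the real `_stream_fn` in register.py also converts the incoming request, trims history and handles errors.

```python
# Simplified stand-in for register.py's _stream_fn; stream_agent_tokens is a hypothetical
# name, and the real function takes a ChatRequestOrMessage rather than graph/state/config.
from collections.abc import AsyncIterator

from langchain_core.messages import AIMessageChunk


async def stream_agent_tokens(graph, state, config) -> AsyncIterator[str]:
    # stream_mode="messages" yields (message chunk, metadata) pairs as the LLM produces tokens.
    async for msg, metadata in graph.astream(state, config=config, stream_mode="messages"):
        if metadata.get("langgraph_node") != "agent":
            continue  # ignore output emitted by tool nodes
        if isinstance(msg, AIMessageChunk) and msg.content and not msg.tool_call_chunks:
            yield msg.content  # plain token text; tool-call chunks are skipped


# Both entry points are then registered together, as described above:
# yield FunctionInfo.create(single_fn=_response_fn, stream_fn=_stream_fn, ...)
```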
By Submitting this PR I confirm:
Summary by CodeRabbit
New Features
Refactor
Bug Fixes
Tests