
Sandbox nested persona template rendering in evaluation prompts#5266

Open
petrmarinec wants to merge 2 commits into google:main from petrmarinec:fix-jinja-ssti-sandbox

Conversation

@petrmarinec

Link to Issue or Description of Change

1. Link to an existing issue (if applicable):

2. Or, if no issue exists, describe the change:

Problem:
Nested persona behavior strings were rendered through render_string_filter, which compiled a fresh Jinja template from persona-controlled content instead of consistently reusing a sandboxed environment. As a result, nested persona templates could execute outside the intended sandbox boundary during evaluation prompt construction.

Affected files:

  • src/google/adk/evaluation/simulation/llm_backed_user_simulator_prompts.py
  • src/google/adk/evaluation/simulation/per_turn_user_simulator_quality_prompts.py
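The vulnerable pattern can be illustrated with a minimal sketch. Note that `render_string_filter_unsandboxed` is an illustrative stand-in, not the actual ADK source; it only demonstrates why compiling persona-controlled text in an unsandboxed jinja2 Environment is dangerous:

```python
# Hypothetical sketch of the pre-fix pattern (names are illustrative,
# not the actual ADK source): persona-controlled text becomes template
# source in a plain, unsandboxed Environment.
from jinja2 import Environment


def render_string_filter_unsandboxed(value: str, **context) -> str:
    # `value` is attacker-influenced, yet it is compiled as a template.
    return Environment().from_string(value).render(**context)


# A hostile persona string reaches Python internals via attribute traversal.
leaked = render_string_filter_unsandboxed("{{ ''.__class__.__mro__ }}")
# `leaked` now contains the str MRO, including <class 'object'> -- the
# classic starting point for SSTI gadget chains.
```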

Solution:
Render nested persona strings through jinja2's SandboxedEnvironment, and use the same sandboxed environment for the per-turn evaluator prompt builder. This keeps supported nested placeholders such as {{ stop_signal }} working while blocking unsafe nested template access. Regression tests cover both the safe interpolation path and the blocked unsafe attribute traversal.
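A minimal sketch of the fixed rendering path, assuming a helper named `render_string_filter` (the name and module-level environment are illustrative, not the exact ADK code):

```python
# Sketch of the patched pattern: all nested persona strings render
# through one shared SandboxedEnvironment.
from jinja2.exceptions import SecurityError
from jinja2.sandbox import SandboxedEnvironment

_SANDBOX = SandboxedEnvironment()


def render_string_filter(value: str, **context) -> str:
    # Persona-controlled text is still templated, but inside the sandbox.
    return _SANDBOX.from_string(value).render(**context)


# Supported nested placeholders keep working.
safe = render_string_filter(
    "Stop when you see {{ stop_signal }}.", stop_signal="DONE"
)
assert safe == "Stop when you see DONE."

# Unsafe attribute traversal is blocked at render time.
blocked = False
try:
    render_string_filter("{{ ''.__class__.__mro__ }}")
except SecurityError:
    blocked = True
```

The sandbox refuses access to underscore-prefixed attributes, so traversal chains like `__class__.__mro__` fail with SecurityError while plain variable interpolation is unaffected.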

Testing Plan

Unit Tests:

  • I have added or updated unit tests for my change.
  • All unit tests pass locally.

Passed in clean Linux Docker (python:3.11-bookworm):

  • uv sync --all-extras
  • pytest tests/unittests/evaluation/simulation
  • Result: 70 passed

Additional repo-wide validation:

  • pytest tests/unittests
  • Result on patched branch: 5326 passed, 1 skipped, 5 failed
  • The same 5 failures reproduce on unmodified origin/main
  • Those failures are unrelated integration-test timeouts in tests/unittests/tools/test_skill_toolset.py

Manual End-to-End (E2E) Tests:

  • Ran a live adk web regression test against the eval API in Linux Docker using a local non-LLM root agent.
  • Malicious nested persona template {{ ''.__class__.__mro__ }} was blocked during prompt construction with jinja2.exceptions.SecurityError.
  • A safe persona using nested {{ stop_signal }} placeholders did not raise TemplateSyntaxError or SecurityError and progressed beyond prompt rendering into a real Gemini model call.
  • The safe run did not fully complete because the test key hit 429 RESOURCE_EXHAUSTED, but the absence of template errors and the subsequent model call confirm the sandboxed nested rendering path is functioning as intended.
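The two behaviors exercised above can be captured as a pytest-style regression sketch. `render_persona` is a hypothetical stand-in for the patched helper, not the actual test file from this PR:

```python
# Regression-test sketch (pytest style) for the sandboxed nested rendering
# path; `render_persona` is a hypothetical stand-in for the patched helper.
import pytest
from jinja2.exceptions import SecurityError
from jinja2.sandbox import SandboxedEnvironment


def render_persona(template_text: str, **context) -> str:
    return SandboxedEnvironment().from_string(template_text).render(**context)


def test_safe_nested_placeholder_still_interpolates():
    out = render_persona(
        "Reply with {{ stop_signal }} when finished.", stop_signal="<END>"
    )
    assert out == "Reply with <END> when finished."


def test_unsafe_attribute_traversal_is_blocked():
    with pytest.raises(SecurityError):
        render_persona("{{ ''.__class__.__mro__ }}")
```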

Checklist

  • I have read the CONTRIBUTING.md document.
  • I have performed a self-review of my own code.
  • I have commented my code, particularly in hard-to-understand areas.
  • I have added tests that prove my fix is effective or that my feature works.
  • New and existing unit tests pass locally with my changes.
  • I have manually tested my changes end-to-end.
  • Any dependent changes have been merged and published in downstream modules.

Additional context

This is a small, focused fix intended to close the nested-template sandbox bypass path without changing the supported nested placeholder behavior used by existing personas.

@adk-bot adk-bot added the eval [Component] This issue is related to evaluation label Apr 10, 2026
@petrmarinec (Author)

For clarity, I also reran the remaining failing full-suite tests on unmodified origin/main in the same Docker-based setup.

The only full-suite failures I saw were these 5 test_skill_toolset integration tests:

  • tests/unittests/tools/test_skill_toolset.py::test_integration_python_stdout
  • tests/unittests/tools/test_skill_toolset.py::test_integration_python_sys_exit_zero
  • tests/unittests/tools/test_skill_toolset.py::test_integration_shell_stdout_and_stderr
  • tests/unittests/tools/test_skill_toolset.py::test_integration_shell_stderr_only
  • tests/unittests/tools/test_skill_toolset.py::test_integration_shell_nonzero_exit

Those same 5 failures reproduce on unmodified origin/main, so they are not introduced by this PR.

