Sandbox nested persona template rendering in evaluation prompt builders #5265

@petrmarinec

Summary

Nested persona behavior strings are rendered as Jinja templates during evaluation prompt construction. The affected code paths did not consistently keep that nested rendering inside a SandboxedEnvironment.

Affected code

  • src/google/adk/evaluation/simulation/llm_backed_user_simulator_prompts.py
  • src/google/adk/evaluation/simulation/per_turn_user_simulator_quality_prompts.py

Problem

Both prompt builders support nested rendering of persona behavior fields through render_string_filter.

In llm_backed_user_simulator_prompts.py, the outer prompt used SandboxedEnvironment, but nested string rendering created a new template from the persona string instead of reusing the sandboxed environment.

In per_turn_user_simulator_quality_prompts.py, nested persona rendering followed the same pattern, and the outer environment was not sandboxed either.

As a result, nested persona strings could evaluate Jinja expressions outside the intended sandbox boundary.

Expected behavior

Nested persona strings should only render through a sandboxed Jinja environment, so supported placeholders such as {{ stop_signal }} still work but unsafe attribute traversal and similar sandbox escapes do not.
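As a rough illustration of the expected behavior, the sketch below renders two strings directly through jinja2.sandbox.SandboxedEnvironment. The template text and the stop_signal value are made up for the example and are not taken from the ADK prompts.

```python
from jinja2.exceptions import SecurityError
from jinja2.sandbox import SandboxedEnvironment

env = SandboxedEnvironment()

# Supported placeholder: plain variable interpolation is allowed.
safe = env.from_string("End the chat with {{ stop_signal }}.").render(
    stop_signal="</finished>"  # illustrative value only
)

# Unsafe attribute traversal: chaining through dunder attributes is
# rejected by the sandbox with SecurityError.
try:
    env.from_string("{{ ''.__class__.__mro__ }}").render()
    blocked = False
except SecurityError:
    blocked = True

print(safe, blocked)
```

Note that a single unsafe attribute access such as {{ ''.__class__ }} renders as an empty string in the sandbox. The SecurityError surfaces once the expression tries to use the result further, so a regression test should chain at least one more access or call.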

Proposed fix

  • Reuse SandboxedEnvironment for nested persona rendering instead of creating a fresh unsandboxed template.
  • Use SandboxedEnvironment in the per-turn evaluator prompt builder as well.
  • Add regression tests that verify:
    • supported nested placeholders still render
    • unsafe nested template expressions are blocked

Validation

I have a PR prepared that:

  • switches both prompt builders to sandboxed nested rendering
  • adds regression tests for allowed interpolation and blocked unsafe access
  • passes tests/unittests/evaluation/simulation in a clean Linux Docker container
  • confirms in a live adk web eval run that:
    • malicious nested persona templates are blocked with SecurityError
    • safe nested placeholders proceed past prompt rendering into normal model execution

Labels: eval ([Component] This issue is related to evaluation)