refactor: introduce AgentCliGenerator base to unify CLI generator handling #415

Open
omkargaikwad23 wants to merge 4 commits into main from AgentCliGenerator

@omkargaikwad23 (Collaborator) commented Jun 5, 2026

Summary

  • Add AgentCliGenerator base class (ABC) that the four CLI generators
    (gemini_cli, claude_code, codex_cli, agy_cli) now inherit from, declaring
    the shared interface: version, create_command, safe_generate,
    parse_response, extract_tools, extract_skills.
  • Unify create_command to one signature across all four generators so the
    evaluator no longer branches on concrete type.
  • Simplify AgentEvaluator: collapse the four isinstance checks against a
    4-tuple of concrete classes into a single
    isinstance(self.generator, AgentCliGenerator), replace the per-type
    version chain with self.generator.version, and drop all concrete CLI
    imports.
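
As a rough illustration of the shape described in the summary, the sketch below declares an ABC with the six listed members and a single evaluator-side isinstance check. The member names come from the summary. Everything else is an assumption made for the example: the signatures, the use of a property for version, the EchoCliGenerator subclass, and the generator_label helper. None of it is taken from the actual diff.

```python
from abc import ABC, abstractmethod


class AgentCliGenerator(ABC):
    """Shared interface named in the summary; all signatures here are assumed."""

    @property
    @abstractmethod
    def version(self): ...

    @abstractmethod
    def create_command(self, prompt): ...

    @abstractmethod
    def safe_generate(self, prompt): ...

    @abstractmethod
    def parse_response(self, raw): ...

    @abstractmethod
    def extract_tools(self, parsed): ...

    @abstractmethod
    def extract_skills(self, parsed): ...


class EchoCliGenerator(AgentCliGenerator):
    """Hypothetical generator, for illustration only (not one of the four)."""

    @property
    def version(self):
        return "echo-0.1"

    def create_command(self, prompt):
        return ["echo", prompt]

    def safe_generate(self, prompt):
        # A real generator would run the command; this stub returns the prompt.
        return prompt

    def parse_response(self, raw):
        return {"text": raw, "tools": [], "skills": []}

    def extract_tools(self, parsed):
        return parsed["tools"]

    def extract_skills(self, parsed):
        return parsed["skills"]


def generator_label(generator):
    """Evaluator-side check: one isinstance against the base, no concrete imports."""
    if isinstance(generator, AgentCliGenerator):
        return f"cli:{generator.version}"
    return "non-cli"
```

With this shape, a new subclass is picked up by generator_label without any change to the helper, which is the zero-touch property the PR is after.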

Why

Each new CLI generator previously required edits at ~6 sites in the evaluator.
Keying off the base class makes adding a generator zero-touch in the evaluator,
and the ABC enforces the interface at instantiation.
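
To make "enforces the interface at instantiation" concrete, here is a minimal sketch using a trimmed stand-in base with only two of the methods. BaseGenerator, IncompleteGenerator, and try_instantiate are hypothetical names for this example. Note that defining the incomplete subclass succeeds; the TypeError is raised only when it is instantiated.

```python
from abc import ABC, abstractmethod


class BaseGenerator(ABC):
    """Stand-in for the base class, trimmed to two abstract methods."""

    @abstractmethod
    def create_command(self, prompt): ...

    @abstractmethod
    def parse_response(self, raw): ...


class IncompleteGenerator(BaseGenerator):
    """Forgets parse_response; the class statement itself does not fail."""

    def create_command(self, prompt):
        return ["tool", prompt]


def try_instantiate(cls):
    """Return the TypeError message if cls cannot be instantiated, else None."""
    try:
        cls()
    except TypeError as exc:
        return str(exc)
    return None


error = try_instantiate(IncompleteGenerator)
```

Here error holds a message naming parse_response as the missing method, so a half-implemented generator fails as soon as it is constructed rather than midway through an evaluation run.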

Test plan

  • pytest test/agy_cli_test.py test/gemini_cli_test.py — 52 passed
  • All four subclasses report no remaining abstract methods
  • agentevaluator.py imports cleanly, no concrete CLI references
  • Dockerised run logs: https://paste.googleplex.com/4915247398912000
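
One way the "no remaining abstract methods" check could be reproduced is by inspecting the `__abstractmethods__` attribute that ABCMeta sets on each class. The sketch below uses hypothetical stand-in classes (BaseCli, CompleteCli) rather than the real generators, and is not necessarily how the check in the test plan was run.

```python
from abc import ABC, abstractmethod


class BaseCli(ABC):
    """Stand-in base with two of the interface methods (hypothetical name)."""

    @abstractmethod
    def create_command(self, prompt): ...

    @abstractmethod
    def parse_response(self, raw): ...


class CompleteCli(BaseCli):
    """Overrides every abstract method, so nothing abstract remains."""

    def create_command(self, prompt):
        return ["cli", prompt]

    def parse_response(self, raw):
        return {"text": raw}


def remaining_abstract(cls):
    """Return the sorted names of abstract methods cls has not overridden."""
    return sorted(getattr(cls, "__abstractmethods__", frozenset()))
```

Calling remaining_abstract on each concrete generator class and asserting an empty list gives a quick regression check that does not need to construct the generators.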

…face for CLI-driven models and simplify AgentEvaluator logic
@prernakakkar-google (Collaborator)

/gcbrun

@prernakakkar-google (Collaborator)

Please manually test against docker and attach results here.

Comment thread evalbench/evaluator/agentevaluator.py Outdated
Comment thread evalbench/generators/models/agy_cli.py
IsmailMehdi previously approved these changes Jun 5, 2026
@omkargaikwad23 (Collaborator, Author)

/gcbrun

@omkargaikwad23 (Collaborator, Author)

/gcbrun

@omkargaikwad23 (Collaborator, Author)

Quoting @prernakakkar-google: "Please manually test against docker and attach results here."

Added. Please check.


3 participants