Add model_size experiment to 5 daily workflows; introduce small-agent alias #36997

Merged
pelikhan merged 2 commits into main from copilot/select-daily-agentic-workflows on Jun 4, 2026

Copilot AI commented Jun 4, 2026

Wires a 50/50 model_size A/B experiment into 5 eligible daily workflows (no existing experiments, no subagents) to measure whether small models match full-model quality at lower token cost. Adds the small-agent builtin alias to make the experiment variants resolve at runtime.

Model alias

  • New small-agent alias: ["haiku", "gpt-5-mini", "gemini-flash", "any"] — parallel to agent but targeting small/fast models
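The alias reads as an ordered fallback chain. A minimal sketch of how such a chain could be resolved follows, assuming the first available entry wins and that "any" accepts whatever model is available. Both rules are assumptions for illustration; `resolve_alias` is a hypothetical helper, not the resolver in pkg/workflow.

```python
from typing import Iterable, Optional

# Chain copied from the PR description. The resolution rules below are assumed.
SMALL_AGENT_CHAIN = ["haiku", "gpt-5-mini", "gemini-flash", "any"]


def resolve_alias(chain: Iterable[str], available: Iterable[str]) -> Optional[str]:
    """Return the first chain entry that is available.

    Assumption: the literal entry "any" matches the first available model.
    Returns None when nothing is available.
    """
    pool = list(available)
    for candidate in chain:
        if candidate == "any":
            return pool[0] if pool else None
        if candidate in pool:
            return candidate
    return None
```

Under these assumptions, an engine that offers only "gpt-5-mini" and "gemini-flash" would resolve small-agent to "gpt-5-mini", and an engine that offers none of the named models would fall through to "any".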

Experiment

Added to the claude-engine workflows daily-doc-updater, daily-function-namer, daily-doc-healer, and daily-caveman-optimizer, and to the codex-engine workflow daily-cache-strategy-analyzer:

experiments:
  model_size:
    variants: [agent, small-agent]
    metric: effective_tokens_total
    secondary_metrics: [run_success_rate, run_duration_ms]
    guardrail_metrics:
      - name: run_success_rate
        threshold: ">=0.90"
      - name: empty_output_rate
        threshold: "<=0.10"
    min_samples: 20
    weight: [50, 50]
    start_date: "2026-06-04"
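To make the guardrail semantics concrete, here is a rough sketch of how the threshold strings above might be evaluated. It assumes a threshold is a comparison operator followed by a number, and that guardrails are only checked once min_samples runs have been collected. `guardrail_ok` and `check_guardrails` are hypothetical names; the PR does not show the real evaluator.

```python
import operator
import re

# Supported comparison operators (assumed syntax, inferred from ">=0.90" and "<=0.10").
_OPS = {">=": operator.ge, "<=": operator.le, ">": operator.gt, "<": operator.lt}
_THRESHOLD = re.compile(r"^(>=|<=|>|<)\s*([0-9]*\.?[0-9]+)$")


def guardrail_ok(threshold: str, observed: float) -> bool:
    """Return True when the observed value satisfies a threshold such as '>=0.90'."""
    match = _THRESHOLD.match(threshold.strip())
    if match is None:
        raise ValueError(f"unsupported threshold: {threshold!r}")
    op, limit = match.groups()
    return _OPS[op](observed, float(limit))


def check_guardrails(guardrails, observed, samples, min_samples=20):
    """Return the names of violated guardrails, or None when samples < min_samples."""
    if samples < min_samples:
        return None  # not enough data to judge either variant
    return [
        g["name"]
        for g in guardrails
        if not guardrail_ok(g["threshold"], observed[g["name"]])
    ]


# Guardrails copied from the experiment definition in this PR.
GUARDRAILS = [
    {"name": "run_success_rate", "threshold": ">=0.90"},
    {"name": "empty_output_rate", "threshold": "<=0.10"},
]
```

With these definitions, a variant with a 0.85 success rate after 25 runs would be reported as violating run_success_rate, while the same rate after 10 runs would not be judged yet.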

Each workflow's engine is expanded to pass the selected variant as the runtime model:

engine:
  id: claude
  model: "${{ needs.activation.outputs.model_size }}"
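The PR does not show how the activation job chooses the value of outputs.model_size. One plausible reading of weight: [50, 50] is a weighted random draw per run, sketched below. The seeding scheme and the `pick_variant` and `engine_for_run` helpers are assumptions for illustration only.

```python
import random

# Variants and weights copied from the experiment definition in this PR.
VARIANTS = ["agent", "small-agent"]
WEIGHTS = [50, 50]


def pick_variant(rng: random.Random) -> str:
    """Draw one variant according to the configured weights."""
    return rng.choices(VARIANTS, weights=WEIGHTS, k=1)[0]


def engine_for_run(seed: int, engine_id: str = "claude") -> dict:
    """Build the engine block for one run.

    Assumption: a per-run seed (for example a run number) makes the
    assignment reproducible. The real activation job may work differently.
    """
    return {"id": engine_id, "model": pick_variant(random.Random(seed))}
```

Calling engine_for_run with the same seed always yields the same variant, which would make a given run's assignment easy to audit after the fact.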

Copilot AI and others added 2 commits June 4, 2026 20:53
Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>
…alias

Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>
Copilot AI changed the title Add model_size experiment to 5 daily workflows and small-agent model alias Add model_size experiment to 5 daily workflows; introduce small-agent alias Jun 4, 2026
Copilot AI requested a review from pelikhan June 4, 2026 21:00
pelikhan marked this pull request as ready for review June 4, 2026 22:30
Copilot AI review requested due to automatic review settings June 4, 2026 22:30
pelikhan merged commit b38bac1 into main Jun 4, 2026
pelikhan deleted the copilot/select-daily-agentic-workflows branch June 4, 2026 22:30
Copilot AI left a comment

Pull request overview

This PR introduces a new built-in model alias (small-agent) and wires a model_size A/B experiment into 5 daily workflow source files to compare small/fast models vs the default agent chain using token-cost and reliability metrics. It also updates several golden fixtures and many generated workflow lockfiles.

Changes:

  • Add small-agent to the built-in model alias map and extend unit tests accordingly.
  • Add a model_size experiment + dynamic engine.model selection to 5 daily workflow source markdown files.
  • Update WASM golden fixtures and many *.lock.yml generated workflows (including new Copilot settings injection).
Summary per file

| File | Description |
| --- | --- |
| pkg/workflow/data/model_aliases.json | Adds the small-agent built-in alias chain. |
| pkg/workflow/model_aliases_test.go | Extends built-in alias test coverage to include small-agent. |
| .github/workflows/daily-doc-updater.md | Adds model_size experiment + sets engine.model from activation output. |
| .github/workflows/daily-doc-healer.md | Adds model_size experiment + sets engine.model from activation output. |
| .github/workflows/daily-function-namer.md | Adds model_size experiment + sets engine.model from activation output. |
| .github/workflows/daily-caveman-optimizer.md | Adds model_size experiment + sets engine.model from activation output. |
| .github/workflows/daily-cache-strategy-analyzer.md | Adds model_size experiment + sets engine.model from activation output (codex engine). |
| pkg/workflow/testdata/TestWasmGolden_CompileFixtures/with-imports.golden | Updates embedded AWF config golden output to include small-agent. |
| pkg/workflow/testdata/TestWasmGolden_CompileFixtures/smoke-copilot.golden | Updates embedded AWF config golden output to include small-agent. |
| pkg/workflow/testdata/TestWasmGolden_CompileFixtures/playwright-cli-mode.golden | Updates embedded AWF config golden output to include small-agent. |
| pkg/workflow/testdata/TestWasmGolden_CompileFixtures/basic-copilot.golden | Updates embedded AWF config golden output to include small-agent. |
| pkg/workflow/testdata/TestWasmGolden_AllEngines/pi.golden | Updates embedded AWF config golden output to include small-agent. |
| pkg/workflow/testdata/TestWasmGolden_AllEngines/gemini.golden | Updates embedded AWF config golden output to include small-agent. |
| pkg/workflow/testdata/TestWasmGolden_AllEngines/copilot.golden | Updates embedded AWF config golden output to include small-agent. |
| pkg/workflow/testdata/TestWasmGolden_AllEngines/codex.golden | Updates embedded AWF config golden output to include small-agent. |
| pkg/workflow/testdata/TestWasmGolden_AllEngines/claude.golden | Updates embedded AWF config golden output to include small-agent. |
| .github/workflows/daily-mcp-concurrency-analysis.lock.yml | Injects Copilot settings file write + cleanup trap in run script. |
| .github/workflows/daily-malicious-code-scan.lock.yml | Injects Copilot settings file write + cleanup trap in run script. |
| .github/workflows/daily-issues-report.lock.yml | Injects Copilot settings file write + cleanup trap in run script. |
| .github/workflows/daily-hippo-learn.lock.yml | Injects Copilot settings file write + cleanup trap in run script. |
| .github/workflows/daily-geo-optimizer.lock.yml | Injects Copilot settings file write + cleanup trap in run script. |
| .github/workflows/daily-firewall-report.lock.yml | Injects Copilot settings file write + cleanup trap in run script. |
| .github/workflows/daily-file-diet.lock.yml | Injects Copilot settings file write + cleanup trap in run script. |
| .github/workflows/daily-experiment-report.lock.yml | Injects Copilot settings file write + cleanup trap in run script. |
| .github/workflows/daily-compiler-threat-spec-optimizer.lock.yml | Injects Copilot settings file write + cleanup trap in run script. |
| .github/workflows/daily-compiler-quality.lock.yml | Injects Copilot settings file write + cleanup trap in run script. |
| .github/workflows/daily-community-attribution.lock.yml | Injects Copilot settings file write + cleanup trap in run script. |
| .github/workflows/daily-cli-tools-tester.lock.yml | Injects Copilot settings file write + cleanup trap in run script. |
| .github/workflows/daily-cli-performance.lock.yml | Injects Copilot settings file write + cleanup trap in run script. |
| .github/workflows/daily-byok-ollama-test.lock.yml | Injects Copilot settings file write + cleanup trap in run script. |
| .github/workflows/daily-assign-issue-to-user.lock.yml | Injects Copilot settings file write + cleanup trap in run script. |
| .github/workflows/daily-architecture-diagram.lock.yml | Injects Copilot settings file write + cleanup trap in run script. |
| .github/workflows/daily-ambient-context-optimizer.lock.yml | Injects Copilot settings file write + cleanup trap in run script. |
| .github/workflows/daily-agent-of-the-day-blog-writer.lock.yml | Injects Copilot settings file write + cleanup trap in run script. |
| .github/workflows/craft.lock.yml | Injects Copilot settings file write + cleanup trap in run script. |
| .github/workflows/copilot-pr-prompt-analysis.lock.yml | Injects Copilot settings file write + cleanup trap in run script. |
| .github/workflows/copilot-pr-nlp-analysis.lock.yml | Injects Copilot settings file write + cleanup trap in run script. |
| .github/workflows/copilot-pr-merged-report.lock.yml | Injects Copilot settings file write + cleanup trap in run script. |
| .github/workflows/copilot-opt.lock.yml | Injects Copilot settings file write + cleanup trap in run script. |
| .github/workflows/copilot-cli-deep-research.lock.yml | Injects Copilot settings file write + cleanup trap in run script. |
| .github/workflows/contribution-check.lock.yml | Injects Copilot settings file write + cleanup trap in run script. |
| .github/workflows/constraint-solving-potd.lock.yml | Injects Copilot settings file write + cleanup trap in run script. |
| .github/workflows/code-simplifier.lock.yml | Injects Copilot settings file write + cleanup trap in run script. |
| .github/workflows/code-scanning-fixer.lock.yml | Injects Copilot settings file write + cleanup trap in run script. |
| .github/workflows/cli-consistency-checker.lock.yml | Injects Copilot settings file write + cleanup trap in run script. |
| .github/workflows/ci-coach.lock.yml | Injects Copilot settings file write + cleanup trap in run script. |
| .github/workflows/chaos-pr-bundle-fuzzer.lock.yml | Injects Copilot settings file write + cleanup trap in run script. |
| .github/workflows/breaking-change-checker.lock.yml | Injects Copilot settings file write + cleanup trap in run script. |
| .github/workflows/brave.lock.yml | Injects Copilot settings file write + cleanup trap in run script. |
| .github/workflows/bot-detection.lock.yml | Injects Copilot settings file write + cleanup trap in run script. |
| .github/workflows/auto-triage-issues.lock.yml | Injects Copilot settings file write + cleanup trap in run script. |
| .github/workflows/artifacts-summary.lock.yml | Injects Copilot settings file write + cleanup trap in run script. |
| .github/workflows/architecture-guardian.lock.yml | Injects Copilot settings file write + cleanup trap in run script. |
| .github/workflows/archie.lock.yml | Injects Copilot settings file write + cleanup trap in run script. |
| .github/workflows/agent-persona-explorer.lock.yml | Injects Copilot settings file write + cleanup trap in run script. |
| .github/workflows/agent-performance-analyzer.lock.yml | Injects Copilot settings file write + cleanup trap in run script. |
| .github/workflows/ace-editor.lock.yml | Injects Copilot settings file write + cleanup trap in run script. |
| .github/workflows/ab-testing-advisor.lock.yml | Injects Copilot settings file write + cleanup trap in run script. |

Copilot's findings

  • Files reviewed: 58/58 changed files
  • Comments generated: 3

Comment on lines +33 to +40
engine:
  id: claude
  model: "${{ needs.activation.outputs.model_size }}"
name: Daily Documentation Updater
strict: true
experiments:
  model_size:
    variants: [agent, small-agent]
Comment on lines +939 to +941
trap 'rm -f /home/runner/.copilot/settings.json' EXIT
mkdir -p /home/runner/.copilot
printf '%s' '{"builtInAgents":{"rubberDuck":false}}' > /home/runner/.copilot/settings.json
Comment on lines +939 to +941
trap 'rm -f /home/runner/.copilot/settings.json' EXIT
mkdir -p /home/runner/.copilot
printf '%s' '{"builtInAgents":{"rubberDuck":false}}' > /home/runner/.copilot/settings.json
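The injected shell lines follow a write-then-clean-up pattern: register the cleanup first, create the directory, then write the settings file. For readers less familiar with shell traps, the same pattern can be sketched in Python with a context manager. This is an illustrative equivalent only; `temporary_settings` is a hypothetical helper, and the generated workflows use the shell form shown above.

```python
import json
import tempfile
from contextlib import contextmanager
from pathlib import Path


@contextmanager
def temporary_settings(home: Path, settings: dict):
    """Write <home>/.copilot/settings.json and remove it on exit, even on error."""
    path = home / ".copilot" / "settings.json"
    path.parent.mkdir(parents=True, exist_ok=True)  # mirrors `mkdir -p`
    try:
        # Compact separators reproduce the exact bytes printed by the shell snippet.
        path.write_text(json.dumps(settings, separators=(",", ":")))
        yield path
    finally:
        path.unlink(missing_ok=True)  # mirrors the EXIT trap's `rm -f`


if __name__ == "__main__":
    # Use a throwaway directory instead of /home/runner for this demonstration.
    with tempfile.TemporaryDirectory() as tmp:
        with temporary_settings(Path(tmp), {"builtInAgents": {"rubberDuck": False}}) as p:
            print(p.read_text())
```

As in the shell version, the file exists only while the wrapped step runs, so a failed run should not leave the settings behind.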
