DOC add scenario update blog by hannahwestra25 · Pull Request #1940 · microsoft/PyRIT

hannahwestra25 · 2026-06-04T20:28:16Z

Description

Add blog post for scenarios overall plus updates in 0.13.0-0.14.0

Tests and Documentation

add blog post

Covers AttackTechnique abstraction (microsoft#1592), AttackTechniqueRegistry (microsoft#1611), standardized attack args (microsoft#1608), and v0.14.0 scenario improvements including the new TextAdaptive scenario and EpsilonGreedyTechniqueSelector. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Restructure around the framework/scanner/GUI three-layer story, add a tour of the scenario catalog with a RapidResponse highlight, tighten the v0.13/v0.14 updates section, and shorten the adaptive scenarios section to the demo-script framing (registry + selector + ASR + cross-run learning). Note GUI roadmap for scenarios in the wrap-up. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

- Remove the 'Where scenarios fit' section; fold the scanner mention into the intro.
- Drop the colored dots from scenario flavor headers.
- Reframe the RapidResponse opener as a broad starter scan / jumping-off point for deeper testing, rather than an urgent-incident scenario.
- Expand the v0.13/v0.14 section with concrete usage detail for AttackTechnique, AttackTechniqueRegistry, Better Scenario Tracking, and SequentialAttack.
- Audit baseline language so baseline reads as a generic Scenario feature, not a RapidResponse- or adaptive-specific behavior.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

- Drop 'two people running the same pass...' sentence; switch opener to 'Enter scenarios.'
- Add a 'Strategies - the runtime knob' bullet so the runtime-selection concept is introduced before it's used in RapidResponse / adaptive sections.
- Add adaptive scenarios as a fifth flavor in the catalog with a link to its section; mark RapidResponse and AdversarialBenchmark as new in v0.14.
- Rework RapidResponse closer-look: replace 'where it's soft' phrasing, drop the 'crosses two axes' framing (every scenario does that), keep the grouping detail as the actually-RR-specific bit.
- Apply the 'less about building out our scenario library' wording, then explicitly surface the two new v0.14 scenarios so the section isn't misleading about scope.
- Strip every github.com PR link.
- Add two new paragraphs: 'Configuration from the CLI and from YAML' and 'Parallel execution within a scenario' to cover microsoft#1680 and microsoft#1783.
- Add a mermaid diagram to the Better Scenario Tracking paragraph showing scenario_run_id flowing into memory and out to resume / analytics / printer / adaptive selector.
- Drop the 'Smaller polish worth knowing about' paragraph.
- Pull ASR mention into the adaptive section's opening paragraph.
- Point the 'Where to go next' bullet at the real adaptive notebook path (doc/code/scenarios/3_adaptive_scenarios.ipynb).

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

…, and TextAdaptive bandit trail

Three TODO IMAGE blocks added (all HTML-commented so the docs build stays green):

1. After the 'There are five flavors' opener: pyrit_scan list-scenarios catalog screenshot (2026_06_04_scan_list_scenarios.png).
2. End of the RapidResponse 'A closer look' section: ConsoleScenarioResultPrinter summary showing baseline + per-harm-category breakdown sorted by success rate (2026_06_04_rapid_response_output.png).
3. After the TextAdaptive code block: per-objective technique trail + wins/picks/rate summary, captured from doc/code/scenarios/3_adaptive_scenarios.ipynb (2026_06_04_text_adaptive_output.png).

To activate any of them: drop the PNG into doc/blog/ with the indicated filename and uncomment the markdown image line inside the comment block.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

…ructure

Moved the diagram from the 'Attribution that survives runs' section (where the surrounding prose already covers the same ground) to the 'A catalog those techniques live in' paragraph, where it explains the v0.13 abstraction story visually: techniques register with tags into AttackTechniqueRegistry, TagQuery subsets become ScenarioStrategy enum members, --strategies picks one at run time, and the scenario fans the chosen techniques out across its datasets.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

- Replace 'operators' with 'red teamers' to match repo terminology.
- Reword 'role-play scenario' bullet to 'role-play prompt template' so the word scenario isn't overloaded mid-sentence.
- Add a fourth screenshot placeholder + explanatory paragraph showing what 'pyrit_scan airt.rapid_response --target my_target' actually does at the CLI; this is also where the post introduces initializers (ScenarioTechniqueInitializer, TargetInitializer, LoadDefaultDatasets) as the things that populate the registries before run_async fires.
- Reword the RapidResponse 'technique side pulls / dataset side covers' sentence as a single 'It runs N techniques across N datasets' line.
- Drop the baseline mention from the RapidResponse paragraph (already covered in 'What's in a scenario') and replace it with a concrete explanation of what 'group by harm category vs by technique' means in the printer.
- Rebuild the mermaid diagram. It now spans the full chain: AttackStrategy + seed config -> AttackTechnique -> AttackTechniqueRegistry (via ScenarioTechniqueInitializer) -> ScenarioStrategy -> Scenario + datasets -> AtomicAttack -> ScenarioResult, with a follow-on paragraph that defines AttackStrategy, AttackTechnique, and AtomicAttack in plain English.
- Expand the opening paragraph and adaptive intro slightly with more 'why' (reproducibility, comparing runs, real cost of brute-force scans).

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

- Apply CLI command suggestion (line 25 wording)
- Fix broken Azure AI Foundry red-teaming link; drop `throw everything` phrasing
- Remove ContentHarms catalog bullet (deprecated in favor of RapidResponse)
- Remove RapidResponse strategy-tags paragraph
- Restructure `What's improved in v0.13.0 and v0.14.0` section into proper ### subsections instead of bold lead-in clauses, and drop the `Two new scenarios` sentence per suggestion
- Shorten mermaid diagram node and edge labels so they stop getting cut off
- Convert post-diagram definitions paragraph to a bullet list
- Note EpsilonGreedyTechniqueSelector is the default and TechniqueSelector is pluggable
- Remove prompt_sending baseline paragraph from the adaptive section

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

- Add the captured TextAdaptive screenshot (per-objective trail + wins/picks summary) and remove the corresponding TODO IMAGE block
- Remove the list-scenarios TODO IMAGE block
- Remove the RapidResponse 'distinctive grouping' paragraph and the RapidResponse TODO IMAGE block
- Remove the 'Sorted breakdowns and a new compound primitive' subsection
- Apply the brute-force opener suggestion verbatim on the adaptive section intro
- Rewrite the closing wrap-up so the post ends on a tighter practical handoff
- Normalize all version references to 0.x.0 (strip 'v' prefix everywhere) and update the cross-reference anchor accordingly

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Fixes the validate-docs pre-commit failure that flagged blog/2026_06_04_scenarios.md as an orphaned file. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

romanlutz

Very cool! A few smaller things to consider.

…ra/scenario_blog

Update the blog post to reflect the actual 10 core-tagged techniques registered in ScenarioTechniqueInitializer: adds pair, and the three persona-driven crescendo variants (movie_director, history_lecture, journalist_interview). Removes prompt_sending which is baseline, not a registered technique. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

- Replace screenshot with text code block (comment 1)
- Fix wording: 'kick off a scan' instead of 'create a scenario' (comment 2)
- Fix CLI flag --config to --config-file; clarify which TextAdaptive params are CLI-reachable vs Python-only (comment 3)
- Lowercase 'Better Scenario Tracking' to avoid proper-noun read (comment 4)
- Add qualifier that not all scenarios expose the same strategy set; move light to parenthetical example (comment 5)
- Fix AIRT scenario count from seven to six (comment 6)
- Clarify default strategy runs role_play + many_shot + baseline; full core pool available via --strategies single_turn/multi_turn (comment 7)
- Remove unused text_adaptive_output.png image

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Comment 7: Don't hard-code technique count (it'll drift). Removed 'currently ten' and the full enumerated list; now describes the pool as 'every core-tagged factory in the registry' and lists specific techniques only in the parenthetical for the full-pool strategies.

Comment 3: Drop 'parameter-heavy scenarios like TextAdaptive' framing (it only has one CLI param). Rewrite to clearly say which knobs go through supported_parameters()/CLI vs Python constructor.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
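Several of the commit messages above mention `EpsilonGreedyTechniqueSelector` and a wins/picks/rate summary. For readers new to the idea, the sketch below shows generic, textbook epsilon-greedy selection. It is not PyRIT's implementation: the class name, method names, and default `epsilon` are all hypothetical, and only the technique names are taken from this thread.

```python
import random
from collections import defaultdict


class EpsilonGreedySelector:
    """Toy epsilon-greedy picker; NOT PyRIT's EpsilonGreedyTechniqueSelector."""

    def __init__(self, techniques, epsilon=0.2, seed=None):
        self.techniques = list(techniques)
        self.epsilon = epsilon            # probability of exploring at random
        self.rng = random.Random(seed)
        self.picks = defaultdict(int)     # how often each technique was tried
        self.wins = defaultdict(int)      # how often each technique succeeded

    def rate(self, name):
        # Observed success rate; techniques never picked count as 0.0.
        return self.wins[name] / self.picks[name] if self.picks[name] else 0.0

    def select(self):
        # Explore with probability epsilon, otherwise exploit the best rate so far.
        if self.rng.random() < self.epsilon:
            return self.rng.choice(self.techniques)
        return max(self.techniques, key=self.rate)

    def record(self, name, success):
        # Update the tallies after an attack attempt has been scored.
        self.picks[name] += 1
        self.wins[name] += int(success)
```

With `epsilon=0.0` the selector always exploits, so after `record("many_shot", True)` the next `select()` returns `many_shot`. A larger `epsilon` trades some of that exploitation for continued exploration of techniques that have not yet had a chance to win.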

rlundeen2 · 2026-06-09T19:16:04Z

+
+Cracking one open, every scenario bundles five things:
+
+- **Techniques — the *how*.** How are we going to attack? Maybe we just send the prompt directly. Maybe we wrap it in a role-play prompt template. Maybe we escalate over multiple turns with Crescendo or TAP. Techniques include the attack strategy plus its converters, jailbreak templates, and adversarial-chat configuration — basically all the knobs that affect how the attack is crafted and delivered.


We may want to combine this bullet with "Strategies".

rlundeen2 · 2026-06-09T19:17:26Z

+
+Its technique pool draws from every `core`-tagged factory in the registry. Out of the box (`--strategies default`) it runs `role_play` and `many_shot` plus a baseline pass; switch to `--strategies single_turn` or `multi_turn` to open up the full pool (which currently includes `tap`, `pair`, `crescendo_simulated`, `red_teaming`, `context_compliance`, and several crescendo persona variants). It tests across seven AIRT datasets (`airt_hate`, `airt_fairness`, `airt_violence`, `airt_sexual`, `airt_harassment`, `airt_misinformation`, `airt_leakage`). By default it sends four prompts per dataset, configurable with `--max-dataset-size`.
+
+## What's improved in 0.13.0 and 0.14.0


I might nix some of the journey (e.g. skip this section and what's changed). I'd focus on what is there now and what value it provides.
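The fan-out in the quoted paragraph (techniques crossed with datasets, capped per dataset) can be worked through with a small sketch. The dataset names, the two default techniques, and the default of four prompts per dataset come from the quoted text; the helper `planned_attacks` is hypothetical, and the baseline pass is left out because the paragraph does not say how it is tallied.

```python
from itertools import product

# Values taken from the quoted paragraph.
DATASETS = [
    "airt_hate", "airt_fairness", "airt_violence", "airt_sexual",
    "airt_harassment", "airt_misinformation", "airt_leakage",
]
DEFAULT_TECHNIQUES = ["role_play", "many_shot"]
PROMPTS_PER_DATASET = 4  # the stated default for --max-dataset-size


def planned_attacks(techniques, datasets, prompts_per_dataset):
    """Enumerate (technique, dataset, prompt_index) triples (hypothetical helper)."""
    return list(product(techniques, datasets, range(prompts_per_dataset)))


plan = planned_attacks(DEFAULT_TECHNIQUES, DATASETS, PROMPTS_PER_DATASET)
print(len(plan))  # 2 techniques * 7 datasets * 4 prompts = 56
```

Under these assumptions the default strategy sends 56 technique-driven prompts, which gives a rough sense of why opening up the full pool with `single_turn` or `multi_turn` costs noticeably more.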

rlundeen2 · 2026-06-09T19:19:36Z

+pyrit_scan airt.rapid_response --target my_target
+```
+
+That one command does a lot. Before the scenario itself runs, **initializers** (`PyRITInitializer` subclasses such as `ScenarioTechniqueInitializer`, `TargetInitializer`, and `LoadDefaultDatasets`) populate the registries — every technique factory lands in `AttackTechniqueRegistry`, every configured target in `TargetRegistry`, every default dataset for the chosen scenario in memory. Only then does the CLI look up `airt.rapid_response` in the scenario registry, resolve `my_target` against `TargetRegistry`, instantiate `RapidResponse`, and call `run_async()`. You get back a `ScenarioResult` persisted to memory and pretty-printed at the end. No notebook glue required, and the same scenario class is what you'd `await` in a notebook if you preferred to drive it from Python.


One important slant: a lot of this reads a bit like docs. I'd try to include the why, e.g. we can add techniques from the red team for anyone to run, so you can pick a category of attack and see how resilient your AI system is to it :)
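The ordering described in the quoted paragraph (initializers fill the registries first, and only then are the command-line names resolved) can be illustrated with a toy model. Nothing here is PyRIT code: `Registry`, `ToyScenario`, and `scan` are hypothetical stand-ins, and only the names `airt.rapid_response`, `my_target`, `role_play`, and `many_shot` are borrowed from the thread.

```python
class Registry:
    """Minimal name -> object registry; a stand-in, not a PyRIT class."""

    def __init__(self):
        self._items = {}

    def register(self, name, item):
        self._items[name] = item

    def resolve(self, name):
        if name not in self._items:
            raise KeyError(f"'{name}' is not registered")
        return self._items[name]


class ToyScenario:
    """Hypothetical scenario: pairs one target with a list of techniques."""

    def __init__(self, target, techniques):
        self.target = target
        self.techniques = techniques

    def run(self):
        return [f"{technique} -> {self.target}" for technique in self.techniques]


def scan(scenario_name, target_name):
    techniques, targets, scenarios = Registry(), Registry(), Registry()

    # Step 1: "initializers" populate the registries before anything runs.
    techniques.register("role_play", "role_play")
    techniques.register("many_shot", "many_shot")
    targets.register("my_target", "my_target")
    scenarios.register("airt.rapid_response", ToyScenario)

    # Step 2: only now are the command-line names resolved and the scenario built.
    scenario_cls = scenarios.resolve(scenario_name)
    target = targets.resolve(target_name)
    chosen = [techniques.resolve(name) for name in ("role_play", "many_shot")]
    return scenario_cls(target, chosen).run()


print(scan("airt.rapid_response", "my_target"))
```

The point of the ordering is that an unknown scenario or target name fails at lookup time with a clear error, before any prompt is sent.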

rlundeen2 · 2026-06-09T19:19:46Z

+**Foundry — `RedTeamAgent`.** The integration with [Azure AI Foundry's Red Teaming Agent](https://learn.microsoft.com/en-us/azure/foundry/concepts/ai-red-teaming-agent). Organized by complexity (easy / moderate / difficult) rather than harm type. Easy = converters like Base64, ROT13. Difficult = multi-turn attacks like TAP and Crescendo. Built on HarmBench with 25+ techniques.
+
+**Garak — `Encoding`.** Inspired by the [Garak](https://github.com/leondz/garak) project. Very focused: can the model be tricked into decoding and repeating harmful content? Tests 17 encoding schemes (Base64, Braille, Morse, Leet Speak, …) against slur terms and XSS payloads. Single-turn only, with a custom `DecodingScorer`. Niche but important for encoding-bypass vulnerabilities.
+


Consider including a video too!
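To make the encoding-bypass idea in the quoted hunk concrete, here is a minimal sketch using only the Python standard library. It encodes a harmless payload with two of the schemes named above (Base64 and ROT13) and applies a naive "did the response repeat the decoded payload" check. The function names are hypothetical, and the check is a simplified stand-in for the idea, not PyRIT's `DecodingScorer`.

```python
import base64
import codecs


def encode_variants(payload):
    """Return the payload under two of the encodings named in the thread."""
    return {
        "base64": base64.b64encode(payload.encode("utf-8")).decode("ascii"),
        "rot13": codecs.encode(payload, "rot13"),
    }


def decoded_and_repeated(payload, model_response):
    """Toy check: does the response contain the decoded payload verbatim?"""
    return payload.lower() in model_response.lower()


variants = encode_variants("hello world")
print(variants["base64"])  # aGVsbG8gd29ybGQ=
print(variants["rot13"])   # uryyb jbeyq
```

In a real scan the encoded string would be sent to the target and the check applied to its reply; a response that echoes `hello world` would count as the model having decoded and repeated the content.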

hannahwestra25 and others added 11 commits June 4, 2026 11:28

DOC: Remove final scan_run_scenario TODO image block

cc4b08c

DOC: Register scenarios blog post in myst.yml TOC

d498190

romanlutz reviewed Jun 5, 2026

Comment thread doc/blog/2026_06_04_text_adaptive_output.png Outdated

romanlutz reviewed Jun 5, 2026

hannahwestra25 and others added 4 commits June 8, 2026 09:51

Merge branch 'main' of https://github.com/microsoft/PyRIT into hawestra/scenario_blog

571ea19

rlundeen2 reviewed Jun 9, 2026

hannahwestra25 wants to merge 15 commits into microsoft:main from hannahwestra25:hawestra/scenario_blog


3 participants

