MAINT/FIX: Relocate targeted_harm_categories to AttackResult #1995

Merged
rlundeen2 merged 4 commits into microsoft:main from rlundeen2:rlundeen2/phase-13-planning on Jun 12, 2026
@rlundeen2 rlundeen2 commented Jun 12, 2026

Contributor

targeted_harm_categories was prematurely removed in #1951 (I think it was meant to be moved from MessagePiece to AttackResult, but it was removed instead).

This adds it back and routes it through. With this, phases 13–15 are complete except labels (scheduled for removal next release).

https://gist.github.com/rlundeen2/3e8daa8e12a11b4b6e52587b3c9b1dca

rlundeen2 and others added 3 commits June 11, 2026 18:54
Phase 14 of the models refactor: restore targeted_harm_categories as a
first-class field on AttackResult (regression from microsoft#1951, which deleted it
outright instead of relocating it).

- Add targeted_harm_categories: list[str] to AttackResult
- Add nullable JSON column to AttackResultEntry with read/write round-trip
- New Alembic migration c3d5e7f9a1b2 (no backfill; source column already
  dropped and 0.15.0 is unreleased dev)
- Auto-populate from SeedGroup.harm_categories via AttackParameters
- Stamp centrally in AttackStrategy success/error paths
- Re-add as a first-class (non-deprecated) get_attack_results filter
- Tests for round-trip, params capture, central stamping, and filtering

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
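
The storage and filter bullets above can be illustrated with a minimal sketch. It is not the PyRIT implementation: `AttackResultSketch`, the `attack_results` table, `save`, and `load_all` are hypothetical stand-ins built on the standard library, and the "result must contain every requested category" filter semantics is an assumption.

```python
import json
import sqlite3
from dataclasses import dataclass, field


@dataclass
class AttackResultSketch:
    """Hypothetical stand-in for AttackResult; only the fields needed here."""

    conversation_id: str
    targeted_harm_categories: list[str] = field(default_factory=list)


def save(conn: sqlite3.Connection, result: AttackResultSketch) -> None:
    # Serialize the list as JSON text. An empty list is stored as NULL,
    # which the nullable column permits for rows without categories.
    payload = (
        json.dumps(result.targeted_harm_categories)
        if result.targeted_harm_categories
        else None
    )
    conn.execute(
        "INSERT INTO attack_results VALUES (?, ?)",
        (result.conversation_id, payload),
    )


def load_all(conn, targeted_harm_categories=None):
    rows = conn.execute(
        "SELECT conversation_id, targeted_harm_categories "
        "FROM attack_results ORDER BY conversation_id"
    ).fetchall()
    # NULL reads back as an empty list so callers always get list[str].
    results = [
        AttackResultSketch(cid, json.loads(raw) if raw is not None else [])
        for cid, raw in rows
    ]
    if targeted_harm_categories:
        wanted = set(targeted_harm_categories)
        # Assumed semantics: keep results that carry every requested category.
        results = [r for r in results if wanted.issubset(r.targeted_harm_categories)]
    return results


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE attack_results ("
    "conversation_id TEXT PRIMARY KEY, "
    "targeted_harm_categories TEXT NULL)"
)
save(conn, AttackResultSketch("a", ["violence", "hate"]))
save(conn, AttackResultSketch("b"))
```

Calling `load_all(conn, ["hate"])` in this sketch returns only result `"a"`, while result `"b"` reads back with an empty list instead of `None`.
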
Drop historical/decision narrative from comments (e.g. 'replaces the
removed per-piece filter', migration 'where it belongs') in favor of
present-tense descriptions of current behavior.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
ty flagged SeedGroup(seeds=all_prompts): list[SeedPrompt] is not
assignable to the invariant list[SeedUnion] parameter. Build all_prompts
as an explicitly-typed list[SeedUnion] display so element types are
checked against the union directly.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
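
The invariance problem described in the commit above can be reproduced with a small sketch. The classes here are simplified stand-ins, not the PyRIT models: `SeedPrompt`, `SeedObjective`, `SeedUnion`, and `make_group` are defined locally for illustration, and the membership of the real `SeedUnion` is an assumption.

```python
from dataclasses import dataclass
from typing import Union


@dataclass
class SeedPrompt:
    value: str


@dataclass
class SeedObjective:
    value: str


# Hypothetical union; the real SeedUnion may contain other types.
SeedUnion = Union[SeedPrompt, SeedObjective]


def make_group(seeds: list[SeedUnion]) -> int:
    """Stand-in for SeedGroup(seeds=...); returns the number of seeds."""
    return len(seeds)


# Without an annotation, a checker infers list[SeedPrompt]. Because list is
# invariant (the callee could append a SeedObjective), passing this value
# where list[SeedUnion] is expected is a static type error, even though it
# works at runtime.
narrow = [SeedPrompt("a"), SeedPrompt("b")]

# Annotating the list display makes each element check against the union,
# so the value is a list[SeedUnion] from the start.
all_prompts: list[SeedUnion] = [SeedPrompt("a"), SeedPrompt("b")]
count = make_group(all_prompts)
```

An alternative would be to accept `Sequence[SeedUnion]`, which is covariant, but that changes the parameter type, whereas annotating the list only touches the call site.
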
Comment thread pyrit/models/results/attack_result.py
@hannahwestra25

Contributor

Both the 3_memory_data_types and 1_sqlite_memory notebooks need to be updated too.

Comment thread pyrit/executor/attack/core/attack_strategy.py
@hannahwestra25 hannahwestra25 self-assigned this Jun 12, 2026
…tion

- Include targeted_harm_categories in AttackResult.to_dict()/from_dict() shims
  so the deprecated round-trip does not silently drop the field.
- Restore targeted_harm_categories plumbing through the technique-selection
  chain (SelectorScope -> epsilon_greedy -> compute_technique_stats) so the
  re-homed get_attack_results filter is reachable.
- Update memory docs: drop stale MessagePiece columns and refresh the
  1_sqlite_memory schema output to match the current models.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
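
The first bullet of the commit above concerns a serialization round-trip that silently dropped a field. A minimal sketch of the fixed behavior follows; `ResultSketch` and its `objective` field are hypothetical, and the real `AttackResult.to_dict()`/`from_dict()` shims carry more fields and may be shaped differently.

```python
from dataclasses import dataclass, field
from typing import Any


@dataclass
class ResultSketch:
    """Hypothetical stand-in for AttackResult with a dict round-trip."""

    objective: str
    targeted_harm_categories: list[str] = field(default_factory=list)

    def to_dict(self) -> dict[str, Any]:
        # Emit the field explicitly; omitting it here is what would make
        # the round-trip lose the categories.
        return {
            "objective": self.objective,
            "targeted_harm_categories": list(self.targeted_harm_categories),
        }

    @classmethod
    def from_dict(cls, data: dict[str, Any]) -> "ResultSketch":
        # Tolerate payloads written before the field existed (missing key
        # or None) by falling back to an empty list.
        return cls(
            objective=data["objective"],
            targeted_harm_categories=list(data.get("targeted_harm_categories") or []),
        )


original = ResultSketch("probe refusal behavior", ["violence"])
restored = ResultSketch.from_dict(original.to_dict())
```

A round-trip assertion such as `restored == original` is the kind of test that catches a field missing from either side of the shim.
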
@rlundeen2

Contributor Author

Rich agrees with this comment, but it is Copilot-generated: Done. Updated 3_memory_data_types.md (dropped stale MessagePiece columns, added targeted_harm_categories under AttackResult) and refreshed the 1_sqlite_memory.ipynb schema output to match the current models. The refresh also fixed pre-existing staleness: the Conversations table is now shown, and attack_identifier/prompt_target_identifier are removed from PromptMemoryEntries. Fixed in 227a256.

@rlundeen2 rlundeen2 added this pull request to the merge queue Jun 12, 2026
Merged via the queue into microsoft:main with commit 5eee461 Jun 12, 2026
53 checks passed
@rlundeen2 rlundeen2 deleted the rlundeen2/phase-13-planning branch June 12, 2026 20:13
2 participants