chore(llm-detection): Store trace id, category, and subcategory in issue metadata#109430

Open
nora-shap wants to merge 1 commit into master from nora/metadata-hack

Conversation

@nora-shap
Member

This feels hacky, but I'm wondering whether we can include these additional fields on the issue without displaying them on the frontend, just to make issue-quality analysis queries easier.

Summary

  • Adds category, subcategory, and trace_id to the event metadata for LLM-detected issues
  • These fields flow through the existing ingestion pipeline and persist to sentry_groupedmessage.data["metadata"]
  • Enables querying for an LLM-detected issue by trace_id
  • category and subcategory can now be read from the database without cross-referencing logs
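As a rough illustration of the kind of analysis query this enables, here is a minimal sketch that filters in-memory rows shaped like `sentry_groupedmessage` records. The helper name `find_groups_by_trace_id`, the sample rows, and the category values are hypothetical and not part of this PR; the only detail taken from the summary is that the fields live under `data["metadata"]`.

```python
from typing import Any, Dict, Iterable, List


def find_groups_by_trace_id(
    groups: Iterable[Dict[str, Any]], trace_id: str
) -> List[Dict[str, Any]]:
    """Return rows whose data["metadata"]["trace_id"] equals trace_id."""
    matches = []
    for group in groups:
        # Tolerate rows with no data or no metadata at all.
        metadata = (group.get("data") or {}).get("metadata") or {}
        if metadata.get("trace_id") == trace_id:
            matches.append(group)
    return matches


# Hypothetical rows; real records carry many more fields.
rows = [
    {
        "id": 1,
        "data": {
            "metadata": {
                "title": "Example LLM-detected issue",
                "trace_id": "abc123",
                "category": "example-category",
                "subcategory": "example-subcategory",
            }
        },
    },
    {"id": 2, "data": {"metadata": {"title": "Unrelated issue"}}},
]

matched = find_groups_by_trace_id(rows, "abc123")
matched_ids = [group["id"] for group in matched]
```

In a real analysis the same lookup would presumably be a JSON-field filter in SQL rather than a Python loop; the sketch only shows which keys would be read.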

@nora-shap nora-shap requested a review from a team as a code owner February 26, 2026 02:11
@github-actions github-actions bot added the Scope: Backend Automatically applied to PRs that change backend components label Feb 26, 2026
Contributor

@cursor cursor bot left a comment

Cursor Bugbot has reviewed your changes and found 1 potential issue.

Comment on lines +193 to +197
    "metadata": {
        "trace_id": trace_id,
        "category": detected_issue.category,
        "subcategory": detected_issue.subcategory,
    },
Bug: The metadata dictionary from the LLM detection task is overwritten in the occurrence consumer, losing the trace_id, category, and subcategory fields.
Severity: HIGH

Suggested Fix

In occurrence_consumer.py, modify the logic to merge the existing metadata with the new title field instead of overwriting it. Retrieve the existing metadata from the event payload and update it with the title, rather than creating a new dictionary.
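The merge described above could look roughly like the following sketch. It is not the actual `occurrence_consumer.py` code: the helper name `merge_metadata` and the `"title"` key are assumptions made for illustration, and the real `_get_kwargs` function may be structured differently.

```python
from typing import Any, Dict, Mapping, Optional


def merge_metadata(
    existing: Optional[Mapping[str, Any]], issue_title: str
) -> Dict[str, Any]:
    """Copy any existing metadata, then set the title on the copy."""
    # Copy so the caller's payload is not mutated.
    merged = dict(existing or {})
    # The title still wins if the payload already carried one.
    merged["title"] = issue_title
    return merged


# Hypothetical payload metadata using the fields added in this PR.
incoming = {
    "trace_id": "abc123",
    "category": "example-category",
    "subcategory": "example-subcategory",
}
merged = merge_metadata(incoming, "Example LLM-detected issue")
```

Copying before updating keeps the title authoritative while leaving every other key in place, and it also handles payloads that carry no metadata at all.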

Prompt for AI Agent
Review the code at the location below. A potential bug has been identified by an AI
agent.
Verify if this is a real issue. If it is, propose a fix; if not, explain why it's not
valid.

Location: src/sentry/tasks/llm_issue_detection/detection.py#L193-L197

Potential issue: In `detection.py`, an event is created with a `metadata` dictionary
containing `trace_id`, `category`, and `subcategory`. This event is sent via Kafka to
the occurrence consumer. However, in `occurrence_consumer.py`, the `_get_kwargs`
function unconditionally overwrites the `metadata` field with a new dictionary
containing only the `issue_title`. As a result, the original metadata fields
(`trace_id`, `category`, `subcategory`) are lost when the event is saved. This prevents
the ability to query for LLM-detected issues by `trace_id` or access the other metadata
fields from the database.
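One way to check whether the report is valid is to model the described behavior and see which keys survive. The sketch below only simulates an unconditional overwrite as the report describes it; `overwrite_metadata` is a hypothetical stand-in, not the real `_get_kwargs`, and the `"title"` key is an assumption.

```python
from typing import Any, Dict


def overwrite_metadata(event_data: Dict[str, Any], issue_title: str) -> Dict[str, Any]:
    """Model the reported behavior: replace metadata with a title-only dict."""
    result = dict(event_data)
    # Anything the producer put in "metadata" is discarded here.
    result["metadata"] = {"title": issue_title}
    return result


# Hypothetical event payload with the metadata fields from this PR.
event = {
    "metadata": {
        "trace_id": "abc123",
        "category": "example-category",
        "subcategory": "example-subcategory",
    }
}

saved = overwrite_metadata(event, "Example LLM-detected issue")

# Keys present before the consumer step but missing afterwards.
lost_keys = sorted(set(event["metadata"]) - set(saved["metadata"]))
```

If the real consumer behaves like this model, `lost_keys` would list all three new fields; a regression test against the actual consumer would need to confirm that.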
