feat: PagerDuty integration for high-severity incident paging#146
spalmurray wants to merge 19 commits into main
…on_incident_created
52239d7 to 9853ee6
9853ee6 to 5824152
Cursor Bugbot has reviewed your changes and found 1 potential issue.
Reviewed by Cursor Bugbot for commit a5cc893.
        "Failed to fetch oncall users from PagerDuty",
        extra={"escalation_policy_id": escalation_policy_id},
    )
    return []
New get_oncall_users method is unused in production
Low Severity
The get_oncall_users method on PagerDutyService is never called from any production code path. A grep confirms it's only referenced in test_pagerduty.py. While trigger_incident is used by _page_high_sev_if_needed in the hooks module, get_oncall_users has no caller outside of tests, making it dead code in the current changeset.
This reverts commit 0cb3500196c5a205a36d7c6f86db772d09041a6f.
        slack_link.save(update_fields=["url"])
    except Exception:
        slack_link.delete()
        raise
        logger.exception(
            f"Failed to create Slack channel for incident {incident.id}"
        )
Bug: If saving the SlackLink model fails, the channel_id is not cleared, leading to subsequent operations on an orphaned Slack channel not linked to the incident.
Severity: MEDIUM
Suggested Fix
Wrap the SlackLink.objects.create() call in a try...except block. In the except block, ensure the channel_id is reset to None to prevent further operations on the orphaned channel.
Prompt for AI Agent
Review the code at the location below. A potential bug has been identified by an AI agent. Verify if this is a real issue. If it is, propose a fix; if not, explain why it's not valid.
Location: src/firetower/incidents/hooks.py#L175-L181
Potential issue: If the `SlackLink` model fails to save after a new Slack channel has been successfully created, the `channel_id` variable is not reset. As a result, subsequent code will attempt to perform operations like setting the topic, adding bookmarks, or posting messages to this 'orphaned' channel. This channel exists in Slack but is not linked to any incident in the application's database, leading to an inconsistent state and breaking the incident management workflow for that specific event.
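A minimal sketch of the suggested fix, not the code in `hooks.py`: the helper name `link_or_discard_channel` and the `create_link` callable are hypothetical stand-ins (the latter for the `SlackLink.objects.create()` call), used so the example runs without Django. The idea is only that a failed save clears the channel id, so later topic, bookmark, and message steps are skipped.

```python
import logging
from typing import Callable, Optional

logger = logging.getLogger(__name__)


def link_or_discard_channel(
    channel_id: Optional[str],
    create_link: Callable[[str], object],
) -> Optional[str]:
    """Return channel_id only if the link record was persisted.

    create_link is a hypothetical stand-in for SlackLink.objects.create(...).
    """
    if channel_id is None:
        return None
    try:
        create_link(channel_id)
    except Exception:
        # Saving the link failed: log it and drop the id so callers do not
        # keep operating on a channel the database knows nothing about.
        logger.exception("Failed to save Slack link for channel %s", channel_id)
        return None
    return channel_id
```

Callers would then guard the follow-up Slack operations with `if channel_id is not None:`. Whether the orphaned channel should also be archived in Slack is left open here.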


Adds PagerDuty integration to firetower. On P0/P1 incident creation or severity upgrade, automatically triggers a PagerDuty incident to page the high-severity escalation policy. Includes PagerDuty service client, config, incident hooks integration, and tests.
Still to do: configure the env vars and test in the test environment.
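The paging rule in the description (page on P0/P1 creation or on a severity upgrade) could be sketched as below. This is an illustration under assumptions, not the PR's `_page_high_sev_if_needed`: the function name `should_page`, the string severity labels, and the choice not to re-page on a P1 to P0 change are all hypothetical.

```python
from typing import Optional

# Assumed severity labels; the real model may store these differently.
HIGH_SEVERITIES = {"P0", "P1"}


def should_page(new_severity: str, old_severity: Optional[str] = None) -> bool:
    """Decide whether a create/update event should trigger a page.

    old_severity is None when the incident is being created.
    """
    if new_severity not in HIGH_SEVERITIES:
        return False
    # Page on creation (no previous severity) or when the incident moves
    # from a non-paging severity into P0/P1. Assumption: an incident that
    # was already P0/P1 has been paged and is not paged again.
    return old_severity not in HIGH_SEVERITIES
```

For example, `should_page("P1", "P3")` is true (upgrade into the paging range), while `should_page("P3", "P1")` is false (downgrade).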