feat: PagerDuty integration for high-severity incident paging#146

Open
spalmurray wants to merge 19 commits into main from spalmurray/RELENG-465

@spalmurray spalmurray commented Apr 7, 2026

Adds PagerDuty integration to firetower. On P0/P1 incident creation or severity upgrade, automatically triggers a PagerDuty incident to page the high-severity escalation policy. Includes PagerDuty service client, config, incident hooks integration, and tests.

Still need to configure the env vars and test this in the test environment.
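
The paging rule described above (page on P0/P1 creation, or when an incident's severity is upgraded into P0/P1) could be sketched roughly as follows. This is only an illustration, not the code in this PR: `should_page` and `HIGH_SEVERITIES` are hypothetical names, severities are assumed to be plain strings, and it assumes a P1 to P0 change does not page a second time.

```python
from typing import Optional

# Assumption: severities are plain strings and P0/P1 are the paging levels.
HIGH_SEVERITIES = frozenset({"P0", "P1"})


def should_page(old_severity: Optional[str], new_severity: str) -> bool:
    """Return True when an incident newly enters a high-severity level.

    old_severity is None when the incident is being created.
    """
    if new_severity not in HIGH_SEVERITIES:
        return False
    # Assumption: an incident that was already P0/P1 has been paged once,
    # so a change between P0 and P1 does not page again.
    return old_severity not in HIGH_SEVERITIES
```

Keeping the decision in a pure function like this makes the creation and upgrade paths easy to unit test without touching the PagerDuty client.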

linear-code bot commented Apr 7, 2026

@spalmurray spalmurray marked this pull request as ready for review April 7, 2026 20:34
@spalmurray spalmurray requested a review from a team as a code owner April 7, 2026 20:34
@spalmurray spalmurray force-pushed the spalmurray/RELENG-465 branch from 52239d7 to 9853ee6 Compare April 7, 2026 21:31
@spalmurray spalmurray force-pushed the spalmurray/RELENG-465 branch from 9853ee6 to 5824152 Compare April 7, 2026 21:31

@cursor cursor bot left a comment

Cursor Bugbot has reviewed your changes and found 1 potential issue.

Reviewed by Cursor Bugbot for commit a5cc893.

"Failed to fetch oncall users from PagerDuty",
extra={"escalation_policy_id": escalation_policy_id},
)
return []

New get_oncall_users method is unused in production

Low Severity

The get_oncall_users method on PagerDutyService is never called from any production code path. A grep confirms it's only referenced in test_pagerduty.py. While trigger_incident is used by _page_high_sev_if_needed in the hooks module, get_oncall_users has no caller outside of tests, making it dead code in the current changeset.

Comment on lines +175 to 181
slack_link.save(update_fields=["url"])
except Exception:
slack_link.delete()
raise
logger.exception(
f"Failed to create Slack channel for incident {incident.id}"
)

Bug: If saving the SlackLink model fails, the channel_id is not cleared, leading to subsequent operations on an orphaned Slack channel not linked to the incident.
Severity: MEDIUM

Suggested Fix

Wrap the SlackLink.objects.create() call in a try...except block. In the except block, ensure the channel_id is reset to None to prevent further operations on the orphaned channel.

Prompt for AI Agent
Review the code at the location below. A potential bug has been identified by an AI
agent.
Verify if this is a real issue. If it is, propose a fix; if not, explain why it's not
valid.

Location: src/firetower/incidents/hooks.py#L175-L181

Potential issue: If the `SlackLink` model fails to save after a new Slack channel has
been successfully created, the `channel_id` variable is not reset. As a result,
subsequent code will attempt to perform operations like setting the topic, adding
bookmarks, or posting messages to this 'orphaned' channel. This channel exists in Slack
but is not linked to any incident in the application's database, leading to an
inconsistent state and breaking the incident management workflow for that specific
event.
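
One possible shape for the suggested fix, as a minimal sketch rather than the actual `hooks.py` code: the helper `link_channel` and its `create_link` callback are hypothetical stand-ins for the `SlackLink.objects.create()` call, used here so the example runs without Django. The idea is that a failed save returns `None`, so later steps (topic, bookmarks, messages) can be skipped for the unlinked channel.

```python
import logging
from typing import Callable, Optional

logger = logging.getLogger(__name__)


def link_channel(
    channel_id: Optional[str],
    create_link: Callable[[str], object],
) -> Optional[str]:
    """Persist the link for a freshly created channel.

    Returns channel_id if the link was stored, or None so that callers
    skip follow-up operations on a channel the database does not know about.
    """
    if channel_id is None:
        return None
    try:
        # Hypothetical stand-in for SlackLink.objects.create(...).
        create_link(channel_id)
    except Exception:
        logger.exception("Failed to save SlackLink for channel %s", channel_id)
        return None  # reset so no further work targets the orphaned channel
    return channel_id
```

A caller would then reassign `channel_id = link_channel(channel_id, ...)` and guard the remaining Slack calls with `if channel_id:`. Whether the orphaned channel should also be archived in Slack is a separate decision this sketch does not address.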
