Conversation
| return self._extract_head(data_packet) - self._extract_base(data_packet) | ||
| case "relative_diff": | ||
| base = self._extract_base(data_packet) | ||
| return ((self._extract_head(data_packet) - base) / base) * 100 |
There was a problem hiding this comment.
Division by zero in extract_value when base size is zero
When threshold_type is relative_diff, the code divides by base without checking if it's zero. If base_install_size_bytes or base_download_size_bytes is 0 in the data packet, self._extract_base() returns 0 and the expression ((self._extract_head(data_packet) - base) / base) * 100 raises ZeroDivisionError. This crashes the detector evaluation, preventing issue detection for that data packet.
Verification
Traced extract_value method at line 239-250 in detectorconfig.py. Confirmed _extract_base returns int from data_packet.packet.get('base_install_size_bytes') or base_download_size_bytes. The TypedDict in grouptype.py defines these as int | None, meaning 0 is a valid value. Division occurs at line 248 without zero-check. The helper function _build_evidence_text at line 131 correctly guards against this with if base_bytes else 0.
Identified by Warden sentry-backend-bugs · 8X5-S26
| recovery_threshold = self.detector.config["recovery_threshold"] | ||
| downtime_threshold = self.detector.config["downtime_threshold"] |
There was a problem hiding this comment.
KeyError accessing threshold config keys before backfill migration completes
The code accesses self.detector.config["recovery_threshold"] and self.detector.config["downtime_threshold"] with direct bracket notation at lines 97-98 and 208. The backfill migration 0045_backfill_detector_thresholds.py that adds these keys is marked is_post_deployment = True, meaning it runs after code deployment. During this window, existing uptime detectors will lack these keys, causing KeyError when the handler processes check results.
Verification
Verified by reading the migration at src/sentry/uptime/migrations/0045_backfill_detector_thresholds.py which shows is_post_deployment = True at line 51. Traced code path from consumer at results_consumer.py:468 calling process_detectors() which invokes the detector handler and accesses the thresholds property. Also verified line 208 has the same issue.
Suggested fix: Use .get() with default values to handle missing keys gracefully during the migration window
| recovery_threshold = self.detector.config["recovery_threshold"] | |
| downtime_threshold = self.detector.config["downtime_threshold"] | |
| from sentry.uptime.types import DEFAULT_DOWNTIME_THRESHOLD, DEFAULT_RECOVERY_THRESHOLD | |
| recovery_threshold = self.detector.config.get("recovery_threshold", DEFAULT_RECOVERY_THRESHOLD) | |
| downtime_threshold = self.detector.config.get("downtime_threshold", DEFAULT_DOWNTIME_THRESHOLD) | |
| from sentry.uptime.types import DEFAULT_DOWNTIME_THRESHOLD | |
| threshold = self.detector.config.get("downtime_threshold", DEFAULT_DOWNTIME_THRESHOLD) |
Identified by Warden sentry-backend-bugs · A4S-9GJ
Backend Test FailuresFailures on
|
saponifi3d
left a comment
There was a problem hiding this comment.
high level, i really like this approach and simplifying the DetectorSettings on the group; i also really like the way this means that we don't pull in grouptype information into the detector models.
starting to be more nit-picky, i think we might be over-loading the types file a bit. maybe we could pull some of that code out and organize it consistently with the other models / registries internally in the module (i'd also def be open to discussing how we're organizing that stuff, i mostly just would like it to be consistent in the workflow_engine module)
| ) | ||
|
|
||
|
|
||
| detector_settings_registry.register( |
There was a problem hiding this comment.
🤔 - i really like how this pulls out the handlers and makes it a lot easier to access
| ) | ||
|
|
||
| # maps to registry (sentry.issues.grouptype.registry) entries for GroupType.slug in sentry.issues.grouptype.GroupType | ||
| type = models.CharField(max_length=200) |
There was a problem hiding this comment.
should this enforce the types now that there's a consistent list?
| event_types: list[SnubaQueryEventType.EventType] | ||
|
|
||
|
|
||
| class DetectorType(StrEnum): |
There was a problem hiding this comment.
one of the goals for using the group types info was so that folks wouldn't need to modify the workflow_engine module to create a new detector. This is moves us towards having a list of detectors to maintain internally -- any thoughts on ways we could get the convenience of this, but the flexibility of being able to register detectors?
My other nit would be that on the other models we've nested the Type class in the parent;
class Detector():
class Type():
...
should we be consistent? either moving everything into a nested type directory or move this to be nested in the detector?
| filter: Q | None = None | ||
|
|
||
|
|
||
| class DetectorSettingsRegistry: |
There was a problem hiding this comment.
can we move this into the registries directory and use the consistent base class so the interactions are the same?
| detector_settings_registry = DetectorSettingsRegistry() | ||
|
|
||
|
|
||
| def get_detector_settings(group_type: type[GroupType]) -> DetectorSettings | None: |
There was a problem hiding this comment.
🤔 thoughts on this just being a util instead of in types?
Goal is simple: GroupType use shouldn't result in importing anything substantial from workflow engine, let alone specific validator implementations.
We break the chain with an explicit DetectorType thing that should exist for every GroupType we have meaningful Detector implementations for.