[progenitor-client] add Error::is_retryable#1308
Conversation
Created using spr 1.3.6-beta.1
| } | ||
| } | ||
|
|
||
| /// Returns `true` if this error is likely transient and the operation |
There was a problem hiding this comment.
what does "likely" mean? Would it be more accurate to say "possibly" or "potentially"?
There was a problem hiding this comment.
Well, my thought was that we can't be sure it's transient, but also that the status code is a strong sign it's transient. Felt "likely" was the best fit though I don't care too much about this and I'm happy to switch.
There was a problem hiding this comment.
Changed this to:
/// Returns `true` if this error indicates that it is transient, and that
/// the operation could succeed if retried.
| /// * 504 Gateway Timeout | ||
| /// | ||
| /// These status codes are checked for all error variants that carry an | ||
| /// HTTP status, since infrastructure components like load balancers |
There was a problem hiding this comment.
| /// HTTP status, since infrastructure components like load balancers | |
| /// HTTP status, since infrastructure components such as load balancers |
| /// may return transient errors with bodies that don't conform to the | ||
| /// API schema. |
There was a problem hiding this comment.
Was trying to draw a distinction between ErrorResponse (known, expected error type) and UnexpectedResponse (unknown error type) here. My thought was that a load balancer or similar might return an error that doesn't follow the schema—not a problem we'd generally have for our internal services.
But on second thought, an unknown error type is more general than that, so we should just drop this comment. The right thing to do is to check all the variants anyway.
| // We expect this to be a 2xx status, but that assumption is a | ||
| // bit fragile. is_retryable_status returns false for 2xx status | ||
| // codes anyway, so check for that. |
There was a problem hiding this comment.
Can you explain this fragility? Or is it more like: these should be 2xx but there's no harm in checking if we happen to have a status code
There was a problem hiding this comment.
Right, this. Can attempt to clarify this in the comment.
There was a problem hiding this comment.
Done, and in particular talked about the situations where they might not be 2xx statuses (since this is an enum, something outside progenitor could create it—I'm pretty sure I've written tests of that kind)
Created using spr 1.3.6-beta.1
Note this check is somewhat broader than the one currently in Omicron. That one only checks for 503 and 429, while the check here is a bit more complete based on HTTP semantics: