Bug report
Description:
When a subgraph encounters a non-deterministic error (e.g., DatabaseUnavailable),
the unfail mechanism only attempts once. If that first attempt occurs before
the subgraph has processed past the error block, it returns UnfailOutcome::Noop,
but the should_try_unfail_non_deterministic flag is still set to false, so the
unfail is never retried.
As a result, the subgraph remains permanently in the Failed state even though it
continues indexing successfully.
Reproduction:
- Subgraph encounters DatabaseUnavailable at block N
- Database recovers, subgraph restarts from checkpoint at block N-3
- First unfail attempt happens at block N-3 (< N), returns Noop
- Flag is set to false, so the unfail is never retried
- Subgraph continues indexing to N+1000, but health remains "failed"
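The steps above can be modeled with a small, self-contained simulation. This is only a sketch of the reported behavior, not the actual runner.rs code: the Deployment struct, the Health enum, simulate_current_behavior, and the block numbers are hypothetical, and the condition "head >= error_block" is an assumed stand-in for "the subgraph has processed past the error block".

```rust
// Hypothetical, simplified model of the reported behavior (not graph-node code).
#[derive(Debug, PartialEq, Clone, Copy)]
enum UnfailOutcome {
    Noop,
    Unfailed,
}

#[derive(Debug, PartialEq, Clone, Copy)]
enum Health {
    Healthy,
    Failed,
}

struct Deployment {
    health: Health,
    error_block: u64,
}

impl Deployment {
    // Assumption: the error can only be cleared once the head has reached
    // the block where the error was recorded.
    fn unfail_non_deterministic_error(&mut self, head: u64) -> UnfailOutcome {
        if self.health == Health::Failed && head >= self.error_block {
            self.health = Health::Healthy;
            UnfailOutcome::Unfailed
        } else {
            UnfailOutcome::Noop
        }
    }
}

// Indexes blocks start..=end and returns the final health together with
// the number of unfail attempts that were made.
fn simulate_current_behavior(error_block: u64, start: u64, end: u64) -> (Health, u32) {
    let mut deployment = Deployment { health: Health::Failed, error_block };
    let mut should_try_unfail_non_deterministic = true;
    let mut attempts = 0;
    for head in start..=end {
        if should_try_unfail_non_deterministic {
            attempts += 1;
            let _outcome = deployment.unfail_non_deterministic_error(head);
            // Reported behavior: the flag is cleared regardless of the outcome.
            should_try_unfail_non_deterministic = false;
        }
    }
    (deployment.health, attempts)
}

fn main() {
    // Error at block N = 100, restart from N-3 = 97, index up to N+1000.
    let (health, attempts) = simulate_current_behavior(100, 97, 1100);
    println!("health = {:?}, unfail attempts = {}", health, attempts);
}
```

In this model the single attempt happens at block 97, returns Noop, and the deployment stays Failed for the rest of the run. If the restart happens at or after block N instead, the same single attempt succeeds, which would explain why the problem only shows up when the checkpoint is behind the error block.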
Evidence:
Log showing the issue:
INFO Subgraph error is still ahead of deployment head, nothing to unfail,
error_block_range: (Included(392332788), Unbounded),
block_number: 392332785
Location:
core/src/subgraph/runner.rs:996
Suggested Fix:
Only set should_try_unfail_non_deterministic = false when the outcome is
UnfailOutcome::Unfailed; keep it true on UnfailOutcome::Noop so that the unfail
is retried on the next block.
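A minimal sketch of the suggested flag handling, assuming the outcome is available at the point where the flag is updated. The helper names try_unfail, next_flag, and first_unfailed_block are hypothetical and do not exist in graph-node; the "head >= error_block" check is an assumed simplification of the store logic.

```rust
#[derive(Debug, PartialEq, Clone, Copy)]
enum UnfailOutcome {
    Noop,
    Unfailed,
}

// Hypothetical stand-in for the store call. Assumption: the error is
// cleared once the head has reached the block where it was recorded.
fn try_unfail(error_block: u64, head: u64) -> UnfailOutcome {
    if head >= error_block {
        UnfailOutcome::Unfailed
    } else {
        UnfailOutcome::Noop
    }
}

// Suggested rule: stop trying only after a successful unfail.
fn next_flag(outcome: UnfailOutcome) -> bool {
    match outcome {
        UnfailOutcome::Unfailed => false,
        UnfailOutcome::Noop => true,
    }
}

// Indexes blocks start..=end and returns the block at which the
// deployment was unfailed, if that happened at all.
fn first_unfailed_block(error_block: u64, start: u64, end: u64) -> Option<u64> {
    let mut should_try_unfail_non_deterministic = true;
    for head in start..=end {
        if should_try_unfail_non_deterministic {
            let outcome = try_unfail(error_block, head);
            should_try_unfail_non_deterministic = next_flag(outcome);
            if outcome == UnfailOutcome::Unfailed {
                return Some(head);
            }
        }
    }
    None
}

fn main() {
    // Block numbers taken from the log above: the error starts at
    // 392332788 and the first attempt happens at 392332785.
    let result = first_unfailed_block(392_332_788, 392_332_785, 392_332_790);
    println!("unfailed at: {:?}", result);
}
```

With this rule the attempt is repeated on every block until the head catches up with the error block, so in the logged case it would cost a few extra Noop attempts (three under the assumption above) before the deployment is marked healthy again. Whether a per-block retry is acceptable for deployments that are far behind the error block is a design question for the maintainers.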
Relevant log output
IPFS hash
No response
Subgraph name or link to explorer
No response
Some information to help us out
OS information
None