Add support for custom election tick#9725
Conversation
|
Hi, just checking in on this PR since it has been waiting for review for about 4 days. Could you please take a look when you get a chance? I would appreciate any feedback or approval so I can move it forward. Thank you. |
| func NewNode(rc *pb.RaftContext, store *raftwal.DiskStorage, tlsConfig *tls.Config, | ||
| electionTick int) *Node { | ||
|
|
||
| if electionTick <= 0 { |
There was a problem hiding this comment.
The current guard only handles electionTick <= 0. But HeartbeatTick is hardcoded to 1 just below, and etcd raft requires ElectionTick > HeartbeatTick. If an operator sets election-tick=1, Config.validate()
returns "election tick must be greater than heartbeat tick" and newRaft panics during StartNode a cryptic crash on boot that never mentions the flag they set.
Since this is the only place a raft.Config is built (every Alpha node and Zero flow through NewNode), the heartbeat floor of 1 applies everywhere, so election-tick can never legally be 1. Better to fail fast with
a clear message than let raft panic
There was a problem hiding this comment.
Good catch, this has been addressed in the latest commit.
Added validation at lines 91–94:
const heartbeatTick = 1 // 100ms per tick
if electionTick <= 0 {
electionTick = 20
}
if electionTick <= heartbeatTick {
glog.Fatalf(
"election-tick=%d is invalid: must be greater than heartbeat-tick (%d). "+
"Recommended minimum is 10 (1s election timeout).",
electionTick,
heartbeatTick,
)
}Now, if a user sets:
--raft "election-tick=1"the process fails fast during startup with a clear validation error instead of encountering a cryptic Raft panic later during initialization.
Thanks for reviewing, let me know if any changes/explaination req.
There was a problem hiding this comment.
On second thought, do you think it would make sense to add a stricter fail-safe and reject election tick values below 10 altogether? Since the recommendation is already 10 (1s election timeout), allowing smaller values may not be particularly useful and could lead to unstable configurations.
Similarly, would it be worth introducing an upper bound as well (for example, around 24 hours) to prevent accidentally misconfigured values resulting in extremely long election timeouts?
I'd be interested in your thoughts on both of these. Do you prefer keeping the validation minimal (only ensuring it's greater than the heartbeat tick), or would you favor enforcing a practical operating range for election tick values?
There was a problem hiding this comment.
Hi @LakshimiRam-073 , Thanks for adding the validation. The fail-fast check on electionTick <= heartbeatTick makes sense to me since that's a hard correctness requirement anyway (raft.Config.validate() will panic if it's violated).
For the two follow-up questions, I'd personally keep the validation fairly minimal and avoid enforcing a practical range.
For the lower bound of 10, I don't think we should reject smaller values. One of the main reasons for exposing this flag is to give operators control over election timing, and there are valid use cases for running below 10. For example, integration/CI tests often benefit from much faster failover, single-node development setups don't have the same stability concerns, and some low-latency deployments may intentionally optimize for sub-second failover. etcd itself only requires electionTick > heartbeatTick. Also, since we run with PreVote: true, we're already protected from one of the more problematic failure modes where a flaky follower can repeatedly disrupt a healthy leader.
I'd also avoid adding an upper bound. Any limit we pick—24 hours or otherwise is ultimately arbitrary. If someone configures an extremely large election timeout, they've effectively decided they want leader changes to be very rare or even handled manually. That's unusual, but it could still be intentional. Hard limits in cases like this often end up causing more frustration than value.
That said, I don't love silently accepting very small values either. Since raft randomizes the timeout in [electionTick, 2*electionTick), setting electionTick=2 means an election can start after missing only a couple of heartbeats. At that point, something as simple as a GC pause or brief network hiccup could trigger unnecessary leader elections.
Instead of rejecting those values, I'd lean toward logging a warning:
if electionTick < 10 {
glog.Warningf("election-tick=%d gives a %dms election timeout. Values below 10 (1s) "+
"may cause spurious leader elections under GC pauses or network jitter.",
electionTick, electionTick*100)
}That way we preserve flexibility while still making operators aware of the tradeoffs.
There was a problem hiding this comment.
Updated in commit fix(raft): warn on negative and low election-tick values (view changes).
I kept the hard correctness check on electionTick <= heartbeatTick, and preserved flexibility for smaller valid values by logging a warning instead of rejecting them. Negative values now warn and fall back to the default, while 0 continues to mean unset/default.
panic with a cryptic. ( election tick > heartbeat issue)
|
|
||
| on: | ||
| pull_request: | ||
| pull_request_target: |
There was a problem hiding this comment.
A change to this file is scope-creep. Please revert, open a different PR with rationale for this change if you want.
There was a problem hiding this comment.
Reverted. Will open a separate PR with rationale if we want to pursue that change.
- Split negative and zero handling: negative values log a warning and default to 20; zero silently defaults to 20 (unset/zero-value case) - Return warning string from normalizeElectionTick instead of two bools - Warn (not reject) when election-tick < 10 per reviewer guidance
Reverting scope-creep changes to .github/workflows/labeler.yml. Will open a separate PR with rationale if needed. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
b71204e to
7071d3b
Compare
| var warning string | ||
| electionTick, warning = normalizeElectionTick(electionTick) | ||
| if warning != "" { | ||
| glog.Warningf(warning) |
There was a problem hiding this comment.
The formatted version is not needed (and should be failing govet pass)... please just use glog.Warning
| recommendedElectionTick = 10 | ||
| ) | ||
|
|
||
| func normalizeElectionTick(electionTick int) (tick int, warning string) { |
There was a problem hiding this comment.
The election-tick=1 handling in normalizeElectionTick is inconsistent with how the function treats every other invalid value, and I'd pull it into line.
Right now a negative tick coerces to the default 20 and logs a warning, but electionTick == 1 calls glog.Fatalf and takes the process down. Those are both equally-invalid inputs, so it's odd that -100 gets a friendly default while a single fat-fingered 1 is fatal. An operator who typos election-tick=1 shouldn't lose the node over it.
The Fatalf also blocks the test. The table covers 0, -1, -100, 2, 10, 20, 100 but skips 1, and it has to: exercising that path would os.Exit the test binary. So the one genuinely-fatal branch is the one branch with no coverage.
The intent behind it is right. Failing with a clear message beats letting etcd/raft panic with something cryptic. But coercing 1 to the default and warning, exactly like the negative case, gets you that clear message while keeping the behavior consistent, fully unit-testable, and crash-free on a typo.
Suggest dropping the glog.Fatalf branch and folding 1 into the same coerce-and-warn path as the negatives.
Description
This PR adds
election-tickas a configurable sub-option to the existing--raftsuperflag for both Alpha and Zero nodes.Currently,
ElectionTickis hardcoded to20inconn/node.go:91(equivalent to 2 seconds at the 100ms tick interval). This makes it impossible for operators to control leader election priority across nodes deployed in heterogeneous network topologies -- for example, nodes spread across different network segments with varying inter-node RTTs (1-2ms within the same segment vs. 10-50ms across segments), or nodes behind different network switches with asymmetric bandwidth, latency, and jitter characteristics.Changes:
worker/server_state.go— Addedelection-tick=20toRaftDefaultsstring (preserves backward compatibility)conn/node.go— Readelection-tickfromx.WorkerConfig.Raftconfig instead of hardcoded valueelection-tick >= 10 * HeartbeatTickper etcd/raft library requirementsUse case: In deployments where Alpha nodes span different network segments with asymmetric latency, operators can now bias leadership toward low-latency nodes:
This ensures the Raft leader stays in the segment closest to application servers, avoiding unnecessary write latency degradation from suboptimal leader placement after elections.
Default behavior is unchanged — without specifying
election-tick, nodes use the previous value of20(2 second timeout).Checklist
Conventional Commits syntax, leading
with
fix:,feat:,chore:,ci:, etc.docs repo staged and linked here.