Deferred follow-ups for spec.genesis.overrides (v1 shipped — 8 items with un-defer triggers) #272

@bdchatham

Problem

The v1 implementation of SeiNodeDeployment.spec.genesis.overrides shipped in PRs sei-protocol/seictl#181, sei-protocol/seictl#182 (release v0.0.50), and #270. The feature works end-to-end: a 4-node test cluster can boot with staking.params.unbonding_time = 600s to unblock SIP-3 (EVM treasury contract) testing.

The coral design round + post-implementation review identified eight follow-up items that were deliberately deferred. Each has an explicit un-defer trigger so future-us can self-evaluate whether the cost has crossed the threshold to warrant work.

This issue is a single tracking surface so the deferred slices don't get lost between sessions.

Impact

If left untracked, these items recur as ad-hoc rediscoveries: a teammate hits a key-path typo, files a fresh issue against the symptom, and the design history sits in PR comments nobody reads. Documenting now lets:

  • Future contributors read one place to understand the v2 expansion surface.
  • Anyone hitting an un-defer trigger find the prior analysis instantly instead of redesigning from scratch.
  • Reviewers triaging similar requests (e.g., "can we add typed StakingParams?") point at this issue rather than re-explain.

No active customer impact today — v1 unblocks SIP-3. Cost of inaction is silent design rot.

Relevant experts

  • platform-engineer — CRD shape, developer experience, named-preset layering
  • kubernetes-specialist — webhook validator, controller plan, sidecar contract integrity
  • blockchain-developer — cosmos-sdk ValidateGenesis() integration, sidecar internals

Deferred items (with un-defer triggers)

Controller / CRD side (sei-k8s-controller)

  1. Typed module structs (StakingParams, GovParams, …) in the CRD. v1 uses map[string]apiextensionsv1.JSON with dotted snake_case keys — flexible but no autocomplete, no kubectl explain discoverability, typos surface late.
    Un-defer when: 4+ teammates use overrides, OR a 2nd common module pattern emerges that's repeatedly copy-pasted across manifests.

  2. Webhook shape validator. Validates dotted snake_case key shape + JSON round-trip at admission time, before the sidecar receives the request.
    Un-defer when: sidecar errors are discovered too late (someone wastes >30 min chasing a key-path typo).

  3. Named presets (e.g., genesis.preset: "fast-test"). A named profile that bundles common timer-shrink values for test clusters in one field, instead of 12 lines of overrides per cluster.
    Un-defer when: 3+ test-cluster recipes get copy-pasted across manifests.

Sidecar side (seictl)

  1. Cosmos-level ValidateGenesis() after override application. Currently the sidecar applies overrides verbatim; the per-module cosmos ValidateGenesis() runs only during InitChain at seid start, so bad values (negative duration, malformed coin) surface only there.
    Un-defer when: a test breaks at first-boot from a semantically-invalid override that should've been caught earlier.
    Notes: requires wiring the cosmos module manager into the sidecar purely for validation.

  2. Atomic genesis write (temp file + rename). addMissingGenesisAccounts → writeBackAuthAndBank → ExportGenesisFile uses os.WriteFile (truncate-then-write). Mid-write crash leaves a corrupted genesis.json; the marker isn't written so retry runs, but the corrupt file is read on the very first step and the task wedges until manual cleanup. Pre-existing pattern, not introduced by the v1 overrides PR.
    Un-defer when: anyone hits a wedged-genesis incident.

  3. Typo-detection logging in sidecar. staking.params.unbondingTime (camelCase typo) currently writes an orphan sibling key in the genesis JSON. Cosmos silently ignores unknown fields. As defense-in-depth, the sidecar could log "applied N overrides; M leaf-keys did not pre-exist" without changing semantics.
    Un-defer when: an operator hits this silent-success failure mode.

  4. Brittle null check in setNestedRaw at sidecar/tasks/genesis_overrides.go:87. string(raw) != "null" doesn't trim whitespace, so a value such as " null" is not recognized as null. Replace with !bytes.Equal(bytes.TrimSpace(raw), []byte("null")).
    Un-defer when: the next change to this file lands (cosmetic, batch-fix-friendly).

  5. setNestedRaw re-marshals every intermediate level — O(depth × bytes) JSON ops per override. Invisible at current scale; could matter if someone applies large object overrides (whole voting_params trees).
    Un-defer when: a measurable perf hit appears.

Out of scope

  • Per-node genesis variation. v1 picked SND-level dispatch (overrides go into the genesis.json that all peers download). Per-node variation would be a different feature, not an evolution of this one. File separately if a real use case emerges.
  • Post-genesis (gov-proposal-style) param changes. v1's spec.genesis is CEL-immutable after creation. Changing params on a running chain requires a cosmos governance proposal, which is sei-chain's responsibility, not this controller's.
  • Override surface beyond app_state (e.g., consensus_params, validators array). v1 walks into app_state[<module>] only. Other top-level genesis fields would need separate plumbing.
