Skip to content

fix(platform-wallet): fail-closed on registration persist error (Found-017) [backport]#3659

Open
lklimek wants to merge 3 commits into
v3.1-devfrom
fix/found-017-v3.1-dev
Open

fix(platform-wallet): fail-closed on registration persist error (Found-017) [backport]#3659
lklimek wants to merge 3 commits into
v3.1-devfrom
fix/found-017-v3.1-dev

Conversation

@lklimek
Copy link
Copy Markdown
Contributor

@lklimek lklimek commented May 18, 2026

Semantic backport to v3.1-dev of the #3549-proven Found-017 fail-closed fix.

Root cause: register_wallet logged the registration persister.store(...) Err and fell through, then committed in-memory state and returned Ok. The wallet worked for the session but was silently lost on restart — write-acknowledged-but-not-durable.

Fix: on the registration store() Err, keep the tracing::error!, re-acquire the wallet-manager write guard, remove_wallet(&wallet_id) to roll back the in-memory insert_wallet, and return Err(WalletCreation(...)). Fail-closed, mirroring the same function's existing load_persisted / initialize_from_persisted rollback idiom. Reuses the existing WalletCreation variant — no new variant, no error-type churn.

The v3.1-dev region diverged ~62 lines from origin/feat so a textual cherry-pick would not apply; re-applied by hand as a semantic transform. Single remove_wallet fully unwinds — only insert_wallet is committed at the store() point (self.wallets.insert is downstream).

Validation: correctness already proven on #3549 (deterministic network-free pin + positive guard, QA-PASS). v3.1-dev has no found_017 e2e pin by design; here: cargo check --tests clean, clippy --lib -D warnings clean, cargo test --lib 121 passed / 0 failed, rustfmt clean. Pin coverage stays #3549-only.

Sibling PR to #3549.

🤖 Generated with Claude Code

Summary by CodeRabbit

  • Bug Fixes
    • Enhanced wallet registration error handling to ensure storage failures are properly detected and reported, with automatic rollback of incomplete operations to maintain wallet state consistency.

Review Change Stack

…ound-017) [backport]

Semantic backport of the #3549-proven Found-017 fail-closed fix.
Region diverged on v3.1-dev so the change was re-applied by hand.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai Bot commented May 18, 2026

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: c5c69e59-b3cb-4804-974a-f05ae8b482f1

📥 Commits

Reviewing files that changed from the base of the PR and between 9f3bfa3 and 40177b4.

📒 Files selected for processing (1)
  • packages/rs-platform-wallet/src/manager/wallet_lifecycle.rs

📝 Walkthrough

Walkthrough

This PR tightens error handling in wallet registration by ensuring persistence failures are truly atomic. When wallet storage fails, the in-memory insert is rolled back and an error is returned, rather than silently continuing. Documentation is updated to reflect this load-bearing semantics.

Changes

Wallet Registration Persistence Error Handling

Layer / File(s) Summary
Persistence error handling and rollback
packages/rs-platform-wallet/src/manager/wallet_lifecycle.rs
Atomicity semantics are documented as load-bearing: store failures roll back the in-memory wallet insert and abort registration. Implementation now removes the wallet from wallet_manager and returns PlatformWalletError::WalletCreation on persistence failure instead of only logging and continuing.

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~10 minutes

Poem

A wallet once lost in the store's failing arm,
Now rolls back with grace, no digital harm.
Atomic and sure, the registration stays true—
One hop, one flip, all persisted right through. 🐰✨

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title 'fix(platform-wallet): fail-closed on registration persist error (Found-017) [backport]' directly and accurately describes the main change: fixing the wallet registration to fail when persistence fails rather than silently continuing.
Docstring Coverage ✅ Passed Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
📝 Generate docstrings
  • Create stacked PR
  • Commit on current branch
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch fix/found-017-v3.1-dev

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@github-actions github-actions Bot added this to the v3.1.0 milestone May 18, 2026
@lklimek lklimek changed the title fix(rs-platform-wallet): fail-closed on registration persist error (Found-017) [backport] fix(platform-wallet): fail-closed on registration persist error (Found-017) [backport] May 19, 2026
@lklimek lklimek moved this to In Progress in Platform team May 21, 2026
@lklimek lklimek self-assigned this May 21, 2026
@lklimek lklimek marked this pull request as ready for review May 21, 2026 08:14
@lklimek lklimek requested a review from QuantumExplorer as a code owner May 21, 2026 08:14
@lklimek lklimek requested a review from shumkov May 21, 2026 08:15
@lklimek lklimek moved this from In Progress to In review / testing in Platform team May 21, 2026
@codecov
Copy link
Copy Markdown

codecov Bot commented May 21, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 87.15%. Comparing base (9f3bfa3) to head (40177b4).

Additional details and impacted files
@@             Coverage Diff              @@
##           v3.1-dev    #3659      +/-   ##
============================================
- Coverage     87.16%   87.15%   -0.01%     
============================================
  Files          2607     2606       -1     
  Lines        319420   319221     -199     
============================================
- Hits         278413   278216     -197     
+ Misses        41007    41005       -2     
Components Coverage Δ
dpp 87.67% <ø> (ø)
drive 85.95% <ø> (ø)
drive-abci 89.60% <ø> (ø)
sdk ∅ <ø> (∅)
dapi-client ∅ <ø> (∅)
platform-version ∅ <ø> (∅)
platform-value 92.17% <ø> (ø)
platform-wallet ∅ <ø> (∅)
drive-proof-verifier 49.16% <ø> (ø)
🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

@github-actions
Copy link
Copy Markdown
Contributor

github-actions Bot commented May 21, 2026

✅ DashSDKFFI.xcframework built for this PR.

SwiftPM (host the zip at a stable URL, then use):

.binaryTarget(
  name: "DashSDKFFI",
  url: "https://your.cdn.example/DashSDKFFI.xcframework.zip",
  checksum: "8148ea041180f5244c720499314a8d1ccebdf4714fe3ee287e3bd6f44163e7db"
)

Xcode manual integration:

  • Download 'DashSDKFFI.xcframework' artifact from the run link above.
  • Drag it into your app target (Frameworks, Libraries & Embedded Content) and set Embed & Sign.
  • If using the Swift wrapper package, point its binaryTarget to the xcframework location or add the package and place the xcframework at the expected path.

Copy link
Copy Markdown
Collaborator

@thepastaclaw thepastaclaw left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

Surgical backport that aligns the persist-error path with the two existing rollback idioms in register_wallet (lines 327-334 and 343-348). The new branch correctly unwinds the just-inserted wallet_manager entry before returning WalletCreation, preserving the fail-closed invariant required by Found-017. No correctness or consensus issues; remaining notes are about regression coverage and a now-tripled duplication pattern.

Note: Inline review posting hit GitHub HTTP 422, so these verified findings were posted in the top-level review body.

Reviewed commit: e7fb1d5

🟡 1 suggestion(s) | 💬 2 nitpick(s)

3 additional finding(s)

suggestion: No in-crate regression test pins the registration persist-failure rollback

packages/rs-platform-wallet/src/manager/wallet_lifecycle.rs (line 281)

The new branch is the substance of the Found-017 fix, but the v3.1-dev backport ships without a unit test exercising it. The PR description explicitly defers pin coverage to #3549 on a sibling branch, leaving this code path protected only by reviewer memory on v3.1-dev. A targeted regression test that injects a PlatformWalletPersistence stub whose store returns Err and asserts (a) register_wallet returns Err(PlatformWalletError::WalletCreation(_)) and (b) the wallet_id is absent from wallet_manager after the call (e.g. a retry with insert_wallet succeeds) would lock the invariant directly onto v3.1-dev. Without it, a future refactor that reverts the rollback to the prior log-and-continue shape, or that adds another insert_wallet-style side effect between lines 217-280 not covered by this unwind, would silently regress Found-017 on this branch without CI catching it.

nitpick: Three near-identical write-lock + remove_wallet rollback blocks in one function

packages/rs-platform-wallet/src/manager/wallet_lifecycle.rs (line 281)

After this PR, register_wallet contains the same two-line rollback (acquire wallet_manager.write().await, call remove_wallet, discard the Result) three times — at 287-288, 328-329, and 343-344 — each preceding a WalletCreation error return. The duplication itself was pre-existing, but adding a third occurrence raises the cost of any future failure point added between insert and the final commit: missing one unwind step is exactly the Found-017 failure mode. A small private async helper (e.g. rollback_inserted_wallet(&self, wallet_id)) called as self.rollback_inserted_wallet(wallet_id).await; return Err(...); at each site would centralize the invariant. Pure refactor, no semantic change — probably out of scope for a backport, but worth doing on the forward branch.

nitpick: Rollback discards remove_wallet Result silently

packages/rs-platform-wallet/src/manager/wallet_lifecycle.rs (line 288)

let _ = wm.remove_wallet(&wallet_id); discards a Result. In this exact call site wallet_id was just returned by the immediately preceding wm.insert_wallet, so a NotFound is essentially impossible today. The discard also matches the established convention at lines 329 and 344. Flagging only because if remove_wallet ever grows fallible side-effects (event-bus deregistration, secondary index cleanup) the rollback would become a silent partial unwind across all three sites in lockstep. A tracing::error! on Err here would convert that into a loud failure without altering the public contract. Not blocking; existing convention is defensible.

🤖 Prompt for all review comments with AI agents
These findings are from an automated code review. Verify each finding against the current code and only fix it if needed.

- [SUGGESTION] In `packages/rs-platform-wallet/src/manager/wallet_lifecycle.rs`:281-293: No in-crate regression test pins the registration persist-failure rollback
  The new branch is the substance of the Found-017 fix, but the v3.1-dev backport ships without a unit test exercising it. The PR description explicitly defers pin coverage to #3549 on a sibling branch, leaving this code path protected only by reviewer memory on v3.1-dev. A targeted regression test that injects a `PlatformWalletPersistence` stub whose `store` returns `Err` and asserts (a) `register_wallet` returns `Err(PlatformWalletError::WalletCreation(_))` and (b) the wallet_id is absent from `wallet_manager` after the call (e.g. a retry with `insert_wallet` succeeds) would lock the invariant directly onto v3.1-dev. Without it, a future refactor that reverts the rollback to the prior log-and-continue shape, or that adds another `insert_wallet`-style side effect between lines 217-280 not covered by this unwind, would silently regress Found-017 on this branch without CI catching it.
- [NITPICK] In `packages/rs-platform-wallet/src/manager/wallet_lifecycle.rs`:281-348: Three near-identical write-lock + remove_wallet rollback blocks in one function
  After this PR, `register_wallet` contains the same two-line rollback (acquire `wallet_manager.write().await`, call `remove_wallet`, discard the Result) three times — at 287-288, 328-329, and 343-344 — each preceding a `WalletCreation` error return. The duplication itself was pre-existing, but adding a third occurrence raises the cost of any future failure point added between insert and the final commit: missing one unwind step is exactly the Found-017 failure mode. A small private async helper (e.g. `rollback_inserted_wallet(&self, wallet_id)`) called as `self.rollback_inserted_wallet(wallet_id).await; return Err(...);` at each site would centralize the invariant. Pure refactor, no semantic change — probably out of scope for a backport, but worth doing on the forward branch.
- [NITPICK] In `packages/rs-platform-wallet/src/manager/wallet_lifecycle.rs`:288-288: Rollback discards remove_wallet Result silently
  `let _ = wm.remove_wallet(&wallet_id);` discards a `Result`. In this exact call site `wallet_id` was just returned by the immediately preceding `wm.insert_wallet`, so a NotFound is essentially impossible today. The discard also matches the established convention at lines 329 and 344. Flagging only because if `remove_wallet` ever grows fallible side-effects (event-bus deregistration, secondary index cleanup) the rollback would become a silent partial unwind across all three sites in lockstep. A `tracing::error!` on `Err` here would convert that into a loud failure without altering the public contract. Not blocking; existing convention is defensible.

Copy link
Copy Markdown
Member

@QuantumExplorer QuantumExplorer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No... I don't think this is correct... a usecase would be if someone has their wallet in their keychain, but the app is crashing on startup... we don't want to delete their wallet (risk of loosing funds).

@Claudius-Maginificent
Copy link
Copy Markdown
Collaborator

@QuantumExplorer Thanks for flagging — traced the call graph to verify the concern, and I don't think the scenario is reachable here. Detail:

remove_wallet only fires from register_wallet, which has exactly two callers (both creation-only, in manager/wallet_lifecycle.rs):

  • create_wallet_from_mnemonic (L88)
  • create_wallet_from_seed_bytes (L113)

Loading goes through a separate path. load_from_persistor inserts wallets directly via wm.insert_wallet() and never calls register_wallet, so a wallet already on disk (or in keychain → previously persisted) cannot enter this error arm on startup.

remove_wallet is in-memory only — it clears WalletManager, self.wallets, the shielded coordinator entry, and the identity sync registration. No persister write, no keychain delete. Since persister.store() just failed, there's nothing successfully persisted to roll back anyway.

Worst real consequence: user re-attempts wallet creation (re-enters the seed they typed seconds ago). No funds-loss risk — the fresh wallet was never persisted and the user still holds the seed.

The keychain-crash scenario you described would need a "load-from-keychain → re-register" path; that doesn't exist in this codebase (there's no keychain-load API that invokes register_wallet).

Happy to add an inline code comment documenting this invariant ("register_wallet only runs for first-time creation; loaded wallets bypass this arm") so the concern doesn't resurface in future review — say the word.

🤖 Co-authored by Claudius the Magnificent AI Agent

@lklimek
Copy link
Copy Markdown
Contributor Author

lklimek commented May 25, 2026

@QuantumExplorer on top of what Claudius investigated, I would add that from user perspective, it is very important to fail on save, other approach can lead to data loss (user thinks data is persisted but it's not), and inconsistency.

@lklimek lklimek requested a review from QuantumExplorer May 25, 2026 10:47
Copy link
Copy Markdown
Collaborator

@thepastaclaw thepastaclaw left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

The current head keeps the new fail-closed rollback on store() failure, but register_wallet is still not fully transactional: once persistence succeeds, later restore failures can still return Err while leaving a durably stored wallet behind. The branch also still lacks a local regression test for the new persist-failure rollback path. No new valid findings were introduced by the e7fb1d52d1932954b1fc5b1cbce585a8bf21b767..40177b4870ce2e73f3c6d4c143ec21c929251f71 delta; the new shielded-wallet unregister logic in remove_wallet looks correct.

_Note: Inline posting failed (command failed (1): python3 scripts/review_poster.py dashpay/platform 3659 40177b4
STDOUT:

STDERR:
Traceback (most recent call last):
File "/Users/claw/.openclaw/workspace/scripts/review_poster.py", line 138, in
result = post_review(repo, pr_number, h), so I posted the same verified findings as a top-level review body._

Reviewed commit: 40177b4

🟡 2 suggestion(s)

2 additional finding(s)

suggestion: `register_wallet` can return `Err` after leaving the wallet durably persisted

packages/rs-platform-wallet/src/manager/wallet_lifecycle.rs (line 281)

register_wallet persists the registration changeset before it runs the second-phase restore work (load_persisted() and initialize_from_persisted(...)). If either of those later steps fails, the code only removes the wallet from the in-memory manager and then returns WalletCreation, but there is no compensating delete in the persister API. That means callers can observe an apparent registration failure even though the wallet metadata and account snapshot were already committed and will be rehydrated on the next load_from_persistor(). This is a real partial-commit bug, not just a cleanup nit: the API reports all-or-nothing semantics, but the durable side effects are only partially rolled back.

suggestion: The new persist-failure rollback path still has no branch-local regression test

packages/rs-platform-wallet/src/manager/wallet_lifecycle.rs (line 281)

This branch now removes the just-inserted wallet when self.persister.store(...) fails, but there is still no in-crate test that forces PlatformWalletPersistence::store to error and then proves a retry succeeds instead of hitting WalletAlreadyExists. Because this backport carries behavior that is different from the old branch and the only cited coverage lives elsewhere, the fix is still protected by reviewer memory rather than a local regression test.

🤖 Prompt for all review comments with AI agents
These findings are from an automated code review. Verify each finding against the current code and only fix it if needed.

- [SUGGESTION] In `packages/rs-platform-wallet/src/manager/wallet_lifecycle.rs`:281-347: `register_wallet` can return `Err` after leaving the wallet durably persisted
  `register_wallet` persists the registration changeset before it runs the second-phase restore work (`load_persisted()` and `initialize_from_persisted(...)`). If either of those later steps fails, the code only removes the wallet from the in-memory manager and then returns `WalletCreation`, but there is no compensating delete in the persister API. That means callers can observe an apparent registration failure even though the wallet metadata and account snapshot were already committed and will be rehydrated on the next `load_from_persistor()`. This is a real partial-commit bug, not just a cleanup nit: the API reports all-or-nothing semantics, but the durable side effects are only partially rolled back.
- [SUGGESTION] In `packages/rs-platform-wallet/src/manager/wallet_lifecycle.rs`:281-293: The new persist-failure rollback path still has no branch-local regression test
  This branch now removes the just-inserted wallet when `self.persister.store(...)` fails, but there is still no in-crate test that forces `PlatformWalletPersistence::store` to error and then proves a retry succeeds instead of hitting `WalletAlreadyExists`. Because this backport carries behavior that is different from the old branch and the only cited coverage lives elsewhere, the fix is still protected by reviewer memory rather than a local regression test.

@lklimek lklimek added the ready for final review Ready for the final review. If AI was involved in producing this PR, it has already had a reviewer. label May 25, 2026
@Claudius-Maginificent
Copy link
Copy Markdown
Collaborator

@thepastaclaw Non-atomicity finding confirmed — persister.store() does commit before the restore phase, and steps 3/4 failures do leave an orphan record. Tracked as #3736 with the failure-mode matrix and three fix shapes; deferring out of this backport per the surgical-scope rationale.

🤖 Co-authored by Claudius the Magnificent AI Agent

@thepastaclaw
Copy link
Copy Markdown
Collaborator

Thanks for tracing this. That matches my read: #3659 covers the specific -failure path (no durable commit, so in-memory rollback is enough), while the restore-phase orphan case is a broader partial-commit bug after a successful durable write. #3736 is the right follow-up for that.

I’m fine with keeping the restore-phase fix out of this surgical backport; from my side the remaining branch-local regression-test note is non-blocking.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

ready for final review Ready for the final review. If AI was involved in producing this PR, it has already had a reviewer.

Projects

Status: In review / testing

Development

Successfully merging this pull request may close these issues.

5 participants