
Persistent MonitorEvents #4491

Draft

valentinewallace wants to merge 16 commits into lightningdevkit:main from valentinewallace:2026-03-persistent-mon-evs

Conversation

@valentinewallace (Contributor)

As part of #4482, we're looking into changing our architecture -- currently, the Channel{Manager} is responsible for managing the resolution of off-chain HTLCs, and the ChannelMonitor is responsible for them once they're on-chain. See the issue description for details; there's complexity that results from this split.

Quoting the issue, "Instead, we want to do all resolution in ChannelMonitors (in response to ChannelMonitorUpdates) and pass them back to ChannelManager in the form of MonitorEvents (similar to how HTLCs are resolved after channels are closed). In order to have reliable resolution, we'll need to keep MonitorEvents around in the ChannelMonitor until the ChannelManager has finished processing them - adding a new MonitorEvent resolution path through a new method (rather than via ChannelMonitorUpdates)."

Here we don't add any new MonitorEvents but add support for making the ones we do have persistent across restarts until they are explicitly ack'd, via a new chain::Watch API.

  • update commit messages
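The persist-until-ack flow described above can be sketched roughly as follows. All names here (`ToyMonitor`, `release_pending`, `ack`) are illustrative stand-ins, not the actual chain::Watch API being added in this PR:

```rust
// Hypothetical sketch: a monitor keeps events until the manager
// explicitly acknowledges them by ID, so unacked events (persisted as
// part of the monitor) are re-provided after a restart.

#[derive(Clone, Debug, PartialEq)]
struct MonitorEvent {
    id: u64,
    payload: String,
}

#[derive(Default)]
struct ToyMonitor {
    next_event_id: u64,
    pending: Vec<MonitorEvent>,
}

impl ToyMonitor {
    fn push_event(&mut self, payload: &str) {
        let id = self.next_event_id;
        self.next_event_id += 1;
        self.pending.push(MonitorEvent { id, payload: payload.into() });
    }

    /// Hand pending events to the caller without forgetting them; only an
    /// explicit ack (below) removes them, so a crash before the ack means
    /// the same events are provided again on restart.
    fn release_pending(&self) -> Vec<MonitorEvent> {
        self.pending.clone()
    }

    /// Called once the ChannelManager has durably processed the events.
    fn ack(&mut self, ids: &[u64]) {
        self.pending.retain(|ev| !ids.contains(&ev.id));
    }
}
```

The key property is that `release_pending` is idempotent until `ack` runs, which is what makes resolution reliable across crashes.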

@ldk-reviews-bot

👋 Hi! I see this is a draft PR.
I'll wait to assign reviewers until you mark it as ready for review.
Just convert it out of draft status when you're ready for review!

match channel_monitor.pending_monitor_events.iter().find(|ev| match ev {
// Strip IDs from pending_monitor_events for backward-compatible TLV field 5.
// TODO: can we remove the legacy monitor event variant?
let pending_monitor_events: Vec<MonitorEvent> = match channel_monitor
Collaborator

ref please

(35, channel_monitor.is_manual_broadcast, required),
(37, channel_monitor.funding_seen_onchain, required),
(39, channel_monitor.next_monitor_event_id, required),
(41, monitor_event_ids, required_vec),
Collaborator

ISTM we could do this more simply by just switching based on the persistent_events_enabled flag - if it's set, write the full pending_monitor_events to a new even TLV; if it's not, store the old thing.

Contributor Author

I'm fine with this, but I don't think it's simpler because MonitorEvent is MaybeReadable and there isn't a Writeable impl for tuples with elements that are MaybeReadable, just Readable. So we'd need some kind of read wrapper or something. Wdyt?
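The "read wrapper" idea mentioned above might look something like the following toy sketch. The traits here are deliberately simplified stand-ins (not LDK's actual Readable/MaybeReadable machinery): a wrapper whose read always succeeds but yields an `Option`, so unknown variants can be skipped when collecting:

```rust
// Toy traits, not LDK's: wrap a maybe-readable element so a container
// of wrappers can be read unconditionally and the `None`s (unknown
// future variants) filtered out afterwards.

trait MaybeReadable: Sized {
    fn maybe_read(byte: u8) -> Option<Self>;
}

struct ReadWrapper<T>(Option<T>);

impl<T: MaybeReadable> ReadWrapper<T> {
    fn read(byte: u8) -> Self {
        ReadWrapper(T::maybe_read(byte))
    }
}

#[derive(Debug, PartialEq)]
enum ToyEvent { A, B }

impl MaybeReadable for ToyEvent {
    fn maybe_read(byte: u8) -> Option<Self> {
        match byte {
            0 => Some(ToyEvent::A),
            1 => Some(ToyEvent::B),
            _ => None, // unknown variant from a newer version: skip it
        }
    }
}

fn read_events(bytes: &[u8]) -> Vec<ToyEvent> {
    bytes.iter().filter_map(|b| ReadWrapper::<ToyEvent>::read(*b).0).collect()
}
```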

next_monitor_event_id: u64,
/// Tracks which monitor events have already been returned to the caller during this runtime
/// session. In-memory only — on restart this is empty, so all unacked events are re-provided.
provided_monitor_event_ids: HashSet<u64>,
Collaborator

bleh, just move the provided events into a new vec.
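The suggestion of a separate vec rather than an ID set could look roughly like this (illustrative sketch, not LDK code): events move from `pending` into a `provided` vec when handed out, both vecs are persisted, and only an ack drops them for good:

```rust
// Sketch: two persisted vecs instead of an in-memory-only HashSet of
// already-provided IDs. On restart, everything in both vecs is unacked
// and gets re-provided.

#[derive(Clone, Debug, PartialEq)]
struct Event { id: u64 }

#[derive(Default)]
struct Monitor {
    pending: Vec<Event>,
    provided: Vec<Event>,
}

impl Monitor {
    /// Hand out pending events, remembering them in `provided` so they
    /// survive a crash until acked.
    fn release_pending(&mut self) -> Vec<Event> {
        let out = std::mem::take(&mut self.pending);
        self.provided.extend(out.iter().cloned());
        out
    }

    fn ack(&mut self, ids: &[u64]) {
        self.provided.retain(|ev| !ids.contains(&ev.id));
    }

    fn unacked(&self) -> usize {
        self.pending.len() + self.provided.len()
    }
}
```

This avoids re-deriving "already returned this session" from a set membership check: the event's current vec *is* its state.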

FailHTLC {
htlc_id: u64,
err_packet: msgs::OnionErrorPacket,
monitor_event_id: Option<(u64, ChannelId)>,
Collaborator

Is the code cleaner to have some type for this? Maybe just a MonitorUpdateCompletionAction but could also be some MonitorEventSource?
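One reading of the suggestion (the type and field names below are hypothetical, not anything in LDK) is a small named struct in place of the bare `(u64, ChannelId)` tuple, so the variant documents what the pair means:

```rust
// Hypothetical newtype wrapping the (event_id, channel_id) pair.

#[derive(Clone, Copy, Debug, PartialEq, Eq)]
struct ChannelId([u8; 32]);

/// Identifies the persisted MonitorEvent that produced this update step,
/// so it can be acked back to the right ChannelMonitor afterwards.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
struct MonitorEventSource {
    monitor_event_id: u64,
    channel_id: ChannelId,
}

enum HTLCUpdate {
    // err_packet omitted for brevity in this sketch
    FailHTLC {
        htlc_id: u64,
        source: Option<MonitorEventSource>,
    },
}
```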

(17, in_flight_monitor_updates, option),
(19, peer_storage_dir, optional_vec),
(21, async_receive_offer_cache, (default_value, async_receive_offer_cache)),
(23, persistent_monitor_events, (default_value, false)),
Collaborator

Given the logic isn't complete, should we write this as an even tlv on the write side and deliberately not implement read for it yet?

Contributor Author

Do you mean use an even tlv and conditionally write it only if it's set to true? That makes sense to me

Contributor Author

Is this the plan for upgrades/downgrades?

  • Until ~everything's in place, use even TLVs for things but don't write them in prod, and error in prod if they are set on read
  • Then have a version or two that allows the even TLVs to be set on read, but doesn't write them, to allow downgrades
  • Then have a version that allows the even TLVs to be missing, runs the legacy reconstruction logic once, then sets persistent_monitor_events in the manager/monitors at the end of manager read so even TLVs will be henceforth written
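The odd/even TLV behavior this plan relies on can be illustrated with a toy serializer (type numbers and field names below are made up, not the PR's actual TLVs): readers must understand even types but may skip unknown odd ones, so writing the new state under an even type only when persistent_monitor_events is set means old versions refuse to load state they can't handle, while state written without the flag still downgrades cleanly:

```rust
// Toy illustration of the odd/even TLV convention.

fn write_tlvs(persistent_monitor_events: bool) -> Vec<(u64, Vec<u8>)> {
    // Odd type: optional, old readers skip it if unknown.
    let mut tlvs: Vec<(u64, Vec<u8>)> = vec![(21, b"cache".to_vec())];
    if persistent_monitor_events {
        // Even type: old readers will reject state containing this.
        tlvs.push((22, vec![1]));
    }
    tlvs
}

/// An "old" reader: skips unknown odd types, errors on unknown even ones.
fn old_reader(tlvs: &[(u64, Vec<u8>)]) -> Result<(), &'static str> {
    for (typ, _val) in tlvs.iter() {
        match *typ {
            21 => {}              // known field
            t if t % 2 == 1 => {} // unknown odd: safe to skip
            _ => return Err("unknown required (even) TLV"),
        }
    }
    Ok(())
}
```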
