anounciness metric for peer selection#53
Conversation
6a60f59 to
24d0083
Compare
c2b2518 to
1efece4
Compare
bca4d1a to
142a39e
Compare
Adds a property test that verifies `announciness peerMetric` agrees with a standalone pure model at every metric-state snapshot emitted during the simulation. Key design choices: - `runSigSubmissionV2WithMetric` (IOSim-specific, leaves the generic `runSigSubmissionV2` untouched) wires a real `PeerMetric` into each inbound peer and registers a `traceTVarIO` callback on the metric TVar. - Both peer events (`TraceTxInboundReceivedTxIds`, `TraceTxInboundAddedToMempool`) and metric snapshots are emitted as a single `SimTraceEvent = Either SimInboundEvent SimMetricSnapshot` type. A single `selectTraceEventsDynamicWithTime` call then extracts them in their original in-thread order, preserving the `sig_add → snapshot → sig_add → snapshot` interleaving that two separate extraction passes would destroy. - The pure model (`PureModelState`, `updatePureModel`, `expectedAnnounciness`) is decoupled from the implementation and mirrors `reportSigImpl` independently. - `peerMetricVar` (field accessor) exported from `PeerMetric` for testing; constructor remains unexported. - `TraceTxInboundReceivedTxIds` trace added at both sigid-receipt sites in `sigSubmissionInbound`. - `SimPeerAddr ~ Int` enforced via type equality so a type change causes a compile error rather than silently empty traces. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
There was a problem hiding this comment.
Pull request overview
This PR introduces an “announciness” peer metric used by the peer selection policy to demote hot peers that are slower to announce signatures, and wires metric reporting into the sig-submission inbound client. It also adds property tests around the new metric and updates a couple CI formatting scripts for broader fd compatibility.
Changes:
- Add a new
DMQ.Diffusion.PeerSelection.PeerMetricmodule (announciness metric + STM state + reporting hooks) and integrate it into NodeKernel / NodeToNode and peer selection demotion. - Extend
sigSubmissionInboundto record sig-id announcement times and report accepted/rejected outcomes into the metric. - Add QuickCheck / IOSim property tests validating metric behavior and update CI scripts to use
fdfind/fd.
Reviewed changes
Copilot reviewed 17 out of 17 changed files in this pull request and generated 3 comments.
Show a summary per file
| File | Description |
|---|---|
| scripts/ci/run-stylish-haskell.sh | Use fdfind/fd variable instead of hardcoding fd. |
| scripts/ci/run-nixpkgs-fmt.sh | Same fdfind/fd resolution for nix formatting. |
| dmq-node/test/Test/DMQ/SigSubmission/App.hs | Adds an IOSim property that cross-checks metric snapshots against a pure model; wires metrics into the simulated inbound. |
| dmq-node/test/Test/DMQ/PeerSelection/PeerMetric.hs | New property tests for the metric’s pure implementation (pruning, announciness, competing peers, etc.). |
| dmq-node/test/Main.hs | Registers the new peer-metric test group. |
| dmq-node/src/DMQ/SigSubmissionV2/Inbound.hs | Threads per-peer local metric state; records sig-id receipt time; reports mempool results into the metric. |
| dmq-node/src/DMQ/Protocol/SigSubmission/Type.hs | Fixes a typo in a comment. |
| dmq-node/src/DMQ/Policy.hs | Adds peerMetricConfiguration (1h window). |
| dmq-node/src/DMQ/NodeToNode.hs | Wires metric reporting into sig-submission inbound; erases peer metric state on disconnect via finally. |
| dmq-node/src/DMQ/Diffusion/PeerSelectionPolicy.hs | Uses announciness scores to bias hot peer demotion (lower score demoted first). |
| dmq-node/src/DMQ/Diffusion/PeerSelection/PeerMetric.hs | New metric implementation module (local state, global state, reporting, announciness extraction). |
| dmq-node/src/DMQ/Diffusion/NodeKernel/Types.hs | Adds peerMetric to NodeKernel. |
| dmq-node/src/DMQ/Diffusion/NodeKernel.hs | Instantiates the shared peerMetric in newNodeKernel. |
| dmq-node/dmq-node.cabal | Exposes new modules and adds psqueues dependency; includes new tests. |
| dmq-node/changelog.d/20260512_150257_coot_sig_metric.md | Adds a changelog fragment describing the new feature. |
| dmq-node/app/Main.hs | Switches to PeerSelectionPolicy module and passes NodeKernel’s peerMetric into policy. |
| cabal.project | Pins ouroboros-network from a specific git tag and adds an allow-newer entry. |
Comments suppressed due to low confidence (1)
dmq-node/src/DMQ/Diffusion/PeerSelection/PeerMetric.hs:246
- As implemented, credit is only assigned when a peer’s signature is accepted (
reportSigImpluses that peer’sLocalPeerMetricStateto determine the announcement time). This does not (by itself) implement “give a score to the peer that announced it first” across all peers unless the system guarantees that the accepting peer is always the earliest announcer. Either clarify this in the PR description/docs, or adjust the metric to track the earliest announcement globally across peers and use that when a signature becomes valid.
reportSigImpl
:: forall sigid peeraddr.
Ord sigid
=> PeerMetricConfiguration
-> LocalPeerMetricState sigid
-> TraceLabelPeer peeraddr (sigid, TxMempoolResult)
-> PeerMetricState sigid peeraddr
-> (LocalPeerMetricState sigid, PeerMetricState sigid peeraddr)
reportSigImpl
config
localState
(TraceLabelPeer _peeraddr (sigid, TxRejected))
metricState
=
( localState'
, prune config mbTime metricState
)
where
localState' :: LocalPeerMetricState sigid
(mbTime, localState') = unreportSigId sigid localState
reportSigImpl
config
localState
(TraceLabelPeer peeraddr (sigid, TxAccepted))
st
=
( localState'
, PeerMetricState { metricState = metricState' }
)
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| set -euo pipefail | ||
|
|
||
| # First, try to find the 'fd' command | ||
| FD="$(which fdfind 2>/dev/null || which fd 2>/dev/null)" |
|
|
||
| fd -e nix -X nixpkgs-fmt | ||
| # First, try to find the 'fd' command | ||
| FD="$(which fdfind 2>/dev/null || which fd 2>/dev/null)" |
| -- | An internal function which prunes the `PeerMetricState` keeping only | ||
| -- `timeWindowToKeep` of entries. | ||
| -- | ||
| -- Note: we can trust the time, since it's coming from the host | ||
| -- `getMonotonicTime`. | ||
| prune :: Ord sigid | ||
| => PeerMetricConfiguration | ||
| -> Maybe Time | ||
| -- ^ time when sigid was inserted to `LocalPeerMetricState` | ||
| -> PeerMetricState sigid peeraddr | ||
| -> PeerMetricState sigid peeraddr | ||
| prune PeerMetricConfiguration { timeWindowToKeep } | ||
| mbTime | ||
| PeerMetricState { | ||
| metricState | ||
| } | ||
| = PeerMetricState { | ||
| metricState = | ||
| case mbTime of | ||
| Nothing -> metricState | ||
| Just t -> snd $ OrdPSQ.atMostView ((-timeWindowToKeep) `addTime` t) metricState | ||
| } |
There was a problem hiding this comment.
If the protocol becomes stale, we preserve the current information until connectivity resumes, which is a feature not a bug.
This change would require changes in PeerSelectionPolicy API: ight now each policy runs in STM m rather than m.
List of changes
This PR introduces announcyness metric for peer selection which is used to
decide which hot peers we should demote.
For each peer we count how many valid signatures were announced before any
other peer announced it. This is implemented into two steps: first we account
at which time a peer announce a given signature id, then once we receive the
corresponding signature, and it turns out to be valid, we give a score of one
to the peer that announced it first. Each score is kept for an hour (the same
as churn rate).
Checklist