Create new background task for probing, add its initialization to the builder. Add new trait ProbingStrategy with two default implementations. The total amount of sats in current non-finished probes is tracked; it is changed on ProbeSuccessful and ProbeFailed events.
There are 3 probing tests:
- check that the probing locked liquidity amount changes
- test that new probes are not fired if the max locked liquidity is reached (for example if one of the nodes goes down)
- probing performance test which sets up 4 probing nodes (random strategy, high-degree without penalty, high-degree with penalty, control node without probing). A network of several nodes is set up; these nodes make payments to one another and probing nodes observe their behaviour. Next the scorer estimates are printed out.
Scorer channel estimates are exposed for the purposes of the test. Scoring parameters (probing penalty) are exposed to be set up during node building. Random probing strategy constructs the maximal possible route (up to the set limit) instead of failing when the next hop is not possible to construct.
Probing locked amount includes hops' fees, not only the last hop. Remove usage of cursor in favor of AtomicUsize for pseudorandom pick of channels for random strategy. Add check that min_probe amount is indeed smaller than max_probe_amount. Add liquidity_limit_multiplier to probing config instead of hardcoded None value.
Remove Prober::run method in favor of distinct function. Tidy probing tests.
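As a rough illustration of two of the changes listed above (the atomic counter for pseudorandom channel picks and the min/max amount check), here is a minimal sketch. The names `ProbeAmounts` and `ChannelPicker`, the field names, and the scrambling constant are assumptions made for this example and are not taken from the PR.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};

/// Hypothetical probe amount bounds; field names are illustrative only.
#[derive(Debug, Clone, Copy)]
struct ProbeAmounts {
    min_probe_amount_msat: u64,
    max_probe_amount_msat: u64,
}

impl ProbeAmounts {
    /// Rejects configurations where the minimum is not strictly below the maximum.
    fn new(min: u64, max: u64) -> Result<Self, String> {
        if min >= max {
            return Err(format!("min ({}) must be smaller than max ({})", min, max));
        }
        Ok(Self { min_probe_amount_msat: min, max_probe_amount_msat: max })
    }
}

/// Picks channel indices from a shared atomic counter, so `&self` is enough
/// and no mutable cursor has to be threaded through the strategy.
struct ChannelPicker {
    counter: AtomicUsize,
}

impl ChannelPicker {
    fn new(seed: usize) -> Self {
        Self { counter: AtomicUsize::new(seed) }
    }

    /// Returns an index in `0..len`, or `None` when there is nothing to pick from.
    fn pick(&self, len: usize) -> Option<usize> {
        if len == 0 {
            return None;
        }
        // `fetch_add` wraps around on overflow, so the counter never panics.
        let n = self.counter.fetch_add(1, Ordering::Relaxed);
        // Cheap multiplicative scramble (an assumption of this sketch) so that
        // consecutive picks are spread out rather than strictly sequential.
        Some(n.wrapping_mul(0x9E37_79B9) % len)
    }
}
```

A strategy could call `picker.pick(channels.len())` on every hop and stop extending the route when it returns `None`.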
👋 Thanks for assigning @tnull as a reviewer!
🔔 1st Reminder Hey @tnull! This PR has been waiting for your review.
Hi @randomlogin, thanks for the work on this! I've reviewed the first two commits:
I've left a bunch of inline comments addressing configuration and public API, commit hygiene, testing infrastructure, and test flakiness.
In summary:
- A couple of items are exposed publicly that seem like they should be scoped to probing or gated for tests only (see scoring_fee_params in Config and scorer_channel_liquidity on Node).
- The probing tests duplicate existing test helpers (setup_node, MockLogFacadeLogger). Reusing and extending what's already in tests/common/ would reduce duplication and keep the test file focused on the tests themselves.
- test_probe_budget_blocks_when_node_offline has a race condition where the prober dispatches probes before the baseline capacity is measured, causing the assertion between the baseline and stuck capacities to fail. Details in the inline comment.
- A few nits about commit hygiene, import structure, and suggestions for renaming stuff.
Also needs to be rebased.
src/builder.rs
Outdated
| use crate::payment::asynchronous::om_mailbox::OnionMessageMailbox; | ||
| use crate::peer_store::PeerStore; | ||
| use crate::probing; | ||
| use crate::probing::ProbingStrategy; |
nit: ProbingStrategy is imported but never used because it is qualified at all usage sites. You could either remove it and use the qualified form, or do the opposite. The rest of builder.rs follows the convention of importing types and using short names and I suggest doing the same here just for consistency.
|
|
||
| use std::sync::atomic::{AtomicU64, Ordering}; | ||
| use std::sync::{Arc, Mutex}; | ||
| use std::time::Duration; |
nit: Import ordering doesn't match the codebase's convention. The rest of the repo follows: std -> external deps -> crate (see e.g. builder.rs, event.rs) but here it's reversed. Please reorder to match.
| pub struct HighDegreeStrategy { | ||
| network_graph: Arc<Graph>, | ||
| /// How many of the highest-degree nodes to cycle through. | ||
| pub top_n: usize, |
Could top_n be renamed to num_top_nodes? The latter reads less generic to me but up to you to modify or not.
src/probing.rs
Outdated
| let graph = self.network_graph.read_only(); | ||
|
|
||
| // Collect (pubkey, channel_count) for all nodes. | ||
| // wtf it does why we need to iterate here and then sort? maybe we can go just once? |
Here, and in other locations, we have what looks like leftover development notes. Could you fix-up its removal into this commit so it doesn't end up in git history?
| async_payments_role: None, | ||
| pathfinding_scores_sync_config, | ||
| recovery_mode, | ||
| probing_strategy: None, |
nit: Most optional fields here are initialized as local variables before the struct literal (e.g. let chain_data_source_config = None;). Could probing_strategy follow the same pattern for consistency? While at it, async_payments_role has the same inconsistency. Would be nice to align both.
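A minimal sketch of the suggested pattern, using a hypothetical `BuilderSketch` struct with placeholder `Option<String>` fields (the real builder has many more fields and different types):

```rust
/// Hypothetical stand-in for the builder; types are placeholders.
#[derive(Debug, Default, PartialEq)]
struct BuilderSketch {
    chain_data_source_config: Option<String>,
    async_payments_role: Option<String>,
    probing_strategy: Option<String>,
}

impl BuilderSketch {
    fn new() -> Self {
        // Bind every optional field to a local first...
        let chain_data_source_config = None;
        let async_payments_role = None;
        let probing_strategy = None;
        // ...so the struct literal uses field-init shorthand throughout.
        Self { chain_data_source_config, async_payments_role, probing_strategy }
    }
}
```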
| } | ||
| } | ||
|
|
||
| // helpers |
There are a lot of helpers defined here. Could we move the general-purpose ones to tests/common/ and the probing-specific to tests/common/probing.rs to keep this file focused on the tests? For example, configure_chain_source duplicates the chain source setup already in setup_node (tests/common/mod.rs).
| .set_probing_interval(Duration::from_millis(PROBING_INTERVAL_MILLISECONDS)) | ||
| .set_max_probe_locked_msat(MAX_LOCKED_MSAT); | ||
| build_and_start(builder, config) | ||
| } |
build_and_start and configure_chain_source (and other build_node_* functions) duplicate logic that setup_node (chain source setup, store selection, custom logger, node start) already handles. Could we augment setup_node / TestConfig to support probing configuration and reuse it here instead?
|
|
||
| /// Test change of locked_msat amount | ||
| #[tokio::test(flavor = "multi_thread", worker_threads = 2)] | ||
| async fn test_probe_budget_increments_and_decrements() { |
We can drop the test_ prefix since #[tokio::test] already marks it as one.
|
|
||
| /// Test that probing stops if the upper locked in flight probe limit is reached | ||
| #[tokio::test(flavor = "multi_thread", worker_threads = 2)] | ||
| async fn test_probe_budget_blocks_when_node_offline() { |
How about renaming to exhausted_probe_budget_blocks_new_probes and then updating the documentation to highlight how exhaustion occurs, i.e. node going offline?
| .find(|ch| ch.counterparty_node_id == node_b.node_id()) | ||
| .map(|ch| ch.outbound_capacity_msat) | ||
| .expect("A→B channel not found after B stopped"); | ||
| assert!( |
This assertion fails with:
HTLC not visible in channel state: capacity unchanged (988999000 msat)
I think this is because the prober dispatches a probe every 250ms in the background as soon as a viable path exists, which happens before initial_capacity is read. By the time the test reads initial_capacity, a probe is already in-flight, so the value is already reduced. The wait_until resolves quickly because locked_msat is already > 0. Thus, by the time we stop node_b, the same probe is still in-flight, and stuck_capacity reads the same value as initial_capacity. Possibly why the assertion fails.
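To make the suspected ordering problem concrete, here is a toy model with no real node involved. `ToyChannel`, `measure`, and the amounts are hypothetical and exist only for this illustration; it shows why a baseline read after the first probe is dispatched ends up equal to the stuck capacity.

```rust
/// Toy channel: outbound capacity drops while a probe HTLC is in flight.
struct ToyChannel {
    capacity_msat: u64,
    in_flight_msat: u64,
}

impl ToyChannel {
    fn outbound_capacity_msat(&self) -> u64 {
        self.capacity_msat - self.in_flight_msat
    }

    fn dispatch_probe(&mut self, amount_msat: u64) {
        self.in_flight_msat += amount_msat;
    }
}

/// Returns `(baseline, stuck)` capacities for a given ordering of events.
fn measure(read_baseline_first: bool) -> (u64, u64) {
    let mut ch = ToyChannel { capacity_msat: 1_000_000, in_flight_msat: 0 };
    let baseline = if read_baseline_first {
        // Intended order: measure, then let the prober fire.
        let b = ch.outbound_capacity_msat();
        ch.dispatch_probe(10_000);
        b
    } else {
        // Racy order: the background prober fires before the test measures.
        ch.dispatch_probe(10_000);
        ch.outbound_capacity_msat()
    };
    // The peer goes offline here, so the probe stays in flight.
    let stuck = ch.outbound_capacity_msat();
    (baseline, stuck)
}
```

In the racy ordering `baseline == stuck`, which matches the "capacity unchanged" failure message; one possible fix is to read the baseline before the prober can start, or to compare against a value that does not depend on probe timing.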
Added a probing service which is used to send probes to estimate channels' capacities.
Related issue: #765.
Probing is intended to be used in two ways:
For probing a new abstraction Prober is defined and is (optionally) created during node building.
Prober periodically sends probes to feed the data to the scorer.
Prober sends probes using a ProbingStrategy.
ProbingStrategy trait has only one method: fn next_probe(&self) -> Option<Probe>; every tick it generates a probe, where Probe represents how to send a probe.
To accommodate two different ways the probing is used, we either construct a probing route manually (Probe::PrebuiltRoute) or rely on the router/scorer (Probe::Destination).
Prober tracks how much liquidity is locked in-flight in probes and prevents new probes from firing if the cap is reached.
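The shape described above can be sketched roughly as follows. Only `ProbingStrategy`, `next_probe`, `Probe::PrebuiltRoute` and `Probe::Destination` come from the description; the variant payloads, `FixedDestination`, `ProberSketch`, `tick` and `on_probe_resolved` are simplified assumptions for this example, not the PR's actual types.

```rust
use std::sync::atomic::{AtomicU64, Ordering};

/// Simplified stand-in: the real `Probe` carries route and node-id types.
#[derive(Debug, Clone, PartialEq)]
enum Probe {
    /// A manually constructed route; `total_msat` includes hop fees.
    PrebuiltRoute { hops: Vec<String>, total_msat: u64 },
    /// Let the router/scorer find a path to the destination.
    Destination { node: String, amount_msat: u64 },
}

impl Probe {
    /// Amount this probe would lock while in flight (assumed accounting).
    fn locked_msat(&self) -> u64 {
        match self {
            Probe::PrebuiltRoute { total_msat, .. } => *total_msat,
            Probe::Destination { amount_msat, .. } => *amount_msat,
        }
    }
}

trait ProbingStrategy {
    fn next_probe(&self) -> Option<Probe>;
}

/// Trivial strategy used only for this sketch.
struct FixedDestination {
    node: String,
    amount_msat: u64,
}

impl ProbingStrategy for FixedDestination {
    fn next_probe(&self) -> Option<Probe> {
        Some(Probe::Destination { node: self.node.clone(), amount_msat: self.amount_msat })
    }
}

struct ProberSketch<S: ProbingStrategy> {
    strategy: S,
    locked_msat: AtomicU64,
    max_locked_msat: u64,
}

impl<S: ProbingStrategy> ProberSketch<S> {
    /// One tick: ask the strategy for a probe; skip it if the cap would be exceeded.
    /// The load-then-add pair is not atomic as a whole, which is acceptable here
    /// because the sketch assumes a single background task calls `tick`.
    fn tick(&self) -> Option<Probe> {
        let probe = self.strategy.next_probe()?;
        let amount = probe.locked_msat();
        if self.locked_msat.load(Ordering::Acquire) + amount > self.max_locked_msat {
            return None;
        }
        self.locked_msat.fetch_add(amount, Ordering::AcqRel);
        Some(probe)
    }

    /// Called when a probe resolves, whether it succeeded or failed.
    fn on_probe_resolved(&self, amount_msat: u64) {
        self.locked_msat.fetch_sub(amount_msat, Ordering::AcqRel);
    }
}
```

With a 1000 msat cap and 600 msat probes, the first `tick` dispatches, the second is blocked, and a later `on_probe_resolved(600)` frees the budget again.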
There are two probing strategies implemented:
- Random probing strategy: it picks a random route from the current node. The route is probed via send_probe, so it ignores scoring parameters (what hops to pick); it also ignores liquidity_limit_multiplier, which prohibits taking a hop if its capacity is too small. It is a true random route.
- High degree probing strategy: it examines the graph, finds the nodes with the biggest number of (public) channels and probes routes to them using send_spontaneous_preflight_probes, which uses the current router/scorer.
The former is meant to be used on payment nodes, while the latter on probing nodes. For the HighDegreeStrategy to work it is recommended to set probing_diversity_penalty_msat to some nonzero value to prevent route reuse; however, it may fail to find any available routes.
There are three tests added:
Example output (runs for ~1 minute, needs --nocapture flag):
For performance testing I had to expose the scoring data (scorer_channel_liquidity).
Also exposed scoring_fee_params: ProbabilisticScoringFeeParameters to Config.
TODOs: