refactor(chain,core)!: replace CanonicalIter with sans-IO CanonicalTask + ChainQuery trait#2038
refactor(chain,core)!: replace CanonicalIter with sans-IO CanonicalTask + ChainQuery trait#2038oleonardolima wants to merge 16 commits intobitcoindevkit:masterfrom
CanonicalIter with sans-IO CanonicalTask + ChainQuery trait#2038Conversation
d851ba6 to
c02636d
Compare
c02636d to
78c0538
Compare
677e25a to
9e27ab1
Compare
evanlinjin
left a comment
There was a problem hiding this comment.
Great work.
This is my initial round of reviews.
Are you planning to introduce topological ordering in a separate PR?
| let chain_tip = chain.tip().block_id(); | ||
| let task = graph.canonicalization_task(chain_tip, Default::default()); | ||
| let canonical_view = chain.canonicalize(task); |
There was a problem hiding this comment.
What do you think about the following naming:
CanonicalizationTask->CanonicalResolver.TxGraph::canonicalization_task->TxGraph::resolver.LocalChain::canonicalize->LocalChain::resolve.
There was a problem hiding this comment.
I've been thinking about this, I agree with the CanonicalResolver, though the TxGraph::resolver and LocalChain::resolve seems a bit off.
What do you think about (?):
- CanonicalResolver
- TxGraph::canonical_resolver
- LocalChain::canonical_resolve
There was a problem hiding this comment.
Alright, this one is a bit outdated.
As of 031de40 we have:
CanonicalizationTask->CanonicalTask; and also newCanonicalViewTask.TxGraph::canonicalization_task->TxGraph::canonical_task.LocalChain::canonicalizeis still the same, though we now also haveLocalChain::canonical_view.
I'm fine with these names for now, though if there's no consensus on those we can discuss/change in a follow-up PR.
9e27ab1 to
f6c8b02
Compare
a276c37 to
b7f8fba
Compare
The new API introduces a sans-io behavior, separating the canonicalization logic from `I/O` operations, it should be used as follows: 1. Create a new `CanonicalizationTask` with a `TxGraph`, by calling: `graph.canonicalization_task(params)` 2. Execute the canonicalization process with a chain oracle (e.g `LocalChain`, which implements `ChainOracle` trait), by calling: `chain.canonicalize(task, chain_tip)` - Replace `CanonicalView::new()` constructor with internal `CanonicalView::new()` for use by `CanonicalizationTask` - Remove `TxGraph::try_canonical_view()` and `TxGraph::canonical_view()` methods - Add `TxGraph::canonicalization_task()` method to create canonicalization tasks - Add `LocalChain::canonicalize()` method to process tasks and return `CanonicalView`'s - Update `IndexedTxGraph` to delegate canonicalization to underlying `TxGraph` BREAKING CHANGE: Remove `CanonicalView::new()` and `TxGraph::canonical_view()` methods in favor of task-based approach
- Adds `CanonicalReason`, `ObservedIn`, and `CanonicalizationParams` to `canonical_task.rs` module, instead of using the ones from `canonical_iter.rs`. - Removes the `canonical_iter.rs` file and its module declaration. BREAKING CHANGE: `CanonicalIter` and all its exports are removed
…icalizationTask` Introduce a new `ChainQuery` trait in `bdk_core` that provides an interface for query-based operations against blockchain data. This trait enables sans-IO patterns for algorithms that need to interact with blockchain oracles without directly performing I/O. The `CanonicalizationTask` now implements this trait, making it more composable and allowing the query pattern to be reused for other blockchain query operations. - Add `ChainQuery` trait with associated types for Request, Response, Context, and Result - Implement `ChainQuery` for `CanonicalizationTask` with `BlockId` as context BREAKING CHANGE: `CanonicalizationTask::finish()` now requires a `BlockId` parameter Co-Authored-By: Claude <noreply@anthropic.com>
Make `ChainRequest`/`ChainResponse` generic over block identifier types to enable reuse beyond BlockId. Move `chain_tip` into `ChainRequest` for better encapsulation and simpler API. - Make `ChainRequest` and `ChainResponse` generic types with `BlockId` as default - Add `chain_tip` field to `ChainRequest` to make it self-contained - Change `ChainQuery` trait to use generic parameter `B` for block identifier type - Remove `chain_tip` parameter from `LocalChain::canonicalize()` method - Rename `ChainQuery::Result` to `ChainQuery::Output` for clarity BREAKING CHANGE: - `ChainRequest` now has a `chain_tip` field and is generic over block identifier type - `ChainResponse` is now generic with default type parameter `BlockId` - `ChainQuery` trait now takes a generic parameter `B = BlockId` - `LocalChain::canonicalize()` no longer takes a `chain_tip` parameter Co-authored-by: Claude <noreply@anthropic.com> refactor(chain): make `LocalChain::canonicalize()` generic over `ChainQuery` Allow any type implementing `ChainQuery` trait instead of requiring `CanonicalizationTask` specifically. Signed-off-by: Leonardo Lima <oleonardolima@users.noreply.github.com>
- Unify both `unprocessed_anchored_txs` and `pending_anchored_txs` in a single `unprocessed_anchored_txs` queue. - Changes the `unprocessed_anchored_txs from `Iterator` to `VecDeque`. - Removes the `pending_anchored_txs` field and it's usage. - Collects all `anchored_txs` upfront instead of lazy iteration.
- Add new `CanonicalStage` enum for tracking the different canonicalization phases/stages. - Add new `try_advance()` method for stage progression. - Add new `is_transitive()` helper to `CanonicalReason`. - Change internal `confirmed_anchors` to `direct_anchors` for better clarity. - Update the `resolve_query()` to handle staged-based processing. Co-authored-by: Claude <noreply@anthropic.com>
3906047 to
f6b2565
Compare
Inline all stage-processing logic into `next_query()`, removing the separate `try_advance()` method, `process_*_txs()` helpers, and `is_finished()` from the `ChainQuery` trait. Add `AssumedTxs` as an explicit first stage and `CanonicalStage::advance()` for centralized stage transitions. Document the `ChainQuery` protocol contract.
…`Canonical<A, P>` Separate concerns by splitting `CanonicalizationTask` into two phases: 1. `CanonicalTask` determines which transactions are canonical and why (`CanonicalReason`), outputting `CanonicalTxs<A>`. 2. `CanonicalViewTask` resolves reasons into `ChainPosition`s (confirmed vs unconfirmed), outputting `CanonicalView<A>`. Make `Canonical<A, P>`, `CanonicalTx<P>`, and `FullTxOut<P>` generic over the position type so the same structs serve both phases. Add `LocalChain::canonical_view()` convenience method for the common two-step pipeline. Renames: `CanonicalizationTask` -> `CanonicalTask`, `CanonicalizationParams` -> `CanonicalParams`, `canonicalization_task()` -> `canonical_task()`, `FullTxOut::chain_position` -> `FullTxOut::pos`. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…Query::tip()` The chain tip is constant for the lifetime of a query, so it belongs on the trait rather than being redundantly copied into every request. `ChainRequest` is now a type alias for `Vec<B>`. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…ChainResponse` These types only ever used `BlockId`, so the generic parameter added unnecessary complexity. All three are now hardcoded to `BlockId`. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Assumed transactions bypass the `AnchoredTxs` stage and are marked
canonical immediately with `CanonicalReason::Assumed`. Previously,
`view_task()` only queued anchor checks for transitive txs, so directly
assumed txs (`Assumed { descendant: None }`) were never checked and
always resolved to `Unconfirmed` even when they had confirmed anchors.
Queue all `Assumed` txs for anchor checks in `view_task()` and look up
`direct_anchors` for both `Assumed` variants in `finish()`.
Fixes bitcoindevkit#2088
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…onical_view_task.rs` Move shared types (`CanonicalTx`, `Canonical`, `CanonicalView`, `CanonicalTxs`) and convenience methods into `canonical.rs`. Keep only the phase-2 task (`CanonicalViewTask`) in `canonical_view_task.rs`. Also rename `FullTxOut` to `CanonicalTxOut` and move it to `canonical.rs`. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
|
Can we get rid of Edit: Removed in 37eb136 |
CanonicalIter with sans-io CanonicalizationTaskCanonicalIter with sans-IO CanonicalTask + ChainQuery trait
|
I've also updated the PR description. |
I'm fine with removing it in the same PR, thanks for the commit.
Great, thank you! |
|
|
||
| // Determine chain position based on reason | ||
| let chain_position = match reason { | ||
| CanonicalReason::Assumed { .. } => match self.direct_anchors.get(txid) { |
There was a problem hiding this comment.
You seem to be ignoring the descendant, but it may be the case that the descendant has a direct anchor, making this tx Confirmed transitively.
In case the reason is Assumed, to determine the ChainPosition you would want to:
- First look for a direct anchor of the current tx. If a direct anchor exists, the position must be Confirmed (and transitively None)
- Otherwise look at the descendant txid. If the descendant has a direct anchor, then the position is Confirmed transitively by the descendant's anchor
- If the descendant has no direct anchor, then the position is Unconfirmed. Likewise Unconfirmed if there's no anchor and no descendant
There was a problem hiding this comment.
You seem to be ignoring the descendant, but it may be the case that the descendant has a direct anchor, making this tx Confirmed transitively.
I think this depends of the priority we give to the different CanonicalReasons. As it is expressed right now, CanonicalReason::Asumed has higher priority than the others.
In case the reason is Assumed, to determine the ChainPosition you would want to:
- First look for a direct anchor of the current tx. If a direct anchor exists, the position must be Confirmed (and transitively None)
In case we raise CanonicalReason::Anchor in the hierarchy, then is as easy as to make AssumedTxs also return a query for these txids.
- Otherwise look at the descendant txid. If the descendant has a direct anchor, then the position is Confirmed transitively by the descendant's anchor
Again, in case we raise CanonicalReason::Anchor in the hierarchy, this case is partially covered by mark_canonical when walking up the Ancestor queue:
|_: usize, tx: Arc<Transaction>| -> Option<Txid> {
let this_txid = tx.compute_txid();
let this_reason = if is_starting_tx {
is_starting_tx = false;
reason.clone()
} else {
// This is an ancestor being marked transitively
reason.to_transitive(starting_txid)
};
use crate::collections::hash_map::Entry;
let canonical_entry = match self.canonical.entry(this_txid) {
// Already visited tx before, exit early.
Entry::Occupied(_) => return None, // Here we can compare and replace `CanonicalReason::Assumed` by `this_reason` if `this_reason` is `CanonicalReason::Anchor`.
Entry::Vacant(entry) => entry,
};
- If the descendant has no direct anchor, then the position is Unconfirmed. Likewise Unconfirmed if there's no anchor and no descendant
Are you proposing a different category here? Or with Unconfirmed you refer to CanonicalReason::Assumed?
There was a problem hiding this comment.
@nymius I think you're conflating two different things here. ValuedMammal's comment is about phase 2 position resolution (ChainPosition), not about phase 1 canonicality priority (CanonicalReason).
By the time we're in CanonicalViewTask::finish(), canonicality is already decided — we know the tx is canonical and we know its reason is Assumed. The question is purely: what is its ChainPosition — Confirmed or Unconfirmed?
An Assumed tx can still be Confirmed if it or its descendant has an anchor in the best chain. The current code checks direct_anchors.get(txid) (the tx's own anchor) but ignores the descendant field entirely. If the descendant has a resolved anchor in direct_anchors, the tx should be Confirmed { anchor: descendant_anchor, transitively: Some(descendant) }.
This has nothing to do with raising CanonicalReason::Anchor in the hierarchy or changing phase 1 processing. And "Unconfirmed" in ValuedMammal's comment refers to ChainPosition::Unconfirmed, not a CanonicalReason.
There was a problem hiding this comment.
@ValuedMammal thanks for pointing out that bug! I do see a logical flaw with your proposed fix though.
Given a scenario where we have txs (A, B and C), where B spends A and C spends B, C is assumed to be canonical, C is unconfirmed and B is anchored to the best chain.
With your logic when looking at A, the descendant field always points to the original assumed tx (C), not to the intermediate tx (B) that actually has the anchor. So checking direct_anchors.get(descendant) only works when the assumed tx itself has the anchor, not when an intermediate ancestor does.
There was a problem hiding this comment.
You'd have to initiate a walk_descendants routine to find the first descendant having a direct anchor, if one exists, otherwise accept that the ChainPosition can't be relied on for Assumed txs.
There was a problem hiding this comment.
@ValuedMammal Great, thanks for finding this one! I noticed that unfortunately we also have this issue in both master and release/chain-0.23.x where the scenario above would resolve to: txA->unconfirmed ; txB->confirmed ; txC->unconfirmed.
I'm working on a test to assert for these scenarios.
You'd have to initiate a
walk_descendantsroutine to find the first descendant having a direct anchor, if one exists, otherwise accept that the ChainPosition can't be relied on for Assumed txs.
I'll give a try to this one, but it's valid to note that we can only walk the descendants that are included in the canonical txs.
There was a problem hiding this comment.
I think this depends of the priority we give to the different
CanonicalReasons. As it is expressed right now,CanonicalReason::Asumedhas higher priority than the others.
I have been thinking in depth about this, and @nymius has a good comment here. At least for now we can focus on handling these scenarios in the position resolution.
However we probably need to discuss/review the primitives for the canonicalization algorithm in the future, in these scenarios where a given txA has been resolved to CanonicalReason::Assumed but could also resolved to CanonicalReason::Anchored (both transitively), should the assumed always have higher priority ?
| let canonical_txs = self.canonicalize(tx_graph.canonical_task(tip, params)); | ||
| self.canonicalize(canonical_txs.view_task(tx_graph)) |
There was a problem hiding this comment.
The canonical task and the view task only differ in the stages of canonicalization they handle and the finished output. You should reduce this to a single task architecture. The rationale in 6d9a7d6 is to separate concerns, but it's really the same concern, canonicalizing the TxGraph into a consistent view. That'll prevent having the algorithm spread across multiple modules.
There was a problem hiding this comment.
I agree with this, the canonicalization task doesn't have a meaning without the view task.
There was a problem hiding this comment.
I disagree that these are the same concern. They solve different problems and are diverging further over time.
Phase 1 (CanonicalTask) is conflict resolution — it walks the tx DAG, detects conflicts/self-double-spends, and determines which transactions are canonical and why (CanonicalReason). This is the complex part (~500 lines).
Phase 2 (CanonicalViewTask) is view construction — it takes the resolved canonical set and builds the final CanonicalView with concrete ChainPositions. This is a simpler transformation (~200 lines) that is growing its own responsibilities: #2139 adds MTP computation (fetching headers, computing median-time-past per confirmed height), and we plan to add topological sorting here too. None of that belongs in the conflict resolution phase.
The intermediate CanonicalTxs output is also intentionally useful on its own — callers who want to inspect canonicalization reasons without paying for full view construction can stop after phase 1.
A merged task would need 6 stages (AssumedTxs → AnchoredTxs → SeenTxs → LeftOverTxs → ResolvingPositions → Finished), a larger state machine mixing conflict resolution state with position resolution state, and a more complex finish(). The net result would be lines of interleaved concerns instead of two focused types.
- ignore `canonical_task.rs` test module in coverage. - reuse `canonical_view` in `test_tx_graph_conflicts.rs`
- add new `test_canonical_view_task.rs` to handle different scenarios of chain position resolution. - TODO: assumed canonical txs should be confirmed if it's direct descendant is anchored.
fixes #1816
Description
Replaces the iterator-based
CanonicalIterwith a two-phase sans-IO canonicalization pipeline, and introduces a genericChainQuerytrait inbdk_coreto decouple canonicalization from chain sources.Old API:
New API:
Phase 1:
CanonicalTaskDetermines which transactions are canonical by processing them in stages:
CanonicalParamsProduces a
CanonicalTxs<A>containing each canonical transaction with itsCanonicalReason.Phase 2:
CanonicalViewTaskResolves
CanonicalReasons into concreteChainPositions (confirmed height or unconfirmed with last-seen), producing the finalCanonicalView<A>.Both phases implement the
ChainQuerytrait, so any chain source can drive them via the samenext_query/resolve_queryloop.Key structural changes
ChainQuerytrait added tobdk_core— a generic sans-IO interface (next_query→resolve_query→finish) for any algorithm that needs to verify blocks against a chain source.ChainOracletrait removed — replaced byChainQuery.LocalChain::canonicalize()now drives anyChainQueryimplementor.Canonical<A, P>generic container —CanonicalTxs<A>(phase 1 output) andCanonicalView<A>(phase 2 output) are type aliases overCanonical<A, P>.canonical_view.rssplit intocanonical.rs(types:Canonical,CanonicalTx,CanonicalTxOut) andcanonical_view_task.rs(phase 2 task).canonical_iter.rsreplaced bycanonical_task.rs.Notes to the reviewers
The changes are split into multiple commits for easier review. Also depends on #2029.
Changelog notice
Checklists
All Submissions:
New Features: