Skip to content

chore(P2P): Only increment error metric if transport infos of all registry versions are empty#10408

Open
eichhorl wants to merge 1 commit into
masterfrom
eichhorl/empty_subnet_nodes
Open

chore(P2P): Only increment error metric if transport infos of all registry versions are empty#10408
eichhorl wants to merge 1 commit into
masterfrom
eichhorl/empty_subnet_nodes

Conversation

@eichhorl

@eichhorl eichhorl commented Jun 8, 2026

Copy link
Copy Markdown
Contributor

The peer manager maintains connections to all peers that are in the subnet record in any registry version between the "oldest version in use" and the latest locally available registry version.

The "oldest registry version in use" is determined by the CUP. In case of a subnet creation, this CUP is a registry CUP that contains the registry version just before the subnet was created (see here).

This means that when the peer manager attempts to make connections to peers according to the "oldest registry version in use", the subnet doesn't actually exist yet, which then triggers the empty_list_of_node_records error. This is fine however, because it will find the correct peers in the subsequent registry version (which is guaranteed to exist, otherwise we would not have the registry CUP).

To improve the error signal of this metric, we change it such that it is only incremented if all registry versions do not contain transport infos, meaning there are no peers shared with the new subnet topology.

@github-actions github-actions Bot added the chore label Jun 8, 2026
@eichhorl eichhorl changed the title chore(P2P): Only increment error if transport infos of all registry versions are empty chore(P2P): Only increment error metric if transport infos of all registry versions are empty Jun 9, 2026
@eichhorl eichhorl marked this pull request as ready for review June 9, 2026 13:50
@eichhorl eichhorl requested a review from a team as a code owner June 9, 2026 13:50
@pierugo-dfinity

pierugo-dfinity commented Jun 9, 2026

Copy link
Copy Markdown
Contributor

This works well. Alternatively, I had thought of making an exception in CatchUpPackage::get_oldest_registry_version_in_use which would return the block's validation context's registry version instead of the summary's registry version in case cup.height == 0, because it is the registry version at which the CUP was added to the registry (which is the same that added the subnet record).

I agree that this would introduce some complexity which could be avoided with your simpler change. But implementing it this way also makes more sense functionality-wise since the consumers of get_oldest_registry_version_in_use are all interested in membership (P2P, orchestrator for detecting unassignment, firewall), so they all expect to see a SubnetRecord at that registry version (in practice not the firewall, but it still does conceptually).

WDYT?

@alin-at-dfinity alin-at-dfinity left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Damn, only noticed Pierugo's comment after I hit Approve. My apologies, I thought I was just being helpful.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants