fix: serve /peers from the same address cache as /providers #153

Open: lidel wants to merge 6 commits into main from fix/peers-use-cached-peerbook


@lidel lidel commented May 30, 2026

Problem

/routing/v1/peers/{peerid} queried peer routing on every request and returned an empty result whenever the DHT could not locate the peer, even when someguy already held that peer's addresses. A relay-dependent, NAT'd peer that announces content appears under /routing/v1/providers/{cid} with addresses but is not findable by DHT peer routing, so /peers answered {"Peers":[]} for a peer whose addresses someguy already knew. The two endpoints drew from different sources for the same peer.

Fixing this surfaced a second problem: cached addresses only accumulated. Provider records, DHT gossip, and successive identifies were each unioned into the address book, so a peer could collect 150 or more stale entries (dead relay circuits, outdated certhashes, rotated NAT ports) that lingered until each entry's TTL lapsed.

Fix

  • /peers is now cache-first, like /providers: it answers from the cached address book and the libp2p host peerstore (which the DHT fills with provider addresses during FindProviders) before falling back to a DHT lookup. someguy is a caching routing proxy, not a libp2p node, so it favors a low-latency answer from recently seen, actively probed peers.
  • A completed identify prunes a peer's cached set to its current advertised addresses (signed peer record, else identify listen addresses) plus any live outbound-connection address, instead of unioning forever. Inbound-connection remotes are skipped: they are ephemeral source ports that cannot be dialed back. An identify carrying no usable addresses leaves the existing set untouched.
  • peer_addr_lookups is recorded once per /peers request, not twice on a cache miss, so its hit-rate is meaningful again.
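
The cache-first read path in the first bullet can be sketched roughly as follows. This is a minimal illustration, not the PR's code: `cacheFirstRouter`, `dhtLookup`, `errNotFound`, and the use of plain strings for peer IDs and addresses are all hypothetical stand-ins for the real `cachedRouter` and libp2p types.

```go
package main

import (
	"errors"
	"fmt"
)

// errNotFound is a hypothetical sentinel; the real code may signal a miss differently.
var errNotFound = errors.New("peer not found")

// cacheFirstRouter is a simplified stand-in for cachedRouter.
// Peer IDs and addresses are plain strings here to keep the sketch stdlib-only.
type cacheFirstRouter struct {
	addrBook      map[string][]string // someguy's own cached address book
	hostPeerstore map[string][]string // libp2p host peerstore, filled by the DHT
	dhtLookup     func(peerID string) ([]string, error)
}

// cachedAddrs checks the address book first, then falls back to the host peerstore.
func (r *cacheFirstRouter) cachedAddrs(peerID string) []string {
	if addrs := r.addrBook[peerID]; len(addrs) > 0 {
		return addrs
	}
	return r.hostPeerstore[peerID]
}

// FindPeer answers from the caches when possible and only then asks the DHT.
func (r *cacheFirstRouter) FindPeer(peerID string) ([]string, error) {
	if addrs := r.cachedAddrs(peerID); len(addrs) > 0 {
		return addrs, nil
	}
	addrs, err := r.dhtLookup(peerID)
	if err != nil {
		return nil, err
	}
	if len(addrs) == 0 {
		return nil, errNotFound
	}
	return addrs, nil
}

func main() {
	dhtCalls := 0
	r := &cacheFirstRouter{
		addrBook:      map[string][]string{},
		hostPeerstore: map[string][]string{"peerA": {"/ip4/203.0.113.7/tcp/4001"}},
		dhtLookup: func(string) ([]string, error) {
			dhtCalls++
			return nil, nil // simulate a peer the DHT cannot locate
		},
	}

	addrs, err := r.FindPeer("peerA") // served from the host peerstore
	fmt.Println(addrs, err, dhtCalls)

	_, err = r.FindPeer("peerB") // never seen: falls through to the DHT
	fmt.Println(err, dhtCalls)
}
```

In this sketch a peer known only to the host peerstore is answered without touching the DHT, while a peer that no store has seen still ends in not-found, matching the cold-instance caveat in the description.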

docs/peer-address-caching.md documents both stores, the refresh loop, and the per-endpoint read paths. The improvement is cache-driven: it helps when the peer was recently seen as a provider, but a cold instance that has never seen the peer still returns not-found.

lidel added 2 commits May 29, 2026 23:09
someguy is a caching routing proxy, so /routing/v1/peers/{peerid} now
answers from the cached address book and host peerstore before falling
back to a DHT lookup, matching /providers. A relay-dependent peer that
is absent from peer routing but recently seen as a provider is no longer
answered with an empty result.

- cachedRouter.FindPeers: check the peerbook first, fall back to peer
  routing only on a miss, and enrich address-less hits from cache
- cachedAddrBook.GetCachedAddrs: fall back to the host peerstore, which
  the DHT fills with provider addresses during FindProviders
- cachedAddrBook.CacheAddrs: persist addresses observed in provider
  records so later peer lookups can serve them

Add docs/peer-address-caching.md explaining the two address stores, how
the cache is filled and refreshed by the probe loop, and the cache-first
read path on /providers and /peers, with mermaid diagrams. Note that
caching requires a DHT-backed instance and is bypassed with --dht=disabled.

- README: add a Documentation section linking all docs, and replace
  cid.contact references with the Delegated Routing V1 spec
- AGENTS.md: add repo orientation, build/test commands, code map, and
  doc index
@lidel lidel requested a review from a team May 30, 2026 00:09
lidel added 4 commits May 30, 2026 02:19
Cached peer addresses previously only accumulated: provider records, DHT
gossip, and successive identifies each unioned into the address book, so a
peer could collect outdated certhashes, dead relay circuits, and rotated
NAT ports that lingered until each entry's TTL lapsed.

On a completed identify, replace the peer's stored set with its current
advertised addresses (signed peer record, else identify listen addresses)
unioned with any live-connection address, so a reachable peer collapses
back to its current set each refresh. An identify carrying no usable
addresses returns early and leaves the existing set untouched.
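
A rough sketch of the replace-instead-of-union behavior described above, under stated assumptions: `identifyResult` and `pruneOnIdentify` are hypothetical names, addresses are plain strings, and "no usable addresses" is interpreted here as "no advertised addresses", which may be narrower or wider than what the PR checks.

```go
package main

import "fmt"

// identifyResult is a hypothetical summary of a completed identify.
type identifyResult struct {
	SignedRecordAddrs []string // addresses from the signed peer record, if any
	ListenAddrs       []string // identify listen addresses, used as a fallback
	OutboundConnAddrs []string // remotes of live outbound connections
}

// pruneOnIdentify replaces the peer's cached set with its current advertised
// addresses plus live outbound-connection addresses, instead of unioning.
func pruneOnIdentify(book map[string][]string, peerID string, res identifyResult) {
	advertised := res.SignedRecordAddrs
	if len(advertised) == 0 {
		advertised = res.ListenAddrs
	}
	if len(advertised) == 0 {
		// Assumption: nothing usable was advertised, so keep the existing set.
		return
	}

	total := len(advertised) + len(res.OutboundConnAddrs)
	seen := make(map[string]struct{}, total)
	next := make([]string, 0, total)
	for _, group := range [][]string{advertised, res.OutboundConnAddrs} {
		for _, a := range group {
			if _, dup := seen[a]; dup {
				continue // drop duplicates across the two sources
			}
			seen[a] = struct{}{}
			next = append(next, a)
		}
	}
	book[peerID] = next // overwrite: stale entries are dropped here
}

func main() {
	book := map[string][]string{
		"peerA": {"/ip4/192.0.2.1/tcp/4001", "/ip4/192.0.2.1/tcp/50123"},
	}
	pruneOnIdentify(book, "peerA", identifyResult{
		ListenAddrs:       []string{"/ip4/203.0.113.7/tcp/4001"},
		OutboundConnAddrs: []string{"/ip4/203.0.113.7/tcp/4001", "/ip4/203.0.113.7/udp/4001/quic-v1"},
	})
	fmt.Println(book["peerA"])
}
```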
Cache-first FindPeers recorded the metric twice on a cache miss: once when
the cache-first lookup missed, then again in the post-DHT enrich step when
the record already carried addresses. That made sum(origin=peers) exceed
the request count and the hit-rate uncomputable.

Read the cache directly in the enrich step so it no longer records a
lookup; FindPeers now emits exactly one peer_addr_lookups increment per
request (hit when served cache-first, miss when it falls back to the DHT).
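
One way to picture the fix, as a sketch only: the cache exposes a recording read for the cache-first step and a non-recording read for the enrich step. The names `addrCache`, `lookup`, `peek`, and `findPeer` are hypothetical, and the two integer counters stand in for the `peer_addr_lookups` metric.

```go
package main

import "fmt"

// addrCache is a simplified cache with counters standing in for peer_addr_lookups.
type addrCache struct {
	addrs        map[string][]string
	hits, misses int
}

// lookup is the recording read: it counts exactly one hit or one miss.
func (c *addrCache) lookup(peerID string) ([]string, bool) {
	if addrs := c.addrs[peerID]; len(addrs) > 0 {
		c.hits++
		return addrs, true
	}
	c.misses++
	return nil, false
}

// peek reads the cache directly and records nothing.
func (c *addrCache) peek(peerID string) []string {
	return c.addrs[peerID]
}

// findPeer records once in the cache-first step; the enrich step uses peek,
// so a cache miss followed by a DHT lookup is not counted a second time.
func findPeer(c *addrCache, peerID string, dht func(string) []string) []string {
	if addrs, ok := c.lookup(peerID); ok {
		return addrs
	}
	found := dht(peerID)
	if len(found) == 0 {
		found = c.peek(peerID) // enrich an address-less result without recording
	}
	return found
}

func main() {
	c := &addrCache{addrs: map[string][]string{"peerA": {"/ip4/203.0.113.7/tcp/4001"}}}
	dht := func(string) []string { return []string{"/ip4/198.51.100.2/tcp/4001"} }

	findPeer(c, "peerA", dht) // cache hit
	findPeer(c, "peerB", dht) // cache miss, answered by the DHT stub
	fmt.Println(c.hits, c.misses)
}
```

With this split, hits plus misses equals the number of requests, which is the property the hit-rate calculation needs.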

Also point the CHANGELOG doc link at an absolute URL so it resolves when
copied into GitHub release notes.

On identify, the live-connection addresses unioned into the cache came
from every connection's RemoteMultiaddr, including inbound ones. An
inbound connection's remote address is the peer's ephemeral source port,
which is not dialable, so caching it reintroduced the junk the
identify-time prune is meant to remove. Keep only outbound (and
direction-unknown) connection remotes.
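
The direction filter can be sketched as below. The `direction` constants and `connInfo` type are hypothetical stand-ins that loosely mirror how go-libp2p reports connection direction; they are not the PR's types.

```go
package main

import "fmt"

// direction is a local stand-in for a connection's direction.
type direction int

const (
	dirUnknown direction = iota
	dirInbound
	dirOutbound
)

// connInfo is a hypothetical, minimal view of a live connection.
type connInfo struct {
	Dir        direction
	RemoteAddr string
}

// dialableRemotes keeps outbound and direction-unknown remotes and skips
// inbound ones, whose remote address is the peer's ephemeral source port.
func dialableRemotes(conns []connInfo) []string {
	var out []string
	for _, c := range conns {
		if c.Dir == dirInbound {
			continue
		}
		out = append(out, c.RemoteAddr)
	}
	return out
}

func main() {
	conns := []connInfo{
		{Dir: dirInbound, RemoteAddr: "/ip4/203.0.113.7/tcp/50123"},
		{Dir: dirOutbound, RemoteAddr: "/ip4/203.0.113.7/tcp/4001"},
		{Dir: dirUnknown, RemoteAddr: "/ip4/203.0.113.7/udp/4001/quic-v1"},
	}
	fmt.Println(dialableRemotes(conns))
}
```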

Rename the host-peerstore fallback field from peerstore to hostPeerstore
so it no longer shadows the imported peerstore package and reads as the
counterpart to addrBook (someguy's own book vs the libp2p host's).

Addresses came back in nondeterministic order because the peerstore holds
them in a map, so repeated requests for the same peer or provider returned
the same addresses shuffled differently.

Sort by bytes in filterPrivateMultiaddr, which sanitizeRouter already runs
on every record from every router, so providers, peers, and closest-peers
responses are stable regardless of whether the addresses came from the
cache, the DHT, or a delegated HTTP backend.
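
A minimal sketch of byte-wise sorting for stable output, assuming each address is available as its byte encoding. `sortAddrsByBytes` is a hypothetical helper and does not reproduce `filterPrivateMultiaddr`; the textual addresses below are placeholders for real binary multiaddr encodings.

```go
package main

import (
	"bytes"
	"fmt"
	"sort"
)

// sortAddrsByBytes orders addresses by their raw bytes so that the same set
// always serializes in the same order, whatever order the source returned.
func sortAddrsByBytes(addrs [][]byte) {
	sort.Slice(addrs, func(i, j int) bool {
		return bytes.Compare(addrs[i], addrs[j]) < 0
	})
}

func main() {
	// Placeholder byte strings; real multiaddrs are binary-encoded.
	addrs := [][]byte{
		[]byte("/ip4/203.0.113.7/udp/4001/quic-v1"),
		[]byte("/dns4/example.com/tcp/443"),
		[]byte("/ip4/203.0.113.7/tcp/4001"),
	}
	sortAddrsByBytes(addrs)
	for _, a := range addrs {
		fmt.Println(string(a))
	}
}
```

Because the order depends only on the bytes, two responses carrying the same address set compare equal regardless of map iteration order upstream.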
@lidel lidel force-pushed the fix/peers-use-cached-peerbook branch from 06fb299 to fa15bdf May 30, 2026 00:19