From 27c1ea0f4d870f6c6ce383df9e40dfb9b6607e0d Mon Sep 17 00:00:00 2001
From: Maksym Arutyunyan
Date: Mon, 23 Mar 2026 14:22:54 +0100
Subject: [PATCH 1/6] docs: add stable-structures deep-dive doc

---
 docs/src/SUMMARY.md                |   1 +
 docs/src/introduction/deep-dive.md | 391 +++++++++++++++++++++++++++++
 2 files changed, 392 insertions(+)
 create mode 100644 docs/src/introduction/deep-dive.md

diff --git a/docs/src/SUMMARY.md b/docs/src/SUMMARY.md
index ecc463c0..0d355bdf 100644
--- a/docs/src/SUMMARY.md
+++ b/docs/src/SUMMARY.md
@@ -3,6 +3,7 @@
 - [Introduction](./introduction/introduction.md)
   - [Design Principles](./introduction/design-principles.md)
   - [Available Data Structures](./introduction/available-data-structures.md)
+  - [Stable Structures Deep Dive](./introduction/deep-dive.md)
 - [Concepts](./concepts/concepts.md)
   - [The Memory Trait](./concepts/memory-trait.md)

diff --git a/docs/src/introduction/deep-dive.md b/docs/src/introduction/deep-dive.md
new file mode 100644
index 00000000..0b17c3a4
--- /dev/null
+++ b/docs/src/introduction/deep-dive.md
@@ -0,0 +1,391 @@
+# Stable Structures Deep Dive

## Segment 1 — Why this library exists (10 min)

The Internet Computer (IC) runs canister smart contracts. When a canister is upgraded, its heap is wiped. The conventional fix is to serialize all state to stable memory in a `pre_upgrade` hook and deserialize it in `post_upgrade`. This works for small state but does not scale: the serialization itself costs cycles, and a bug in either hook can make the canister permanently non-upgradable.

`stable-structures` eliminates both problems by keeping data structures permanently resident in stable memory. There is nothing to serialize on upgrade and no upgrade hooks to write.
+ +[Design principles](./design-principles.md) baked into every structure: + +- **Radical simplicity** — the simplest design that solves the problem +- **Backward compatibility** — every header starts with a magic string and a layout version, so new library versions can always read old data +- **No `pre_upgrade` hooks** — structures must not require migration on upgrade +- **Limited blast radius** — a bug in one structure cannot corrupt another +- **No reallocation** — moving data in bulk is too expensive in cycles; all growth happens in place +- **Multi-memory compatibility** — the design works with multiple stable memories, ensuring forward compatibility with upcoming IC features + +The six structures the library ships: + +| Structure | Description | Type support | Memories needed | +|------------|-----------------------------------|---------------------|------------------| +| `Cell` | Single serializable value | Bounded + Unbounded | 1 | +| `BTreeMap` | Ordered key-value store | Bounded + Unbounded | 1 | +| `BTreeSet` | Ordered set of unique keys | Bounded + Unbounded | 1 | +| `Vec` | Growable array | Bounded only | 1 | +| `Log` | Append-only variable-size entries | Bounded + Unbounded | 2 (index + data) | +| `MinHeap` | Priority queue | Bounded only | 1 | + +## Segment 2 — Core abstractions (15 min) + +### 2a. The Memory trait (`src/lib.rs:52-93`) + +Everything in the library is generic over a single four-method trait: + +```rust +/// Abstraction over a WebAssembly-style linear memory. +pub trait Memory { + /// Returns the current size in pages (1 page = 64 KiB). + fn size(&self) -> u64; + + /// Grows by pages, returns the previous size, or -1 on failure. + fn grow(&self, pages: u64) -> i64; + + /// Reads bytes. + fn read(&self, offset: u64, dst: &mut [u8]); + + /// Writes bytes. 
fn write(&self, offset: u64, src: &[u8]);
}
```

Concrete implementations:

- `Ic0StableMemory` — wraps the IC system API, only compiled for `wasm32`
- `VectorMemory` — a heap-allocated `Vec<u8>`, used in tests and local development
- `FileMemory` — file-backed memory using standard file I/O, useful for offline development and persistence
- `DefaultMemoryImpl` — resolves to `Ic0StableMemory` on `wasm32`, `VectorMemory` otherwise

`RestrictedMemory` (`src/lib.rs:243-308`) is a public `Memory` adapter that exposes a fixed page range of a larger memory as its own address space starting at 0. It is the simpler alternative to `MemoryManager` for cases where each structure's maximum size is known upfront — covered in more detail in section 2c.

### 2b. The Storable trait (`src/storable.rs:13-72`)

Stable structures are generic and work only with raw bytes — they have no knowledge of the types stored in them. `Storable` is the bridge: it tells a structure how to convert a value to and from bytes. Any type you want to store must implement it. The library already provides implementations for the most common types; custom types require a manual implementation:

```rust
pub trait Storable {
    fn to_bytes(&self) -> Cow<'_, [u8]>;
    fn into_bytes(self) -> Vec<u8>;
    fn from_bytes(bytes: Cow<[u8]>) -> Self;
    const BOUND: Bound;
}
```

The two serialization methods serve different call sites:

- **`to_bytes`** is for reads — it borrows `self` and can return `Cow::Borrowed`, a zero-copy slice, which is ideal for lookups and iteration.
- **`into_bytes`** is for writes — `insert` must own the value's bytes as they travel through the tree and get stored in a node. For types like `Vec<u8>` or `String` whose serialized form *is* their internal buffer, `to_bytes` would return `Cow::Borrowed`, and calling `.into_owned()` on that always clones. `into_bytes(self)` moves the buffer directly instead — no allocation.
For types with no owned buffer to move (primitives, fixed-size structs), the correct fallback is simply `self.to_bytes().into_owned()`.

The extra required method adds one line of boilerplate to each `Storable` impl, but eliminates a guaranteed heap allocation on every `insert` for the most common value types.

The `BOUND` constant is the key design decision a user must make:

- **`Bound::Unbounded`** — no size constraints; the structure stores a length prefix before each value. Safest default for types containing `String`s or `Vec`s.
- **`Bound::Bounded { max_size: u32, is_fixed_size: bool }`** — `max_size` is enforced at runtime via `to_bytes_checked()`. Setting `is_fixed_size: true` eliminates the length prefix, saving bytes per entry. **You cannot increase `max_size` after deployment without corrupting data.**

The library ships `Storable` implementations for all primitives (`u8` through `u128`, `f32`/`f64`, `bool`, `[u8; N]`), `String`, `Vec<u8>`, `Principal`, `Option<T>`, and tuples.

Note: `Storable` says nothing about the serialization format. Users commonly use CBOR (`ciborium`), protobuf, or Candid inside `to_bytes`/`from_bytes`. See `docs/src/schema-upgrades.md` for patterns for adding fields safely.

### 2c. The MemoryManager (`src/memory_manager.rs`)

Each stable structure requires exclusive ownership of its memory — sharing causes corruption. The naive alternative, carving stable memory into static regions via `RestrictedMemory`, has two problems: you must know the size limit upfront, and the full region is paid for even when mostly empty.

`MemoryManager` eliminates both problems. It presents each structure with a `VirtualMemory` that has no upfront size limit and grows on demand. Underneath, it divides the real stable memory into 128-page buckets allocated as needed and interleaved freely across virtual memories — so total stable memory usage stays proportional to actual data, not declared limits.
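The bucket indirection this implies can be sketched in a few lines. This is a simplified model, not the library's code (the real manager also accounts for its own header region, and all names here are illustrative):

```rust
// Simplified model of VirtualMemory address translation:
// a contiguous virtual offset is mapped onto scattered physical buckets.
// Bucket size is 128 Wasm pages of 64 KiB each, as described above.

const PAGE_SIZE: u64 = 64 * 1024;
const BUCKET_SIZE_IN_PAGES: u64 = 128;
const BUCKET_SIZE_IN_BYTES: u64 = BUCKET_SIZE_IN_PAGES * PAGE_SIZE;

/// Translate a byte offset within one virtual memory to a byte offset in
/// the underlying physical memory, given that virtual memory's ordered
/// list of physical bucket indices.
fn virtual_to_physical(virtual_offset: u64, buckets: &[u64]) -> u64 {
    let bucket_idx = (virtual_offset / BUCKET_SIZE_IN_BYTES) as usize;
    let within_bucket = virtual_offset % BUCKET_SIZE_IN_BYTES;
    buckets[bucket_idx] * BUCKET_SIZE_IN_BYTES + within_bucket
}

fn main() {
    // Virtual memory 0 owns physical buckets 0 and 2; bucket 1 belongs to
    // some other virtual memory — the interleaving is invisible to the user.
    let vm0 = [0u64, 2];
    // An offset inside vm0's second bucket lands in physical bucket 2.
    let physical = virtual_to_physical(BUCKET_SIZE_IN_BYTES + 10, &vm0);
    assert_eq!(physical, 2 * BUCKET_SIZE_IN_BYTES + 10);
}
```

Growth is then just appending another physical bucket index to the list, which is why no upfront size limit is needed.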
+ +``` +1) NAIVE (RestrictedMemory) — limits declared upfront, full region allocated immediately + + Stable memory + ┌──────────────────────────────┬──────────────────────────────┐ + │ RestrictedMemory 0 (fixed) │ RestrictedMemory 1 (fixed) │ + │ ▓▓▓░░░░░░░░░░░░░░░░░░░░░░░ │ ▓▓░░░░░░░░░░░░░░░░░░░░░░░░ │ + │ ~15% used, rest wasted │ ~10% used, rest wasted │ + └──────────────────────────────┴──────────────────────────────┘ + +2) WITH MemoryManager — no limits, buckets allocated on demand + + VirtualMemory 0 [▓▓▓▓] ▓ = VM0 bucket + VirtualMemory 1 [▒▒] ▒ = VM1 bucket + ▼ + Stable memory + ▓▓▒▒▓▓▓▓▒▒▓▓······················· + └──────────────┘└─────────────────── + interleaved unallocated +``` + +The first page of the real memory holds the manager's own header: + +``` +magic "MGR" + version ↕ 4 bytes +number of allocated buckets ↕ 2 bytes +bucket size in pages ↕ 2 bytes +per-memory page counts ↕ 255 × 8 bytes +bucket ownership table ↕ 1 byte per bucket (value = MemoryId) +``` + +Each `VirtualMemory` is identified by a `MemoryId` (a `u8`, up to 255 supported) — a stable, persistent handle that always refers to the same virtual memory across upgrades. Call `memory_manager.get(MemoryId::new(n))` to obtain one and pass it to a stable structure. + +Each `VirtualMemory` presents a contiguous address space even though its physical buckets can be scattered. A read/write call translates logical page offsets through the bucket table to absolute pages in the underlying memory. + +Usage pattern every canister should follow: + +```rust +thread_local! { + static MEMORY_MANAGER: RefCell> = RefCell::new( + MemoryManager::init(DefaultMemoryImpl::default()) + ); + + static MAP: RefCell> = RefCell::new( + StableBTreeMap::init( + MEMORY_MANAGER.with(|m| m.borrow().get(MemoryId::new(0))) + ) + ); +} +``` + +See `examples/src/basic_example/src/lib.rs` for the minimal working canister. + +### 2d. 
Internal allocators (contributor-level) + +Memory allocation is invisible to users of stable-structures — structures interact only through the `Memory` trait and have no way to express "free this region." It is only relevant when working on the internals of a specific structure, where the choice of allocation strategy directly shapes the implementation. + +The key question that drives each structure's strategy: can holes appear in its memory? + +#### No allocator: direct memory access + +**`Cell`** stores a single value and accepts both bounded and unbounded types. There is only ever one value — when it changes, the old bytes are overwritten in-place. No slots, no holes, no allocator needed; the structure reads and writes directly to its memory. + +**`Log`** also accepts both bounded and unbounded types. It is strictly append-only: entries are written sequentially and nothing can be modified or removed. Holes are structurally impossible, so no allocator is needed. (Log uses two memories — one for the index of byte offsets, one for the data — which is why it requires two `MemoryId`s from the `MemoryManager`.) + +#### No allocator: fixed-size slots + +**`Vec`** and **`MinHeap`** are currently bounded-only. With a fixed `max_size`, every element occupies an equal-size slot at a predictable offset (`DATA_OFFSET + i * SLOT_SIZE`), and all mutations happen at the tail — pushes append, pops shrink, overwrites replace in-place. No holes, no allocator needed. Supporting unbounded types would require variable-size entries, which breaks fixed offsets: you could no longer find element `i` without scanning all prior entries. Tracking positions would require an index similar to `Log`, and reclaiming space after a removal would require a custom allocator — so unbounded `Vec` is a significantly more complex structure. 
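The fixed-slot arithmetic can be sketched as follows. This is an illustrative model, not the library's code: `DATA_OFFSET` and `SLOT_SIZE` are hypothetical values standing in for the real header size and the element type's `max_size`.

```rust
// Sketch of fixed-slot addressing: with a bounded element type every slot
// has the same size, so element i lives at a predictable offset and no
// allocator is needed. Both constants below are hypothetical.

const DATA_OFFSET: u64 = 64; // stand-in for the structure's header size
const SLOT_SIZE: u64 = 16;   // stand-in for the bounded type's max_size

/// Byte offset of element `i` — pure arithmetic, no scan, no index.
fn slot_offset(i: u64) -> u64 {
    DATA_OFFSET + i * SLOT_SIZE
}

fn main() {
    // Slots are contiguous and independent of element contents, which is
    // exactly what variable-size (unbounded) entries would break.
    assert_eq!(slot_offset(0), 64);
    assert_eq!(slot_offset(3), 64 + 3 * 16);
}
```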
+ +#### Custom allocator: free-list + +**`BTreeMap`** requires a custom allocator regardless of whether bounded or unbounded types are used (V1, which is bounded-only, has it too). The reason is that B-tree rebalancing — splits, merges, and deletes — frees nodes at **arbitrary positions** throughout the memory. These holes must be tracked and reused. `BTreeMap` handles this with an internal free-list chunk allocator at `src/btreemap/allocator.rs`, located at a fixed offset inside the BTreeMap's memory right after its header. + +The allocator divides remaining memory into equal-size chunks. Free chunks form a singly-linked list: allocation pops the head, deallocation pushes back onto it — both O(1). When the free list is empty, the memory grows and a new chunk is appended. + + +## Segment 3 — Lifecycle, schema upgrades, and migrations (10 min) + +### 3a. Lifecycle across upgrades + +Every stable structure has two constructors: + +- `new(memory)` — writes a fresh magic header and initialises an empty structure +- `init(memory)` — checks for the magic header; loads the existing structure if found, creates a new one otherwise + +Always use `init` in a canister. Because stable memory survives upgrades, calling `init` on next deployment finds the existing data and resumes from it — no `pre_upgrade`/`post_upgrade` needed. The magic header also carries a layout version, so new library versions can always read data written by older ones. + +### 3b. Layout versioning in practice: BTreeMap V1 → V2 + +`BTreeMap` is currently the only structure that has shipped two layout versions, and its migration is a concrete example of the backward-compatibility principle above: + +- **V1** supports only `Bound::Bounded` types. Page size is derived from `max_key_size` and `max_value_size` stored in the header. +- **V2** adds support for `Bound::Unbounded` types via explicit page sizes and overflow pages — nodes can chain multiple pages when a value exceeds the page size. 
Migration from V1 to V2 is **transparent and non-breaking**: calling `BTreeMap::init()` on an existing V1 map automatically upgrades it to V2 on first load — existing data is preserved, no user action required. Under the hood, `init()` calls `load_helper(memory, migrate_to_v2: true)`, which re-interprets the stored `max_key_size`/`max_value_size` as a `DerivedPageSize` for the V2 allocator. Loading a V2 map as V1 is rejected at startup.

### 3c. Schema upgrades

Stable structures don't enforce a serialization format. The recommended pattern for evolving types is to use a flexible format (e.g. CBOR via `ciborium`) and `Bound::Unbounded`:

```rust
impl Storable for Asset {
    fn to_bytes(&self) -> Cow<'_, [u8]> { /* CBOR encode */ }
    fn into_bytes(self) -> Vec<u8> { /* CBOR encode */ }
    fn from_bytes(bytes: Cow<[u8]>) -> Self { /* CBOR decode */ }
    const BOUND: Bound = Bound::Unbounded;
}
```

Adding a field is then safe with `#[serde(default)]` — old records decode without error and the new field gets its default. For fields with no sensible default, use `Option`. See `docs/src/schema-upgrades.md` for worked examples.

**Warning:** if you used `Bound::Bounded`, never increase `max_size` after deployment — existing node pages were sized to the old value and enlarging it corrupts them. Migrating from `Bounded` to `Unbounded` is safe; the reverse is not.

### 3d. Data migrations

When an in-place field addition isn't enough — e.g. changing the key type or restructuring the value layout entirely — data must be migrated from one structure to another.

For anything beyond a trivial dataset, migration cannot happen in a single upgrade call. The IC enforces a per-round instruction limit, so even a moderately large structure will trap if you try to read-transform-write it all at once.

The practical approach is to run the migration incrementally across many canister update calls:

1.
Create the new structure under a fresh `MemoryId` alongside the old one. +2. Each update call migrates a small batch of records from old to new, tracking progress in a version field or migration cursor. +3. During migration, both structures are live. A routing layer directs reads and writes to whichever structure owns each record — unmigrated records go to the old structure, already-migrated ones to the new. +4. Once all records are migrated, the routing layer is dropped and the old structure is cleared. + +This is the pattern NNS-dapp used when migrating accounts to stable memory: two schemas active simultaneously, new writes applied to both during the transition, migration driven by a periodic job chunk by chunk. + +The cost to be aware of: both structures occupy stable memory simultaneously (~2× peak usage), and after the old structure is cleared its buckets remain permanently assigned to its `MemoryId` — they cannot be reclaimed or reused. Budget for this when planning large schema changes. + +### 3e. MemoryManager limitations and the path to bucket reclamation + +The inability to reuse freed buckets is a known limitation rooted in a structural invariant of the current MemoryManager: **buckets within each virtual memory must be stored in ascending order by ID**. Because the header encodes only the owning `MemoryId` per bucket (1 byte), there is no room to store an explicit ordering — the order is implied by bucket position. To load a virtual memory correctly, the runtime simply scans for all buckets belonging to that `MemoryId` and traverses them in ID order. + +This makes safe reuse impossible in the typical migration layout. When structure A occupies buckets 0–99 and is cleared, structure B (buckets 100–199) cannot absorb A's freed buckets — they have lower IDs than B's current maximum, so inserting them would violate the ascending invariant and corrupt B's data on the next load. 
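The ID-order scan described above can be sketched with a simplified model of the ownership table (illustrative code, not the library's):

```rust
// Simplified model of the MemoryManager's bucket ownership table:
// one byte per bucket, value = owning MemoryId. Order within a virtual
// memory is implied purely by bucket position, ascending by ID.

/// Return the ordered bucket IDs belonging to one virtual memory.
fn buckets_of(ownership: &[u8], memory_id: u8) -> Vec<u64> {
    ownership
        .iter()
        .enumerate()
        .filter(|(_, owner)| **owner == memory_id)
        .map(|(id, _)| id as u64)
        .collect()
}

fn main() {
    // Buckets 0, 2, 3 belong to memory 0; buckets 1 and 4 to memory 1.
    let table = [0u8, 1, 0, 0, 1];
    assert_eq!(buckets_of(&table, 0), vec![0, 2, 3]);
    // If memory 0 were freed, its low-ID buckets could not be handed to
    // memory 1: appending bucket 0 after memory 1's bucket 4 would violate
    // the ascending order this scan relies on.
    assert_eq!(buckets_of(&table, 1), vec![1, 4]);
}
```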
+ +#### Attempted fix: conservative bucket reuse + +A partial fix, **conservative bucket reuse**, was implemented and then [reverted](https://github.com/dfinity/stable-structures/pull/396). It allowed reuse only of freed buckets with IDs *higher* than the growing virtual memory's current maximum — a constraint that is almost never satisfied in practice, since A allocates first and therefore always has lower IDs than B. + +#### Alternative design: explicit linked list of buckets + +The proper solution requires a new header layout. The alternative design replaces implicit ID ordering with an **explicit linked list**: each bucket stores a 4-byte pointer to the next bucket in its virtual memory's chain. Freeing a virtual memory then simply nulls out its head pointer, making all its buckets immediately available for any new allocation regardless of their IDs. + +The redesign also removes the current 32,768-bucket cap (`MAX_NUM_BUCKETS`), raising it to 2^32 and lifting the effective stable memory ceiling from 256 GiB to well beyond current IC capacity. + +The tradeoff: this is a **breaking change**. The on-disk header format is incompatible with the current layout, so existing canisters will need a one-time migration when the new MemoryManager ships. + +## Segment 4 — Code walkthrough: BTreeMap::insert (25 min) + +`StableBTreeMap` is the most complex and most commonly used structure. Walking an insert from the public API down to raw byte writes shows every major mechanism. + +### 4a. 
File layout + +| File | Purpose | +|---|---| +| `src/btreemap.rs` | Public API, header format, `init`/`load`, `insert`/`get`/`remove` | +| `src/btreemap/allocator.rs` | Free-list chunk allocator for node pages | +| `src/btreemap/node.rs` | In-memory `Node` representation, load/save dispatch | +| `src/btreemap/node/v1.rs` | Node serialization for bounded types (legacy) | +| `src/btreemap/node/v2.rs` | Node serialization for bounded and unbounded types | +| `src/btreemap/iter.rs` | Range iteration | + +### 4b. Memory layout on disk (`src/btreemap.rs:1-50`) + +Address 0 in the memory given to `BTreeMap`: + +``` +"BTR" magic ↕ 3 bytes +layout version ↕ 1 byte (1 = V1, 2 = V2) +max_key_size or page_size ↕ 4 bytes +max_value_size or marker ↕ 4 bytes +root node address ↕ 8 bytes +length / element count ↕ 8 bytes +reserved ↕ 24 bytes +Allocator header ↕ starts at offset 52 (ALLOCATOR_OFFSET) +Node pages ... +``` + +### 4c. The Allocator (`src/btreemap/allocator.rs`) + +The allocator is a free-list of same-size chunks. The chunk size is determined at `BTreeMap::new()` time: + +- If both `K` and `V` are `Bounded`: `page_size = max_node_size * 3/4` (covers ~70% of real-world nodes at 8/11 capacity) +- Otherwise: `page_size = 1024` bytes (`DEFAULT_PAGE_SIZE`) + +Each chunk on the free list contains a `ChunkHeader` (magic `"CHK"`, next-free pointer) followed by the usable data area. Allocation pops from the head; deallocation pushes back. + +### 4d. BTreeMap::init (`src/btreemap.rs:274-290`) + +``` +if memory.size() == 0 → BTreeMap::new (writes a fresh header) +else → check for "BTR" magic and call BTreeMap::load +``` + +`BTreeMap::load` reads the header, detects the layout version (V1 or V2), and can transparently migrate V1 to V2 on first load. + +### 4e. BTreeMap::insert — the critical path + +Trace in `src/btreemap.rs` around line 500: + +1. Call `insert(key, value)` on `BTreeMap` +2. 
If `root_addr == NULL`: allocate a new leaf node via `allocator.allocate()`, save an empty `Node`, set `root_addr` +3. Otherwise: load the root `Node` from memory (`Node::load` reads the node header to detect V1/V2, then deserializes keys, values, and children addresses) +4. Walk down the tree: at each internal node, binary-search entries to find the child pointer, load that child +5. At the leaf: insert the `(key, value)` entry, keeping entries sorted by key +6. If the leaf now has `CAPACITY` (11) entries: split — allocate a new node, distribute entries, push the median key up to the parent +7. Splits propagate up; if the root splits, a new root node is allocated +8. Every modified node is saved: `Node::save` serializes back to raw bytes in the allocated chunk via `memory.write()` + +### 4f. V2 node serialization (`src/btreemap/node/v2.rs`) + +Initial page layout: + +``` +magic "BTN" ↕ 3 bytes +layout version (2) ↕ 1 byte +node type ↕ 1 byte +entry count (k) ↕ 2 bytes +overflow address ↕ 8 bytes +child addresses ↕ 8 bytes each, up to k + 1 +key blobs with 1/2/4-byte length prefix if not fixed-size +value blobs with 1/2/4-byte length prefix if not fixed-size +``` + +If the serialized content exceeds the page size, overflow pages are chained. Each overflow page has a `"NOF"` magic, a next-overflow pointer, and continuation data. + +### 4g. The Storable round-trip + +``` +save: value.into_bytes_checked() → write length prefix if needed → write bytes +load: read length prefix if needed → read bytes → V::from_bytes(bytes) +``` + +`insert` receives the value by value, so the library calls `into_bytes_checked()` rather than `to_bytes_checked()` — the value is consumed directly into a `Vec` with no intermediate clone for owned types. + +This is where `BOUND` matters in practice. A fixed-size key writes 0 bytes of overhead per entry. An unbounded value always writes a length prefix and can use as many bytes as needed. 
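The overhead difference can be sketched with a hypothetical 4-byte prefix (the actual format uses a 1-, 2-, or 4-byte prefix depending on the bound; this is an illustration, not the library's wire format):

```rust
// Sketch of why BOUND matters on disk: a fixed-size type is written raw,
// while an unbounded value carries a length prefix so the reader knows
// where it ends. The 4-byte little-endian prefix here is hypothetical.

fn write_unbounded(buf: &mut Vec<u8>, value: &[u8]) {
    buf.extend_from_slice(&(value.len() as u32).to_le_bytes());
    buf.extend_from_slice(value);
}

fn read_unbounded(buf: &[u8]) -> &[u8] {
    let len = u32::from_le_bytes(buf[..4].try_into().unwrap()) as usize;
    &buf[4..4 + len]
}

fn main() {
    let mut buf = Vec::new();
    write_unbounded(&mut buf, b"hello");
    assert_eq!(read_unbounded(&buf), b"hello");
    // 4 bytes of per-entry overhead versus 0 for a fixed-size type.
    assert_eq!(buf.len(), 4 + 5);
}
```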
+ +## Segment 5 — Local development loop (10 min) + +### Building and testing + +```sh +cargo build +cargo test # runs all unit and property tests +cargo test -p ic-stable-structures # library tests only +``` + +The `btreemap` module has property-based tests in `src/btreemap/proptests.rs` using `proptest`. They generate random sequences of inserts, removes, and gets and compare the stable `BTreeMap` against `std::collections::BTreeMap`. + +### Fuzzing (requires nightly) + +```sh +rustup toolchain install nightly +cargo install cargo-fuzz +cargo +nightly fuzz list +cargo +nightly fuzz run stable_btreemap_multiple_ops_persistent +``` + +The fuzz targets in `fuzz/fuzz_targets/` run random multi-operation sequences and check for crashes and invariant violations. The `_persistent` variants reuse a single in-memory structure across iterations, which finds bugs from state accumulation. + +### Benchmarks + +Benchmarks use `canbench-rs` (`benchmarks/btreemap/src/main.rs`). They measure **instruction counts**, not wall time, because instructions are the actual cost unit on the IC. + +```sh +cargo install canbench +cd benchmarks/btreemap && canbench +``` + +Key benchmark families: + +- `btreemap_insert_blob_*` — inserts with different key/value sizes (4 to 1024 bytes) +- `btreemap_get_blob_*` — random lookups at scale +- `btreemap_iter_*` — full iteration cost + +### Checklist for new contributions + +1. Write unit tests inside the module (see `src/btreemap/node/tests.rs` as a model) +2. Add or extend a proptest suite to catch invariant violations +3. Add a benchmark if the change affects hot paths +4. 
If the on-disk format changes, bump the version constant and add a load path for the old version — **never break backward compatibility** + +## Key files to bookmark + +| File | What's there | +|---|---| +| `src/lib.rs` | `Memory` trait, `safe_write`, `RestrictedMemory` | +| `src/storable.rs` | `Storable` trait, `Bound` enum, primitive impls | +| `src/memory_manager.rs` | `MemoryManager`, `VirtualMemory`, bucket layout | +| `src/btreemap.rs` | `BTreeMap` header, `init`, `insert`, `get`, `remove` | +| `src/btreemap/allocator.rs` | Chunk allocator | +| `src/btreemap/node/v2.rs` | Node V2 on-disk format | +| `examples/src/basic_example/src/lib.rs` | Minimal canister template | +| `docs/src/schema-upgrades.md` | How to evolve types safely | +| `benchmarks/btreemap/src/main.rs` | Benchmark structure to copy | From ab10c72dec2a36dd0bd010ea3f7ad42bc693224f Mon Sep 17 00:00:00 2001 From: Maksym Arutyunyan Date: Mon, 23 Mar 2026 15:01:56 +0100 Subject: [PATCH 2/6] . --- docs/src/introduction/deep-dive.md | 153 +++++++++-------------------- 1 file changed, 44 insertions(+), 109 deletions(-) diff --git a/docs/src/introduction/deep-dive.md b/docs/src/introduction/deep-dive.md index 0b17c3a4..f2911eb2 100644 --- a/docs/src/introduction/deep-dive.md +++ b/docs/src/introduction/deep-dive.md @@ -184,8 +184,10 @@ Always use `init` in a canister. Because stable memory survives upgrades, callin `BTreeMap` is currently the only structure that has shipped two layout versions, and its migration is a concrete example of the backward-compatibility principle above: -- **V1** supports only `Bound::Bounded` types. Page size is derived from `max_key_size` and `max_value_size` stored in the header. -- **V2** adds support for `Bound::Unbounded` types via explicit page sizes and overflow pages — nodes can chain multiple pages when a value exceeds the page size. 
+The term "node page" here refers to the fixed-size byte buffer the internal allocator assigns to each B-tree node — not a Wasm page (64 KiB) and not a MemoryManager bucket. + +- **V1** supports only `Bound::Bounded` types. The node page size is derived at load time from `max_key_size` and `max_value_size` stored in the header, so it is implicit rather than stored explicitly. +- **V2** adds support for `Bound::Unbounded` types by storing the node page size explicitly in the header and introducing overflow pages — when a node's data exceeds one page, it chains additional pages. Migration from V1 to V2 is **transparent and non-breaking**: calling `BTreeMap::init()` on an existing V1 map automatically upgrades it to V2 on first load — existing data is preserved, no user action required. Under the hood, `init()` calls `load_helper(memory, migrate_to_v2: true)`, which re-interprets the stored `max_key_size`/`max_value_size` as a `DerivedPageSize` for the V2 allocator. Loading a V2 map as V1 is rejected at startup. @@ -241,151 +243,84 @@ The redesign also removes the current 32,768-bucket cap (`MAX_NUM_BUCKETS`), rai The tradeoff: this is a **breaking change**. The on-disk header format is incompatible with the current layout, so existing canisters will need a one-time migration when the new MemoryManager ships. -## Segment 4 — Code walkthrough: BTreeMap::insert (25 min) - -`StableBTreeMap` is the most complex and most commonly used structure. Walking an insert from the public API down to raw byte writes shows every major mechanism. - -### 4a. 
File layout - -| File | Purpose | -|---|---| -| `src/btreemap.rs` | Public API, header format, `init`/`load`, `insert`/`get`/`remove` | -| `src/btreemap/allocator.rs` | Free-list chunk allocator for node pages | -| `src/btreemap/node.rs` | In-memory `Node` representation, load/save dispatch | -| `src/btreemap/node/v1.rs` | Node serialization for bounded types (legacy) | -| `src/btreemap/node/v2.rs` | Node serialization for bounded and unbounded types | -| `src/btreemap/iter.rs` | Range iteration | - -### 4b. Memory layout on disk (`src/btreemap.rs:1-50`) - -Address 0 in the memory given to `BTreeMap`: +## Segment 4 — StableBTreeMap (25 min) -``` -"BTR" magic ↕ 3 bytes -layout version ↕ 1 byte (1 = V1, 2 = V2) -max_key_size or page_size ↕ 4 bytes -max_value_size or marker ↕ 4 bytes -root node address ↕ 8 bytes -length / element count ↕ 8 bytes -reserved ↕ 24 bytes -Allocator header ↕ starts at offset 52 (ALLOCATOR_OFFSET) -Node pages ... -``` - -### 4c. The Allocator (`src/btreemap/allocator.rs`) - -The allocator is a free-list of same-size chunks. The chunk size is determined at `BTreeMap::new()` time: +`StableBTreeMap` is the most commonly used structure in this library, and its design is a direct response to the IC's per-round instruction limits. -- If both `K` and `V` are `Bounded`: `page_size = max_node_size * 3/4` (covers ~70% of real-world nodes at 8/11 capacity) -- Otherwise: `page_size = 1024` bytes (`DEFAULT_PAGE_SIZE`) +A hash map must rehash — copying all entries — when it grows, which is prohibitive at scale. A red-black tree avoids bulk copies but stores one key per node, so a lookup requires many scattered reads. A B-tree avoids both: it stores multiple keys per node in a single contiguous chunk, so each read fetches an entire node at once, and growth allocates exactly one new node at a time. -Each chunk on the free list contains a `ChunkHeader` (magic `"CHK"`, next-free pointer) followed by the usable data area. 
Allocation pops from the head; deallocation pushes back. +The remaining challenge is fragmentation: B-tree splits, merges, and deletes free nodes at arbitrary positions, leaving holes. The internal free-list allocator reclaims those holes immediately, so stable memory stays compact and every byte is either actively used or available for the next allocation. -### 4d. BTreeMap::init (`src/btreemap.rs:274-290`) +### 4a. How BTreeMap works -``` -if memory.size() == 0 → BTreeMap::new (writes a fresh header) -else → check for "BTR" magic and call BTreeMap::load -``` +A `BTreeMap` is a tree of fixed-size nodes, each holding up to 11 key-value entries sorted by key. Lookups and inserts walk from the root down to a leaf, binary-searching within each node. Splits and merges keep the tree balanced. -`BTreeMap::load` reads the header, detects the layout version (V1 or V2), and can transparently migrate V1 to V2 on first load. +Each node is stored as a contiguous byte chunk allocated by the internal free-list allocator. Only the nodes touched by an operation are read or written — the rest of the tree is never loaded. -### 4e. BTreeMap::insert — the critical path +### 4b. Performance-critical design decisions -Trace in `src/btreemap.rs` around line 500: +Because every read and write costs instructions, several optimizations keep the per-operation cost low: -1. Call `insert(key, value)` on `BTreeMap` -2. If `root_addr == NULL`: allocate a new leaf node via `allocator.allocate()`, save an empty `Node`, set `root_addr` -3. Otherwise: load the root `Node` from memory (`Node::load` reads the node header to detect V1/V2, then deserializes keys, values, and children addresses) -4. Walk down the tree: at each internal node, binary-search entries to find the child pointer, load that child -5. At the leaf: insert the `(key, value)` entry, keeping entries sorted by key -6. 
If the leaf now has `CAPACITY` (11) entries: split — allocate a new node, distribute entries, push the median key up to the parent -7. Splits propagate up; if the root splits, a new root node is allocated -8. Every modified node is saved: `Node::save` serializes back to raw bytes in the allocated chunk via `memory.write()` +**Lazy key and value loading** (`src/btreemap/node.rs`) — each entry holds a `LazyObject`: either an already-decoded value or an `(offset, size)` reference into the node's raw bytes, resolved on first access via `OnceCell`. Values are always deferred — they are never touched during a tree traversal. -### 4f. V2 node serialization (`src/btreemap/node/v2.rs`) +For keys, the strategy depends on size: small key bytes are decoded eagerly on node load (cheaper than storing a reference for tiny payloads), while larger keys are kept as byte references and decoded only when the binary search actually reaches them. -Initial page layout: +**Zero-copy writes** — `insert` receives the value by value and calls `into_bytes()` rather than `to_bytes()`. For types whose serialized form is their internal buffer (e.g. `Vec`, `String`), this moves the buffer directly into the write path with no allocation. -``` -magic "BTN" ↕ 3 bytes -layout version (2) ↕ 1 byte -node type ↕ 1 byte -entry count (k) ↕ 2 bytes -overflow address ↕ 8 bytes -child addresses ↕ 8 bytes each, up to k + 1 -key blobs with 1/2/4-byte length prefix if not fixed-size -value blobs with 1/2/4-byte length prefix if not fixed-size -``` +**Lazy range iteration** (`src/btreemap/iter.rs`) — the iterator advances one entry at a time. Values are only decoded when the caller actually dereferences the iterator, so ranging over keys without touching values incurs no deserialization cost. -If the serialized content exceeds the page size, overflow pages are chained. Each overflow page has a `"NOF"` magic, a next-overflow pointer, and continuation data. +### 4c. Key files -### 4g. 
The Storable round-trip - -``` -save: value.into_bytes_checked() → write length prefix if needed → write bytes -load: read length prefix if needed → read bytes → V::from_bytes(bytes) -``` - -`insert` receives the value by value, so the library calls `into_bytes_checked()` rather than `to_bytes_checked()` — the value is consumed directly into a `Vec` with no intermediate clone for owned types. - -This is where `BOUND` matters in practice. A fixed-size key writes 0 bytes of overhead per entry. An unbounded value always writes a length prefix and can use as many bytes as needed. +| File | Purpose | +|---|---| +| `src/btreemap.rs` | Public API, header, `init`/`insert`/`get`/`remove` | +| `src/btreemap/allocator.rs` | Free-list chunk allocator | +| `src/btreemap/node.rs` | In-memory `Node`, lazy entry loading | +| `src/btreemap/node/v2.rs` | Node serialization (current format) | +| `src/btreemap/iter.rs` | Lazy range iteration | ## Segment 5 — Local development loop (10 min) -### Building and testing +### Testing ```sh -cargo build -cargo test # runs all unit and property tests -cargo test -p ic-stable-structures # library tests only +cargo test ``` -The `btreemap` module has property-based tests in `src/btreemap/proptests.rs` using `proptest`. They generate random sequences of inserts, removes, and gets and compare the stable `BTreeMap` against `std::collections::BTreeMap`. +Tests fall into two categories: + +**Unit tests** live inside each module and check specific behaviors. See `src/btreemap/node/tests.rs` as a model. + +**Property-based tests** (`src/btreemap/proptests.rs`) use `proptest` to generate random sequences of inserts, removes, and gets, then verify results against `std::collections::BTreeMap`. This is the primary correctness check — if a stable structure diverges from the standard library equivalent under any sequence of operations, the test fails. Running `cargo test` covers both. 
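The oracle pattern behind these proptests can be sketched in plain Rust without the `proptest` dependency. `SortedVecMap` below is a deliberately naive stand-in for the structure under test (it is not a type from this library); the point is the shape of the check: drive both the candidate and `std::collections::BTreeMap` with the same pseudo-random operation sequence and require identical answers at every step.

```rust
use std::collections::BTreeMap;

/// Deliberately naive map under test — a stand-in for a stable structure,
/// NOT a type from this library. Entries are kept sorted by key.
struct SortedVecMap {
    entries: Vec<(u8, u8)>,
}

impl SortedVecMap {
    fn new() -> Self {
        Self { entries: Vec::new() }
    }

    fn insert(&mut self, k: u8, v: u8) -> Option<u8> {
        match self.entries.binary_search_by_key(&k, |e| e.0) {
            Ok(i) => Some(std::mem::replace(&mut self.entries[i].1, v)),
            Err(i) => {
                self.entries.insert(i, (k, v));
                None
            }
        }
    }

    fn get(&self, k: u8) -> Option<u8> {
        self.entries
            .binary_search_by_key(&k, |e| e.0)
            .ok()
            .map(|i| self.entries[i].1)
    }

    fn remove(&mut self, k: u8) -> Option<u8> {
        match self.entries.binary_search_by_key(&k, |e| e.0) {
            Ok(i) => Some(self.entries.remove(i).1),
            Err(_) => None,
        }
    }
}

/// Drive both maps with the same pseudo-random ops; any divergence fails.
fn run_oracle_check(num_ops: u32) -> bool {
    let mut rng: u64 = 0x9E37_79B9_7F4A_7C15; // fixed seed, LCG below
    let mut sut = SortedVecMap::new();
    let mut oracle: BTreeMap<u8, u8> = BTreeMap::new();
    for _ in 0..num_ops {
        rng = rng
            .wrapping_mul(6364136223846793005)
            .wrapping_add(1442695040888963407);
        let k = (rng >> 33) as u8;
        let v = (rng >> 41) as u8;
        match rng % 3 {
            0 => {
                if sut.insert(k, v) != oracle.insert(k, v) {
                    return false;
                }
            }
            1 => {
                if sut.get(k) != oracle.get(&k).copied() {
                    return false;
                }
            }
            _ => {
                if sut.remove(k) != oracle.remove(&k) {
                    return false;
                }
            }
        }
    }
    true
}

fn main() {
    assert!(run_oracle_check(10_000));
    println!("oracle agreement over 10000 random ops");
}
```

A real proptest run additionally shrinks a failing sequence to a minimal counterexample, which is why the `proptest` dependency earns its keep in the actual suite.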
### Fuzzing (requires nightly)

```sh
-rustup toolchain install nightly
-cargo install cargo-fuzz
-cargo +nightly fuzz list
cargo +nightly fuzz run stable_btreemap_multiple_ops_persistent
```

-The fuzz targets in `fuzz/fuzz_targets/` run random multi-operation sequences and check for crashes and invariant violations. The `_persistent` variants reuse a single in-memory structure across iterations, which finds bugs from state accumulation.
+Fuzz targets in `fuzz/fuzz_targets/` run random operation sequences and check for crashes and invariant violations. The `_persistent` variants reuse a single structure across iterations, which is effective at finding bugs that only appear after accumulated state changes.

-### Benchmarks
+### Benchmarks and CI regression checks

-Benchmarks use `canbench-rs` (`benchmarks/btreemap/src/main.rs`). They measure **instruction counts**, not wall time, because instructions are the actual cost unit on the IC.
+Benchmarks measure **instruction counts**, not wall time — instructions are the actual cost unit on the IC. Benchmarks exist for all performance-critical structures (`benchmarks/btreemap`, `btreeset`, `vec`, `memory-manager`, `nns`, `io_chunks`).

```sh
cargo install canbench
cd benchmarks/btreemap && canbench
```

-Key benchmark families:
+Every PR runs all benchmarks in CI and compares results against the `main` branch baseline. If any benchmark regresses or improves, **the CI job fails** until the results are explicitly acknowledged:
+
+```sh
+canbench --persist # update canbench_results.yml with new baseline
+```

-- `btreemap_insert_blob_*` — inserts with different key/value sizes (4 to 1024 bytes)
-- `btreemap_get_blob_*` — random lookups at scale
-- `btreemap_iter_*` — full iteration cost
+This means `canbench_results.yml` in each benchmark directory is a committed, reviewed record of expected performance.
Any change to a hot path must either stay within the existing baseline or ship with an updated `canbench_results.yml` that explains the change. ### Checklist for new contributions -1. Write unit tests inside the module (see `src/btreemap/node/tests.rs` as a model) -2. Add or extend a proptest suite to catch invariant violations -3. Add a benchmark if the change affects hot paths +1. Add unit tests inside the module +2. Add or extend a proptest suite if the change affects insert/get/remove behavior +3. Run benchmarks locally and update `canbench_results.yml` if instruction counts change 4. If the on-disk format changes, bump the version constant and add a load path for the old version — **never break backward compatibility** - -## Key files to bookmark - -| File | What's there | -|---|---| -| `src/lib.rs` | `Memory` trait, `safe_write`, `RestrictedMemory` | -| `src/storable.rs` | `Storable` trait, `Bound` enum, primitive impls | -| `src/memory_manager.rs` | `MemoryManager`, `VirtualMemory`, bucket layout | -| `src/btreemap.rs` | `BTreeMap` header, `init`, `insert`, `get`, `remove` | -| `src/btreemap/allocator.rs` | Chunk allocator | -| `src/btreemap/node/v2.rs` | Node V2 on-disk format | -| `examples/src/basic_example/src/lib.rs` | Minimal canister template | -| `docs/src/schema-upgrades.md` | How to evolve types safely | -| `benchmarks/btreemap/src/main.rs` | Benchmark structure to copy | From 2753a186ccbd4d3bae9ffce677b45305135e579a Mon Sep 17 00:00:00 2001 From: Maksym Arutyunyan Date: Mon, 23 Mar 2026 15:21:41 +0100 Subject: [PATCH 3/6] . 
--- docs/src/introduction/deep-dive.md | 62 ++++++++++++++---------------- 1 file changed, 28 insertions(+), 34 deletions(-) diff --git a/docs/src/introduction/deep-dive.md b/docs/src/introduction/deep-dive.md index f2911eb2..7bfaaead 100644 --- a/docs/src/introduction/deep-dive.md +++ b/docs/src/introduction/deep-dive.md @@ -1,6 +1,8 @@ # Stable Structures Deep Dive -## Segment 1 — Why this library exists (10 min) +This document is for contributors who want to work on the `stable-structures` library itself. It covers the design reasoning, internal architecture, and implementation patterns that are not visible from the public API — the kind of context you need before making a meaningful change. It is not a usage guide; for that, see the [README](../../../README.md). + +## Background and Motivation The Internet Computer (IC) runs canister smart contracts. When a canister is upgraded, its heap is wiped. The conventional fix is to serialize all state to stable memory in a `pre_upgrade` hook and deserialize it in `post_upgrade`. This works for small state but does not scale: the serialization itself costs cycles, and a bug in either hook can make the canister permanently non-upgradable. @@ -26,11 +28,11 @@ The six structures the library ships: | `Log` | Append-only variable-size entries | Bounded + Unbounded | 2 (index + data) | | `MinHeap` | Priority queue | Bounded only | 1 | -## Segment 2 — Core abstractions (15 min) +## Core Abstractions -### 2a. The Memory trait (`src/lib.rs:52-93`) +### The Memory Trait -Everything in the library is generic over a single four-method trait: +Everything in the library is generic over a single four-method trait (`src/lib.rs:52-93`): ```rust /// Abstraction over a WebAssembly-style linear memory. 
@@ -56,11 +58,11 @@ Concrete implementations: - `FileMemory` — file-backed memory using standard file I/O, useful for offline development and persistence - `DefaultMemoryImpl` — resolves to `Ic0StableMemory` in `wasm32`, `VectorMemory` otherwise -`RestrictedMemory` (`src/lib.rs:243-308`) is a public `Memory` adapter that exposes a fixed page range of a larger memory as its own address space starting at 0. It is the simpler alternative to `MemoryManager` for cases where each structure's maximum size is known upfront — covered in more detail in section 2c. +`RestrictedMemory` (`src/lib.rs:243-308`) is a public `Memory` adapter that exposes a fixed page range of a larger memory as its own address space starting at 0. It is the simpler alternative to `MemoryManager` for cases where each structure's maximum size is known upfront — covered in the MemoryManager section below. -### 2b. The Storable trait (`src/storable.rs:13-72`) +### The Storable Trait -Stable structures are generic and work only with raw bytes — they have no knowledge of the types stored in them. `Storable` is the bridge: it tells a structure how to convert a value to and from bytes. Any type you want to store must implement it. The library already provides implementations for the most common types; custom types require a manual implementation: +Stable structures are generic and work only with raw bytes — they have no knowledge of the types stored in them. `Storable` (`src/storable.rs:13-72`) is the bridge: it tells a structure how to convert a value to and from bytes. Any type you want to store must implement it. The library already provides implementations for the most common types; custom types require a manual implementation: ```rust pub trait Storable { @@ -83,11 +85,11 @@ The `BOUND` constant is the key design decision a user must make: - **`Bound::Unbounded`** — no size constraints; the structure stores a length prefix before each value. Safest default for types with `String`s or `Vec`s. 
- **`Bound::Bounded { max_size: u32, is_fixed_size: bool }`** — `max_size` is enforced at runtime via `to_bytes_checked()`. Setting `is_fixed_size: true` eliminates the length prefix, saving bytes per entry. **You cannot increase `max_size` after deployment without corrupting data.** -The library ships `Storable` implementations for all primitives (`u8` through `u128`, `f32`/`f64`, `bool`, `[u8; N]`), `String`, `Vec`, `Principal`, `Option`, and tuples. +The library ships `Storable` implementations for all primitives (`u8` through `u128`, `f32`/`f64`, `bool`), `[u8; N]`, `Blob` (a fixed-size byte array wrapper type), `String`, `Vec`, `Principal`, `Option`, and tuples. Note: `Storable` says nothing about the serialization format. Users commonly use CBOR (`ciborium`), protobuf, or Candid inside `to_bytes`/`from_bytes`. See `docs/src/schema-upgrades.md` for patterns for adding fields safely. -### 2c. The MemoryManager (`src/memory_manager.rs`) +### The MemoryManager Each stable structure requires exclusive ownership of its memory — sharing causes corruption. The naive alternative, carving stable memory into static regions via `RestrictedMemory`, has two problems: you must know the size limit upfront, and the full region is paid for even when mostly empty. @@ -146,7 +148,7 @@ thread_local! { See `examples/src/basic_example/src/lib.rs` for the minimal working canister. -### 2d. Internal allocators (contributor-level) +### Internal Allocators Memory allocation is invisible to users of stable-structures — structures interact only through the `Memory` trait and have no way to express "free this region." It is only relevant when working on the internals of a specific structure, where the choice of allocation strategy directly shapes the implementation. @@ -169,9 +171,9 @@ The key question that drives each structure's strategy: can holes appear in its The allocator divides remaining memory into equal-size chunks. 
Free chunks form a singly-linked list: allocation pops the head, deallocation pushes back onto it — both O(1). When the free list is empty, the memory grows and a new chunk is appended. -## Segment 3 — Lifecycle, schema upgrades, and migrations (10 min) +## Lifecycle, Schema Upgrades, and Migrations -### 3a. Lifecycle across upgrades +### Lifecycle Across Upgrades Every stable structure has two constructors: @@ -180,18 +182,16 @@ Every stable structure has two constructors: Always use `init` in a canister. Because stable memory survives upgrades, calling `init` on next deployment finds the existing data and resumes from it — no `pre_upgrade`/`post_upgrade` needed. The magic header also carries a layout version, so new library versions can always read data written by older ones. -### 3b. Layout versioning in practice: BTreeMap V1 → V2 - -`BTreeMap` is currently the only structure that has shipped two layout versions, and its migration is a concrete example of the backward-compatibility principle above: +### Layout Versioning: BTreeMap V1 → V2 -The term "node page" here refers to the fixed-size byte buffer the internal allocator assigns to each B-tree node — not a Wasm page (64 KiB) and not a MemoryManager bucket. +`BTreeMap` is currently the only structure that has shipped two layout versions. Each version uses "node pages" — the fixed-size byte buffers the internal allocator assigns to each B-tree node (distinct from Wasm pages and MemoryManager buckets): - **V1** supports only `Bound::Bounded` types. The node page size is derived at load time from `max_key_size` and `max_value_size` stored in the header, so it is implicit rather than stored explicitly. - **V2** adds support for `Bound::Unbounded` types by storing the node page size explicitly in the header and introducing overflow pages — when a node's data exceeds one page, it chains additional pages. 
-Migration from V1 to V2 is **transparent and non-breaking**: calling `BTreeMap::init()` on an existing V1 map automatically upgrades it to V2 on first load — existing data is preserved, no user action required. Under the hood, `init()` calls `load_helper(memory, migrate_to_v2: true)`, which re-interprets the stored `max_key_size`/`max_value_size` as a `DerivedPageSize` for the V2 allocator. Loading a V2 map as V1 is rejected at startup. +Migration from V1 to V2 is **transparent and non-breaking**: calling `BTreeMap::init()` on an existing V1 map automatically upgrades it to V2 on first load — existing data is preserved, no user action required. Any unrecognized layout version causes a panic at startup. -### 3c. Schema upgrades +### Schema Upgrades Stable structures don't enforce a serialization format. The recommended pattern for evolving types is to use a flexible format (e.g. CBOR via `ciborium`) and `Bound::Unbounded`: @@ -208,7 +208,7 @@ Adding a field is then safe with `#[serde(default)]` — old records decode with **Warning:** if you used `Bound::Bounded`, never increase `max_size` after deployment — existing node pages were sized to the old value and enlarging it corrupts them. Migrating from `Bounded` to `Unbounded` is safe; the reverse is not. -### 3d. Data migrations +### Data Migrations When an in-place field addition isn't enough — e.g. changing the key type or restructuring the value layout entirely — data must be migrated from one structure to another. @@ -225,7 +225,7 @@ This is the pattern NNS-dapp used when migrating accounts to stable memory: two The cost to be aware of: both structures occupy stable memory simultaneously (~2× peak usage), and after the old structure is cleared its buckets remain permanently assigned to its `MemoryId` — they cannot be reclaimed or reused. Budget for this when planning large schema changes. -### 3e. 
MemoryManager limitations and the path to bucket reclamation
+### MemoryManager Limitations and Bucket Reclamation

The inability to reuse freed buckets is a known limitation rooted in a structural invariant of the current MemoryManager: **buckets within each virtual memory must be stored in ascending order by ID**. Because the header encodes only the owning `MemoryId` per bucket (1 byte), there is no room to store an explicit ordering — the order is implied by bucket position. To load a virtual memory correctly, the runtime simply scans for all buckets belonging to that `MemoryId` and traverses them in ID order.

@@ -243,7 +243,7 @@ The redesign also removes the current 32,768-bucket cap (`MAX_NUM_BUCKETS`), rai

The tradeoff: this is a **breaking change**. The on-disk header format is incompatible with the current layout, so existing canisters will need a one-time migration when the new MemoryManager ships.

-## Segment 4 — StableBTreeMap (25 min)
+## StableBTreeMap Internals

`StableBTreeMap` is the most commonly used structure in this library, and its design is a direct response to the IC's per-round instruction limits.

@@ -251,35 +251,36 @@ A hash map must rehash — copying all entries — when it grows, which is prohi
The remaining challenge is fragmentation: B-tree splits, merges, and deletes free nodes at arbitrary positions, leaving holes. The internal free-list allocator reclaims those holes immediately, so stable memory stays compact and every byte is either actively used or available for the next allocation.

-### 4a. How BTreeMap works
+### How BTreeMap Works

A `BTreeMap` is a tree of fixed-size nodes, each holding up to 11 key-value entries sorted by key. Lookups and inserts walk from the root down to a leaf, binary-searching within each node. Splits and merges keep the tree balanced.

Each node is stored as a contiguous byte chunk allocated by the internal free-list allocator.
Only the nodes touched by an operation are read or written — the rest of the tree is never loaded. -### 4b. Performance-critical design decisions +### Performance-Critical Design Decisions Because every read and write costs instructions, several optimizations keep the per-operation cost low: **Lazy key and value loading** (`src/btreemap/node.rs`) — each entry holds a `LazyObject`: either an already-decoded value or an `(offset, size)` reference into the node's raw bytes, resolved on first access via `OnceCell`. Values are always deferred — they are never touched during a tree traversal. -For keys, the strategy depends on size: small key bytes are decoded eagerly on node load (cheaper than storing a reference for tiny payloads), while larger keys are kept as byte references and decoded only when the binary search actually reaches them. +For keys, the strategy depends on size: keys ≤ 16 bytes are decoded eagerly on node load (cheaper than storing a reference for tiny payloads), while larger keys are kept as byte references and decoded only when the binary search actually reaches them. -**Zero-copy writes** — `insert` receives the value by value and calls `into_bytes()` rather than `to_bytes()`. For types whose serialized form is their internal buffer (e.g. `Vec`, `String`), this moves the buffer directly into the write path with no allocation. +**Zero-copy writes** — `insert` calls `into_bytes()` rather than `to_bytes()`, moving the value's buffer directly into the write path for types like `Vec` and `String` with no extra allocation. (See the Storable Trait section for why the trait has both methods.) **Lazy range iteration** (`src/btreemap/iter.rs`) — the iterator advances one entry at a time. Values are only decoded when the caller actually dereferences the iterator, so ranging over keys without touching values incurs no deserialization cost. -### 4c. 
Key files +### Key Files | File | Purpose | |---|---| | `src/btreemap.rs` | Public API, header, `init`/`insert`/`get`/`remove` | | `src/btreemap/allocator.rs` | Free-list chunk allocator | | `src/btreemap/node.rs` | In-memory `Node`, lazy entry loading | +| `src/btreemap/node/v1.rs` | Node serialization (old format) | | `src/btreemap/node/v2.rs` | Node serialization (current format) | | `src/btreemap/iter.rs` | Lazy range iteration | -## Segment 5 — Local development loop (10 min) +## Contributor Development Loop ### Testing @@ -317,10 +318,3 @@ canbench --persist # update canbench_results.yml with new baseline ``` This means `canbench_results.yml` in each benchmark directory is a committed, reviewed record of expected performance. Any change to a hot path must either stay within the existing baseline or ship with an updated `canbench_results.yml` that explains the change. - -### Checklist for new contributions - -1. Add unit tests inside the module -2. Add or extend a proptest suite if the change affects insert/get/remove behavior -3. Run benchmarks locally and update `canbench_results.yml` if instruction counts change -4. If the on-disk format changes, bump the version constant and add a load path for the old version — **never break backward compatibility** From 87dc5ed38568c2781c39fc53ccd1eeb4b2d5cbeb Mon Sep 17 00:00:00 2001 From: Maksym Arutyunyan Date: Mon, 23 Mar 2026 15:33:16 +0100 Subject: [PATCH 4/6] . --- docs/src/introduction/deep-dive.md | 11 ++++++++--- 1 file changed, 8 insertions(+), 3 deletions(-) diff --git a/docs/src/introduction/deep-dive.md b/docs/src/introduction/deep-dive.md index 7bfaaead..591aec2b 100644 --- a/docs/src/introduction/deep-dive.md +++ b/docs/src/introduction/deep-dive.md @@ -32,7 +32,12 @@ The six structures the library ships: ### The Memory Trait -Everything in the library is generic over a single four-method trait (`src/lib.rs:52-93`): +The `Memory` trait is the decoupling layer at the heart of the library. 
+Every stable structure is generic over it, so any structure works unchanged with IC stable memory on-chain, `VectorMemory` in tests, or `FileMemory` locally — with no code changes to the structure itself (`src/lib.rs:52-93`). + +The four methods deliberately mirror the WebAssembly linear memory API. +One thing is notably absent: there is no `free` or `shrink`. +WebAssembly memory can only grow, and this constraint propagates through the entire library — every design decision around memory reuse traces back to it. ```rust /// Abstraction over a WebAssembly-style linear memory. @@ -235,7 +240,7 @@ This makes safe reuse impossible in the typical migration layout. When structure A partial fix, **conservative bucket reuse**, was implemented and then [reverted](https://github.com/dfinity/stable-structures/pull/396). It allowed reuse only of freed buckets with IDs *higher* than the growing virtual memory's current maximum — a constraint that is almost never satisfied in practice, since A allocates first and therefore always has lower IDs than B. -#### Alternative design: explicit linked list of buckets +#### Alternative design: explicit linked list of buckets (not implemented) The proper solution requires a new header layout. The alternative design replaces implicit ID ordering with an **explicit linked list**: each bucket stores a 4-byte pointer to the next bucket in its virtual memory's chain. Freeing a virtual memory then simply nulls out its head pointer, making all its buckets immediately available for any new allocation regardless of their IDs. 
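A toy executable model of the explicit-chain idea — all names, widths, and the free-list splice below are illustrative assumptions, not the proposed header format. The document's "null out the head pointer" description corresponds to an O(1) variant where the freed chain is absorbed lazily; the loop here makes the pointer rewiring explicit.

```rust
/// Sentinel for "no bucket" — an assumption for this sketch.
const NULL: u32 = u32::MAX;

/// Toy model: each virtual memory owns a head pointer, each bucket a
/// `next` pointer, so chain order is explicit rather than implied by ID.
struct Buckets {
    next: Vec<u32>,  // next[b] = next bucket in the same chain, or NULL
    free_head: u32,  // head of the free list
    heads: Vec<u32>, // heads[memory_id] = first bucket of that chain
}

impl Buckets {
    fn allocate(&mut self, memory_id: usize) -> Option<u32> {
        if self.free_head == NULL {
            return None;
        }
        let b = self.free_head;
        self.free_head = self.next[b as usize];
        // Prepend to the virtual memory's chain — no ID ordering required.
        self.next[b as usize] = self.heads[memory_id];
        self.heads[memory_id] = b;
        Some(b)
    }

    /// Free a whole virtual memory: splice its chain onto the free list.
    fn free_memory(&mut self, memory_id: usize) {
        let mut b = self.heads[memory_id];
        self.heads[memory_id] = NULL;
        while b != NULL {
            let nxt = self.next[b as usize];
            self.next[b as usize] = self.free_head;
            self.free_head = b;
            b = nxt;
        }
    }
}

fn main() {
    // Four buckets, all free: 0 → 1 → 2 → 3 → NULL.
    let mut b = Buckets {
        next: vec![1, 2, 3, NULL],
        free_head: 0,
        heads: vec![NULL, NULL],
    };
    let a0 = b.allocate(0).unwrap(); // virtual memory 0 grows twice
    let a1 = b.allocate(0).unwrap();
    b.free_memory(0);                // entire chain returns to the free pool
    let b0 = b.allocate(1).unwrap(); // memory 1 reuses a bucket with a lower ID
    assert!(b0 == a0 || b0 == a1);
    println!("bucket {b0} reused by memory 1");
}
```

The contrast with the current layout is the point of the sketch: here, reuse of a freed low-ID bucket by a later-created memory is trivially safe, because nothing about loading depends on ascending bucket IDs.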
@@ -261,7 +266,7 @@ Each node is stored as a contiguous byte chunk allocated by the internal free-li Because every read and write costs instructions, several optimizations keep the per-operation cost low: -**Lazy key and value loading** (`src/btreemap/node.rs`) — each entry holds a `LazyObject`: either an already-decoded value or an `(offset, size)` reference into the node's raw bytes, resolved on first access via `OnceCell`. Values are always deferred — they are never touched during a tree traversal. +**Lazy key and value loading** (`src/btreemap/node.rs`) — each entry holds a `LazyObject`: either an already-decoded value or an `(offset, size)` reference into the node's raw bytes, resolved on first access via `OnceCell`. Values are always deferred — they are never touched during a tree traversal. For keys, the strategy depends on size: keys ≤ 16 bytes are decoded eagerly on node load (cheaper than storing a reference for tiny payloads), while larger keys are kept as byte references and decoded only when the binary search actually reaches them. From 1912d0a30731fe95802ff2f280c5fecd94a7f95e Mon Sep 17 00:00:00 2001 From: Maksym Arutyunyan Date: Mon, 23 Mar 2026 16:40:31 +0100 Subject: [PATCH 5/6] . --- docs/src/introduction/deep-dive.md | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/docs/src/introduction/deep-dive.md b/docs/src/introduction/deep-dive.md index 591aec2b..d9c78651 100644 --- a/docs/src/introduction/deep-dive.md +++ b/docs/src/introduction/deep-dive.md @@ -101,7 +101,8 @@ Each stable structure requires exclusive ownership of its memory — sharing cau `MemoryManager` eliminates both problems. It presents each structure with a `VirtualMemory` that has no upfront size limit and grows on demand. Underneath, it divides the real stable memory into 128-page buckets allocated as needed and interleaved freely across virtual memories — so total stable memory usage stays proportional to actual data, not declared limits. 
``` -1) NAIVE (RestrictedMemory) — limits declared upfront, full region allocated immediately +1) NAIVE (RestrictedMemory) + limits declared upfront, full region allocated immediately Stable memory ┌──────────────────────────────┬──────────────────────────────┐ From 6928af3285fb509222d8e4cc68693e3e40dcfdd5 Mon Sep 17 00:00:00 2001 From: Maksym Arutyunyan Date: Tue, 24 Mar 2026 10:57:38 +0100 Subject: [PATCH 6/6] . --- docs/src/introduction/deep-dive.md | 21 +++++++++++---------- 1 file changed, 11 insertions(+), 10 deletions(-) diff --git a/docs/src/introduction/deep-dive.md b/docs/src/introduction/deep-dive.md index d9c78651..6508549b 100644 --- a/docs/src/introduction/deep-dive.md +++ b/docs/src/introduction/deep-dive.md @@ -17,16 +17,17 @@ The Internet Computer (IC) runs canister smart contracts. When a canister is upg - **No reallocation** — moving data in bulk is too expensive in cycles; all growth happens in place - **Multi-memory compatibility** — the design works with multiple stable memories, ensuring forward compatibility with upcoming IC features -The six structures the library ships: - -| Structure | Description | Type support | Memories needed | -|------------|-----------------------------------|---------------------|------------------| -| `Cell` | Single serializable value | Bounded + Unbounded | 1 | -| `BTreeMap` | Ordered key-value store | Bounded + Unbounded | 1 | -| `BTreeSet` | Ordered set of unique keys | Bounded + Unbounded | 1 | -| `Vec` | Growable array | Bounded only | 1 | -| `Log` | Append-only variable-size entries | Bounded + Unbounded | 2 (index + data) | -| `MinHeap` | Priority queue | Bounded only | 1 | +The structures library ships: + +| Structure | Description | Container Type | Memories needed | +|-----------------|-----------------------------------|---------------------|------------------| +| `Cell` | Single serializable value | Bounded + Unbounded | 1 | +| `BTreeMap` | Ordered key-value store | Bounded + Unbounded | 1 | +| 
`BTreeSet` | Ordered set of unique keys | Bounded + Unbounded | 1 | +| `Vec` | Growable array | Bounded only | 1 | +| `Log` | Append-only variable-size entries | Bounded + Unbounded | 2 (index + data) | +| `MinHeap` | Priority queue | Bounded only | 1 | +| `MemoryManager` | Manages on-demand virtual memory | n/a | 1 | ## Core Abstractions