Skip to content

perf: use autoresearch to reduce allocations#3225

Draft
tac0turtle wants to merge 10 commits intomainfrom
autoresearch/block-perf-2026-04-04
Draft

perf: use autoresearch to reduce allocations#3225
tac0turtle wants to merge 10 commits intomainfrom
autoresearch/block-perf-2026-04-04

Conversation

@tac0turtle
Copy link
Copy Markdown
Contributor

@tac0turtle tac0turtle commented Apr 4, 2026

Overview

This pr used autoresearch to run a loop on reducing allocations

3 a8b6352 78allocs 22,996 33,091 keep Unsafe reinterpret cast of Txs to [][]byte in ApplyBlock — eliminates make([][]byte, n) allocation
4 823aa62 77allocs 20,276 32,480 keep Direct pb.Data serialization in DACommitment — avoids pruned Data wrapper and txsToByteSlices allocations
5 0720b44 74allocs 12,192 31,624 keep unsafe.Slice in Data.ToProto() — eliminates txsToByteSlices [][]byte allocation
6 — 74allocs 12,187 31,217 discard Reverted hand-written HashSlim wire encoder — produced different hashes than MarshalBinary
7 ccbc2e4 64allocs 11,130 31,570 keep sync.Pool for protobuf message structs in MarshalBinary — eliminates 10 allocs per block
8 805672e 56allocs 10,217 31,037 keep Pooled SignedHeader.MarshalBinary — reuse pb.SignedHeader/pb.Header/pb.Signer/pb.Version structs

Result: {"status":"keep","allocs_per_op":81,"bytes_per_op":25934,"ns_per_op":34001}
…allocations per block

Result: {"status":"keep","allocs_per_op":79,"bytes_per_op":25697,"ns_per_op":34147}
… make([][]byte, n) allocation

Result: {"status":"keep","allocs_per_op":78,"bytes_per_op":22996,"ns_per_op":33091}
…pper and txsToByteSlices allocations

Result: {"status":"keep","allocs_per_op":77,"bytes_per_op":20276,"ns_per_op":32480}
…allocation

Result: {"status":"keep","allocs_per_op":74,"bytes_per_op":12192,"ns_per_op":31624}
…10 allocs per block

Replace per-call allocation of pb.Header/pb.Version/pb.Data/pb.Metadata with
sync.Pool reuse in the hot MarshalBinary path. ToProto() API is unchanged —
only MarshalBinary is affected since it consumes the result immediately.

Metrics (100_txs benchmark):
- 74 → 64 allocs/op (-13.5%)
- ~12.1 → ~11.1 KB (-8.3%)
- ~31ns flat
…Signer/pb.Version

Eliminates 4 struct allocations per signed header marshal: pb.SignedHeader,
pb.Header, pb.Version, pb.Signer. These are now borrowed from sync.Pool and
returned after proto.Marshal completes.

Metrics (100_txs benchmark):
- 64 → 56 allocs/op
- ~11KB → ~10.2KB
…er block

State.ToProto allocated pb.State + pb.Version + timestamppb.Timestamp per block.
MarshalBinary now pools those structs and returns the marshaled bytes directly.
pkg/store/batch.UpdateState switched from ToProto+proto.Marshal to MarshalBinary.
@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai bot commented Apr 4, 2026

Important

Review skipped

Draft detected.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: d5807685-ac5f-4aba-859b-620dacdf7424

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.

Use the checkbox below for a quick retry:

  • 🔍 Trigger review
✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch autoresearch/block-perf-2026-04-04

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@github-actions
Copy link
Copy Markdown
Contributor

github-actions bot commented Apr 4, 2026

The latest Buf updates on your PR. Results from workflow CI / buf-check (pull_request).

BuildFormatLintBreakingUpdated (UTC)
✅ passed⏩ skipped✅ passed✅ passedApr 4, 2026, 10:11 AM

@claude
Copy link
Copy Markdown
Contributor

claude bot commented Apr 4, 2026

Claude finished @tac0turtle's task in 3m 42s —— View job


PR Review: perf: use autoresearch to reduce allocations

  • Read changed files
  • Review code quality & architecture
  • Review performance & correctness
  • Review testing
  • Post final feedback

Summary

This PR introduces sync.Pool for protobuf message structs and unsafe.Slice reinterpretation of []Tx → [][]byte across several hot serialization paths. The approach is sound, well-commented, and delivers real allocation reductions per block. A few issues worth addressing before merge:


Issues

1. &pb.Signer{} allocation in SignedHeader.MarshalBinary nil-pubkey path

types/serialization.go:247 — when sh.Signer.PubKey == nil, the code allocates a fresh &pb.Signer{} instead of using pbSignerPool:

if sh.Signer.PubKey == nil {
    psh.Signer = &pb.Signer{}   // ← new allocation, inconsistent with pool strategy
    ...
}

This is the common case for nodes that haven't set a signer. Should use pbSignerPool.Get() / pbSignerPool.Put() here, just like the non-nil branch.

Fix this →

2. timestamppb.Timestamp not pooled in State.MarshalBinary

types/serialization.go:540-543:

pts := &timestamppb.Timestamp{
    Seconds: s.LastBlockTime.Unix(),
    Nanos:   int32(s.LastBlockTime.Nanosecond()),
}

This allocates a new Timestamp on every call despite the function comment saying "pooled protobuf messages". Either pool it or remove the claim from the docstring. Timestamp is a small allocation but leaving it as-is makes the comment misleading.

3. defer in Data.MarshalBinary — inconsistent cleanup style

types/serialization.go:135-138: All other functions in this file use explicit cleanup before return. The defer here is the only exception and adds a defer frame (minor cost on a hot path). Consider making it explicit like the rest:

// Instead of defer, call explicitly before return:
pm.Reset()
pbMetadataPool.Put(pm)

4. Missing test coverage for State.MarshalBinary

TestStateRoundTrip in serialization_test.go calls state.ToProto() + proto.Marshal() directly — it does not exercise the new State.MarshalBinary() path. The new method should have at least a round-trip test that calls s.MarshalBinary() then proto.Unmarshal() + s.FromProto() (or a dedicated UnmarshalBinary if it exists) with multiple state values, including zero state and a fully-populated state.


Observations / Praise

unsafe.Slice reinterpretation (types/serialization.go:142, hashing.go:135, executor.go:799): The Tx = []byte type alias means []Tx and [][]byte are layout-identical. The unsafe.Slice((*[]byte)(unsafe.SliceData(...)), n) pattern is idiomatic for this in Go and avoids a make + copy. All call sites correctly guard against nil and empty slices (or handle them via len == 0). Well done.

sha256Pool in hashing.go: leafHashOpt calls s.Reset() as its first operation, so pooled hashers returned in any state are safe to reuse. Correct.

Pool cleanup on all early-return paths: SignedHeader.MarshalBinary has two early returns (nil pubkey, marshalling error). Both correctly return all three acquired pool objects before returning. Good defensive code.

Pool objects are Reset before Put: Every Put is preceded by the corresponding Reset(), preventing stale byte-slice references in pooled objects from keeping GC roots alive. Correct.

Data.ToProto() inlining of Metadata.ToProto(): The promoted field d.LastDataHash correctly accesses d.Metadata.LastDataHash through the embedded *Metadata pointer. No semantic change.


Minor Nits

  • types/hash_memo_bench_test.go:13,28,40 — uses b.Loop() which is the Go 1.24+ benchmarking API. Ensure the project's minimum Go version supports this; otherwise use the classic for i := 0; i < b.N; i++ form.
  • BenchmarkHeaderHash_Single measures a fully cold (never-memoized) hash. The benchmark name could clarify it is "no memo, single call" vs BenchmarkHeaderHash_NoMemo which is "no memo, 3 calls". As-is the distinction is slightly confusing.

@github-actions
Copy link
Copy Markdown
Contributor

github-actions bot commented Apr 4, 2026

PR Preview Action v1.8.1

QR code for preview link

🚀 View preview at
https://evstack.github.io/docs-preview/pr-3225/

Built to branch main at 2026-04-04 10:13 UTC.
Preview will be ready when the GitHub Pages deployment is complete.

@tac0turtle tac0turtle force-pushed the autoresearch/block-perf-2026-04-04 branch from 9148ccc to 2438dc0 Compare April 4, 2026 10:10
@codecov
Copy link
Copy Markdown

codecov bot commented Apr 4, 2026

Codecov Report

❌ Patch coverage is 78.75000% with 34 lines in your changes missing coverage. Please review.
✅ Project coverage is 61.62%. Comparing base (022b565) to head (2438dc0).

Files with missing lines Patch % Lines
types/serialization.go 77.39% 32 Missing and 1 partial ⚠️
pkg/store/batch.go 50.00% 1 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #3225      +/-   ##
==========================================
+ Coverage   61.43%   61.62%   +0.18%     
==========================================
  Files         120      120              
  Lines       12504    12645     +141     
==========================================
+ Hits         7682     7792     +110     
- Misses       3960     3991      +31     
  Partials      862      862              
Flag Coverage Δ
combined 61.62% <78.75%> (+0.18%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant