Skip to content

fix(v2): get benchmarks crates to compile#2618

Merged
shuklaayush merged 6 commits intodevelop-v2.0.0-betafrom
fix/v2-bench-crate
Mar 25, 2026
Merged

fix(v2): get benchmarks crates to compile#2618
shuklaayush merged 6 commits intodevelop-v2.0.0-betafrom
fix/v2-bench-crate

Conversation

@shuklaayush
Copy link
Copy Markdown
Collaborator

@shuklaayush shuklaayush commented Mar 24, 2026

Summary

Removes all openvm-native-circuit, openvm-native-recursion, and openvm-native-compiler dependencies from the three benchmark crates, which no longer exist in the workspace.

Changes by crate

benchmarks/utils

  • Deleted src/generate-fixtures.rs — depended on openvm-native-circuit, openvm-native-recursion, openvm-sdk
  • Cargo.toml: Removed all optional native-circuit deps (openvm-circuit, openvm-continuations, openvm-native-circuit, openvm-native-recursion, openvm-sdk, openvm-stark-sdk, bitcode, serde). Removed generate-fixtures feature and its [[bin]] entry. Set default = [].
  • src/lib.rs unchanged — ELF building utilities remain intact.

benchmarks/execute

  • Deleted src/execute-verifier.rs — the entire binary used NativeCpuBuilder and NATIVE_MAX_TRACE_HEIGHTS from native-circuit.
  • Deleted src/lib.rs — was a placeholder; the crate is bench-only.
  • Cargo.toml:
    • Removed openvm-continuations and openvm-sdk deps (only used by verifier code).
    • Removed [[bin]] for execute-verifier and the evm-prove feature.
    • Expanded tco and aot features to propagate to all extension circuit crates directly (previously went through openvm-sdk).
    • Added openvm-circuit with test-utils to dev-dependencies (for SystemParams::new_for_testing).
  • benches/execute.rs:
    • Removed all 6 verifier benchmarks (benchmark_leaf_verifier_*, benchmark_internal_verifier_*) and their setup functions (setup_leaf_verifier, setup_internal_verifier).
    • Removed #[cfg(feature = "aot")] branches and create_aot_instance/create_metered_aot_instance functions — kept only the interpreter-based execution paths.
    • Updated engine/params API: BabyBearPoseidon2EngineBabyBearPoseidon2CpuEngine, FriParameters::standard_fast()SystemParams::new_for_testing(21).
    • Removed unused imports: fs, Arc, ContinuationVmProof, Proof, NativeCpuBuilder, etc.
    • App execution benchmarks (benchmark_execute, benchmark_execute_metered, benchmark_execute_metered_cost) remain unchanged.

benchmarks/prove

Major restructuring to match the v2-proof-system reference pattern.

  • src/lib.rs rewritten (replaces src/util.rs):

    • BenchmarkCli with --max-segment-length and --app-only flags.
    • BenchmarkCli::run() dispatches to run_default_benchmark (full aggregation) or run_default_app_benchmark (app-only).
    • BenchmarkCli::apply_config() helper for shared segmentation logic.
    • run_benchmark() — full aggregation with Sdk::prove(), proof size metrics (total + zstd-compressed), and verification via verify_vm_stark_proof_decoded.
    • run_app_benchmark() — app-only proof with verify_app_proof.
  • Deleted src/util.rs — consolidated into lib.rs.

  • Bins rewritten to use pre-compiled ELFs via include_bytes! instead of runtime build_bench_program():

    • fibonacci.rs — loads ELF from guest/fibonacci/elf/, n=800k.
    • pairing.rs — BN254 pairing check.
    • ecrecover.rs — ECDSA secp256k1 recovery (5 signatures).
    • regex.rs — email regex matching with regex_email.txt fixture.
    • base64_json.rs, bincode.rs, rkyv.rs, revm_transfer.rs — serialization/EVM benchmarks.
  • New bins:

    • keccak.rs — keccak256 iteration benchmark (4096 iterations), builder-constructed config.
    • keccak_par.rs — parallel keccak proving with --concurrency flag, shared app_pk/app_vk across threads.
    • kitchen_sink.rs — restored (previously deleted due to native-circuit deps), now loads pre-compiled ELF.
  • Deleted bins:

    • fib_e2e.rs — replaced by fibonacci.rs (use --app-only for app-only mode).
    • verify_fibair.rs — depended on openvm-native-circuit and openvm-native-recursion.
    • async_regex.rsAsyncAppProver no longer exists in the SDK.
  • Cargo.toml:

    • Added openvm-sdk-config, openvm-verify-stark-host, zstd, p3-field.
    • Removed tokio, derive_more, rand.
    • Added openvm-stark-backend/parallel and openvm-stark-backend/jemalloc to parallel/jemalloc features.
    • Removed async feature.

@github-actions

This comment was marked as outdated.

@github-actions

This comment was marked as outdated.

@github-actions

This comment was marked as outdated.

@shuklaayush shuklaayush marked this pull request as ready for review March 25, 2026 14:07
@github-actions

This comment was marked as outdated.

@github-actions

This comment was marked as outdated.

Comment thread .github/workflows/benchmarks.yml Outdated
@github-actions

This comment was marked as outdated.

@github-actions

This comment was marked as outdated.

@github-actions

This comment has been minimized.

@github-actions

This comment was marked as outdated.

@github-actions

This comment has been minimized.

@github-actions
Copy link
Copy Markdown

github-actions Bot commented Mar 25, 2026

Claude finished @shuklaayush's task in 1m 53s —— View job


Code Review: fix(v2): get benchmarks crates to compile

  • Gather context and read changed files
  • Review benchmark utility changes
  • Review execute benchmark changes
  • Review prove benchmark changes (lib.rs, binaries, Cargo.toml)
  • Review CI/workflow changes
  • Post final review

Great cleanup — net ~1000 lines removed. Removes all stale openvm-native-circuit/openvm-native-recursion dependencies, consolidates shared proving logic into a clean BenchmarkCli abstraction, and re-enables the three benchmark crates. Benchmarks are running successfully on CI. The binary pattern (parse args → load ELF via include_bytes!args.run()) is consistent and clean.

Most prior review items have been addressed (unused openvm-continuations removed, test-utils dropped, kzg-params-dir added to BenchmarkCli, pull_request trigger uncommented, apply_config extracted). Two remaining items:

Issues

1. #[command(allow_external_subcommands = true)] on BenchmarkCli is misleadingbenchmarks/prove/src/lib.rs:24

This clap attribute allows unknown positional subcommands but does not silently ignore unknown --flags. Now that BenchmarkCli has all necessary fields (kzg_params_dir, profiling, app_only, max_segment_length), this attribute serves no useful purpose. If a typo is passed (e.g., --app-onyl), it could be silently swallowed as a subcommand positional arg rather than producing a helpful error. Consider removing it. Fix this →

2. kzg_params_dir field is accepted but never usedbenchmarks/prove/src/lib.rs:38-39

BenchmarkCli accepts --kzg-params-dir (fixing the earlier bench.py mismatch), but neither run(), run_benchmark(), nor run_app_benchmark() reads this field. The commented-out halo2_outer_k/halo2_wrapper_k fields (lines 40-44) suggest this is an intentional placeholder for future e2e halo2 proving. Adding a // TODO: wire up when evm-verify is ported to v2 comment on the field itself would clarify this isn't an oversight, matching the existing TODO comments in benchmark-call.yml:141-152.

Minor Nits

3. DEFAULT_LOG_STACKED_HEIGHT relationship to DEFAULT_MAX_SEGMENTbenchmarks/prove/src/lib.rs:20-21

pub const DEFAULT_MAX_SEGMENT: u32 = 1 << 20;     // 2^20
pub const DEFAULT_LOG_STACKED_HEIGHT: usize = 21;  // 2^21

The factor-of-2 gap between these related constants isn't explained. A brief comment (e.g., "stacked height is one log-step above max segment height to account for padding") would help future readers avoid accidentally changing one without the other.

4. TranspilerConfig import in keccak_par.rs is correctbenchmarks/prove/src/bin/keccak_par.rs:7

Confirmed: .transpiler() is called on line 43, so the TranspilerConfig trait import is needed. Same for lib.rs:8. No issue here.

Looks Good

  • BenchmarkCli with apply_config() + run() is well-structured — keccak_par reuses apply_config() for custom parallel logic
  • next_power_of_two() is uniformly applied in apply_config() for both default and user-provided values
  • keccak_par.rs per-thread Sdk::new is well-documented (line 59 comment about OnceLock and Clone/Sync)
  • Removing openvm-sdk/openvm-continuations deps from benchmarks/execute is correct
  • Feature flag expansion for tco/aot in execute's Cargo.toml properly propagates to all extension crates
  • Adding parallel/jemalloc features to openvm-stark-backend in prove's features is the right fix
  • All include_bytes! binaries follow a consistent, readable pattern
  • benchmarks/utils/Cargo.toml properly slimmed after removing generate-fixtures
  • CI workflow and benchmark-config.json correctly updated to match new binary names
  • pull_request trigger in benchmarks.yml is properly uncommented (commit 1dea025)
  • bench.py and BenchmarkCli CLI args are now in sync

@github-actions

This comment has been minimized.

Copy link
Copy Markdown
Contributor

@stephenh-axiom-xyz stephenh-axiom-xyz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM with some nits

Comment thread .github/workflows/benchmark-call.yml Outdated
Comment thread .github/workflows/benchmark-call.yml Outdated
Comment thread benchmarks/prove/Cargo.toml Outdated
@github-actions
Copy link
Copy Markdown

github-actions Bot commented Mar 25, 2026

Claude finished @shuklaayush's task in 1m 31s —— View job


  • Read changed files and gather context
  • Review benchmarks/prove/src/lib.rs (major rewrite)
  • Review benchmarks/prove binaries
  • Review benchmarks/execute changes
  • Review benchmarks/utils and CI changes
  • Post final review

Code Review: fix(v2): get benchmarks crates to compile

Clean, well-structured PR. Net ~1000 lines removed, stale native-circuit dependencies properly excised, and the new BenchmarkCli abstraction is a good pattern. Benchmarks are passing on CI. A few observations beyond what the previous review covered:

Issues

1. #[command(allow_external_subcommands = true)] should be removedbenchmarks/prove/src/lib.rs:24

Agree with the previous review. Now that BenchmarkCli has all needed fields, this attribute only masks typos (e.g., --app-onyl silently becomes a positional subcommand arg). Fix this →

2. execute.rs still has #[cfg(feature = "aot")] code but the PR description says AOT paths were removedbenchmarks/execute/benches/execute.rs:1-2,78-84,286-300,322-368,372-386

The PR description says "Removed #[cfg(feature = "aot")] branches and create_aot_instance/create_metered_aot_instance functions — kept only the interpreter-based execution paths." But the actual code still has these #[cfg(feature = "aot")] blocks, create_aot_instance, and create_metered_aot_instance. Either the description is stale or this was unintentionally left in. Since the aot feature is still defined in execute/Cargo.toml, this seems intentional — just the description is misleading.

3. kzg_params_dir accepted but unusedbenchmarks/prove/src/lib.rs:37-39

Field is parsed but never read by run(), run_benchmark(), or run_app_benchmark(). The commented-out halo2_outer_k/halo2_wrapper_k fields suggest this is a placeholder. A // TODO[INT-7054] comment (matching the workflow TODOs) on the field would clarify intent.

Nits

4. DEFAULT_MAX_SEGMENT vs DEFAULT_LOG_STACKED_HEIGHT relationshipbenchmarks/prove/src/lib.rs:20-21

pub const DEFAULT_MAX_SEGMENT: u32 = 1 << 20;     // 2^20
pub const DEFAULT_LOG_STACKED_HEIGHT: usize = 21;  // log2(2^21)

The factor-of-2 gap is not explained. A comment like // stacked height = log2(max_segment) + 1 to account for trace padding would prevent these from being changed independently.

5. Cargo.toml feature propagation looks cleanbenchmarks/prove/Cargo.toml

Re: @stephenh-axiom-xyz's question about openvm-stark-backend — the direct dep is needed because lib.rs:9 imports SystemParams from it. The parallel/jemalloc features are propagated through openvm-sdk already, so the separate openvm-stark-backend feature entries in parallel/jemalloc features are redundant (but harmless). Worth confirming if openvm-sdk already enables openvm-stark-backend/parallel and openvm-stark-backend/jemalloc transitively.

6. Binary pattern is clean and consistent

All prove binaries follow the same pattern: parse args → load config from toml/builder → decode ELF via include_bytes → args.run(). keccak_par.rs appropriately breaks out of this for its custom parallel logic while reusing apply_config(). Well done.

Looks Good

  • BenchmarkCli::apply_config() with next_power_of_two() is correct
  • run_benchmark() verification via verify_vm_stark_proof_decoded and proof size metrics behind #[cfg(feature = "metrics")]
  • run_app_benchmark() with verify_app_proof for app-only mode
  • keccak_par.rs per-thread Sdk::new with shared app_pk/app_vk is well-documented
  • Deletion of generate-fixtures.rs, execute-verifier.rs, verify_fibair.rs, fib_e2e.rs, async_regex.rs — all properly removed since their deps no longer exist
  • benchmarks/utils/Cargo.toml correctly slimmed
  • CI workflows and benchmark-config.json properly updated
  • bench.py and CLI args are in sync

@github-actions
Copy link
Copy Markdown

group app.proof_time_ms app.cycles leaf.proof_time_ms
fibonacci 4,150 12,000,265 1,352
keccak 19,188 1,235,218 3,378
regex 1,594 4,136,694 519
ecrecover 650 122,348 334
pairing 1,063 1,745,757 345
kitchen_sink 3,303 154,763 723

Note: cells_used metrics omitted because CUDA tracegen does not expose unpadded trace heights.

Commit: 12cffe8

Benchmark Workflow

@shuklaayush shuklaayush merged commit 7b04d7d into develop-v2.0.0-beta Mar 25, 2026
16 checks passed
@shuklaayush shuklaayush deleted the fix/v2-bench-crate branch March 25, 2026 19:41
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants