Adds a --bootstrap histogram to rcp_checker/visualization_scripts/rcp_viewer.py by matthew-frank · Pull Request #465 · mlcommons/logging

matthew-frank · 2026-05-27T18:23:39Z

No description provided.

--jackknife GBS restricts output to the single real (non-interpolated) RCP at the given global batch size, validating it against the full measured set (so pruned-out batch sizes are still accepted), and also prints the benchmark's submission_runs count. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

When --jackknife is given, resample the reference convergence runs 1000 times (drawing submission_runs values with replacement), take a trimmed mean (trim ceil(10%) from each end), and print an ASCII histogram of the resulting score distribution. Add --seed for reproducible output. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
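The resampling described in this commit message can be sketched as follows. This is a minimal illustration, not the code in rcp_viewer.py: the function names, the `reference_epochs` sample data, the bin count, and the reading of "trim ceil(10%)" as ceil(10% of the sample size) values dropped from each end are all assumptions made for the example.

```python
import random


def trimmed_mean(values, trim_percent=10):
    """Drop ceil(trim_percent% of n) values from each end, then average the rest."""
    ordered = sorted(values)
    n = len(ordered)
    # Integer ceiling of n * trim_percent / 100 (avoids float rounding surprises).
    k = (n * trim_percent + 99) // 100
    kept = ordered[k:n - k]
    if not kept:
        raise ValueError("trimming removed every sample")
    return sum(kept) / len(kept)


def resample_scores(epochs, submission_runs, n_resamples=1000, seed=None):
    """Draw submission_runs values with replacement, n_resamples times."""
    rng = random.Random(seed)  # a fixed seed gives reproducible output
    return [
        trimmed_mean(rng.choices(epochs, k=submission_runs))
        for _ in range(n_resamples)
    ]


def ascii_histogram(scores, bins=10, width=40):
    """Render one text line per bin, with bar length scaled to the fullest bin."""
    lo, hi = min(scores), max(scores)
    span = (hi - lo) or 1.0  # guard against all scores being identical
    counts = [0] * bins
    for s in scores:
        idx = min(int((s - lo) / span * bins), bins - 1)
        counts[idx] += 1
    peak = max(counts)
    lines = []
    for i, c in enumerate(counts):
        left_edge = lo + span * i / bins
        bar = "#" * round(width * c / peak)
        lines.append(f"{left_edge:10.3f} | {bar} {c}")
    return "\n".join(lines)


# Hypothetical epochs-to-converge values for one reference batch size.
reference_epochs = [38, 40, 40, 41, 42, 42, 43, 44, 45, 47, 48, 50]
scores = resample_scores(reference_epochs, submission_runs=10, seed=0)
print(ascii_histogram(scores))
```

With `submission_runs=10` the trim drops one value from each end of every resample, so each score is the mean of the middle eight draws.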

The resampling draws with replacement, which is a bootstrap, not a jackknife, so name it accurately. Rewrite the flag help to lead with its real purpose (producing the score histogram) rather than the output restriction, and increase the resample count from 1000 to 10000 for a smoother distribution. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

Add two summary lines to --bootstrap output: max_speedup (RCP mean / RCP min, the largest score ratio achievable from lucky-fast convergence) and P(score < min), the measured fraction of bootstrap scores falling below the RCP min. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
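As a rough sketch of the two summary values, assuming the bootstrap scores are already available as a list and that the RCP mean and RCP min are supplied by the caller (how rcp_viewer.py obtains them is not shown in this PR text, and the function name and sample numbers below are hypothetical):

```python
def bootstrap_summary(scores, rcp_mean, rcp_min):
    """Return (max_speedup, fraction of scores strictly below rcp_min)."""
    if rcp_min <= 0:
        raise ValueError("rcp_min must be positive")
    if not scores:
        raise ValueError("scores must not be empty")
    # Largest score ratio achievable from lucky-fast convergence.
    max_speedup = rcp_mean / rcp_min
    # Measured fraction of bootstrap scores that fall below the RCP min.
    p_below_min = sum(1 for s in scores if s < rcp_min) / len(scores)
    return max_speedup, p_below_min


# Hypothetical inputs, chosen only to make the arithmetic easy to follow.
example_scores = [7.5, 8.0, 9.0, 10.0]
speedup, p_below = bootstrap_summary(example_scores, rcp_mean=10.0, rcp_min=8.0)
print(f"max_speedup   = {speedup:.4f}")
print(f"P(score < min) = {p_below:.4f}")
```

The comparison is strict, so a score exactly equal to the RCP min is not counted as falling below it.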

github-actions · 2026-05-27T18:23:52Z

MLCommons CLA bot: All contributors have signed the MLCommons CLA ✍️ ✅

ShriyaRishab

This bootstrapping helps us understand the spread of submission scores implied by the RCPs by sampling from the RCP data, rather than running an unbounded number of submissions. It does not account for the trimmed Olympic mean or for the fact that our data is not continuous (we evaluate only at eval intervals, not at every step, which makes the data discrete).

That said, this visualization can help in understanding the t-test better, so it is approved.

matthew-frank and others added 4 commits May 27, 2026 11:38

matthew-frank requested review from a team as code owners May 27, 2026 18:23

ShriyaRishab approved these changes May 27, 2026

ShriyaRishab merged commit 56e5bcd into mlcommons:master May 27, 2026
2 checks passed

github-actions Bot locked and limited conversation to collaborators May 27, 2026

ShriyaRishab merged 4 commits into mlcommons:master from matthew-frank:matthew-frank/rcp-jackknife
