add bootstrap histograms for RCPs to rcp_checker/visualization_scripts/rcp_viewer.py #464
Closed
matthew-frank wants to merge 4 commits into
--jackknife GBS restricts output to the single real (non-interpolated) RCP at the given global batch size, validating it against the full measured set (so pruned-out batch sizes are still accepted), and also prints the benchmark's submission_runs count.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
When --jackknife is given, resample the reference convergence runs 1000 times (drawing submission_runs values with replacement), take a trimmed mean (trim ceil(10%) from each end), and print an ASCII histogram of the resulting score distribution. Add --seed for reproducible output.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
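The resampling described in this commit message could look roughly like the sketch below. It is not the code from the PR: the function names, the bin count, the bar width, and the reading of "trim ceil(10%)" as dropping ceil(0.10 * n) values from each end of the sorted resample are all assumptions made for illustration. The 1000 resamples, the draw of submission_runs values with replacement, and the seed parameter follow the message.

```python
import random


def trimmed_mean(values, trim_percent=10):
    """Mean after dropping ceil(trim_percent% of n) values from each end.

    Assumption: this is one plausible reading of "trim ceil(10%) from each end".
    """
    ordered = sorted(values)
    n = len(ordered)
    # Integer ceiling division avoids float surprises such as 0.1 * 30 > 3.
    k = -(-n * trim_percent // 100)
    kept = ordered[k:n - k]
    if not kept:
        raise ValueError("too few values left after trimming")
    return sum(kept) / len(kept)


def bootstrap_scores(epochs, submission_runs, n_resamples=1000, seed=None):
    """Draw submission_runs values with replacement, n_resamples times,
    and return the trimmed mean of each resample."""
    rng = random.Random(seed)  # a fixed seed gives reproducible output
    return [
        trimmed_mean(rng.choices(epochs, k=submission_runs))
        for _ in range(n_resamples)
    ]


def ascii_histogram(scores, bins=10, width=40):
    """Render scores as one text line per bin: left edge, bar, count."""
    lo, hi = min(scores), max(scores)
    span = (hi - lo) or 1.0  # avoid division by zero when all scores match
    counts = [0] * bins
    for s in scores:
        idx = min(int((s - lo) / span * bins), bins - 1)
        counts[idx] += 1
    peak = max(counts)
    lines = []
    for i, c in enumerate(counts):
        left = lo + span * i / bins
        bar = "#" * round(c / peak * width)
        lines.append(f"{left:8.3f} | {bar} {c}")
    return "\n".join(lines)
```

Calling `print(ascii_histogram(bootstrap_scores(epochs, submission_runs, seed=0)))` with a hypothetical list of reference epochs-to-converge values prints the distribution; the same seed reproduces the same histogram.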
The resampling draws with replacement, which is a bootstrap, not a jackknife, so name it accurately. Rewrite the flag help to lead with its real purpose (producing the score histogram) rather than the output restriction, and increase the resample count from 1000 to 10000 for a smoother distribution.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Add two summary lines to --bootstrap output: max_speedup (RCP mean / RCP min, the largest score ratio achievable from lucky-fast convergence) and P(score < min), the measured fraction of bootstrap scores falling below the RCP min.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
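As a rough illustration of the two summary values named in this commit message, the sketch below computes them from a list of bootstrap scores. The function name and signature are hypothetical, and the RCP mean and RCP min are taken as inputs because the message does not say how the checker derives them.

```python
def bootstrap_summary(scores, rcp_mean, rcp_min):
    """Return (max_speedup, p_below_min) for a list of bootstrap scores.

    rcp_mean and rcp_min are assumed to be supplied by the caller; how
    the checker computes them is outside this sketch.
    """
    if not scores:
        raise ValueError("scores must not be empty")
    # Largest score ratio achievable from lucky-fast convergence.
    max_speedup = rcp_mean / rcp_min
    # Measured fraction of bootstrap scores strictly below the RCP min.
    p_below_min = sum(1 for s in scores if s < rcp_min) / len(scores)
    return max_speedup, p_below_min


if __name__ == "__main__":
    # Made-up numbers, for illustration only.
    speedup, p = bootstrap_summary([9.0, 10.0, 11.0, 12.0], rcp_mean=12.0, rcp_min=10.0)
    print(f"max_speedup: {speedup:.3f}")
    print(f"P(score < min): {p:.4f}")
```

The comparison is strict, so a score exactly equal to the RCP min does not count toward P(score < min), matching the "<" in the message.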
No description provided.