
[Feature] PPO + RND sota-implementation for MuJoCo #3905

Open
theap06 wants to merge 4 commits into pytorch:main from theap06:feat/rnd-ppo-sota

theap06 commented Jun 23, 2026

Contributor

Follow-up to #3889 (RND transform + loss): a sota-implementations example training PPO + Random Network Distillation on MuJoCo continuous-control tasks.

RND (Burda et al., 2018) augments the extrinsic reward with a curiosity bonus: the prediction error of a trainable predictor network against a frozen, randomly initialized target network, evaluated on the next observation. The error is larger on states the predictor has rarely seen, which drives exploration toward novel states.
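
For readers unfamiliar with the bonus, here is a minimal plain-PyTorch sketch of the idea. It is not the `RNDTransform`/`RNDLoss` code from #3889: the network sizes, the `make_mlp` and `intrinsic_reward` helpers, and the optimizer settings are all assumptions chosen for illustration.

```python
import torch
from torch import nn


def make_mlp(obs_dim: int, embed_dim: int) -> nn.Module:
    # Hypothetical small MLP; real architectures would be task-specific.
    return nn.Sequential(nn.Linear(obs_dim, 64), nn.ReLU(), nn.Linear(64, embed_dim))


torch.manual_seed(0)
obs_dim, embed_dim = 8, 16

# The target is randomly initialized and never trained.
target = make_mlp(obs_dim, embed_dim)
for p in target.parameters():
    p.requires_grad_(False)

# The predictor is trained to imitate the target on visited states.
predictor = make_mlp(obs_dim, embed_dim)
optimizer = torch.optim.Adam(predictor.parameters(), lr=1e-3)


def intrinsic_reward(next_obs: torch.Tensor) -> torch.Tensor:
    # Per-sample squared prediction error, used as the curiosity bonus.
    with torch.no_grad():
        return (predictor(next_obs) - target(next_obs)).pow(2).mean(dim=-1)


# Stand-in for observations the agent has already visited (assumption).
seen = torch.randn(256, obs_dim)
before = intrinsic_reward(seen).mean().item()

for _ in range(100):
    # Predictor loss: mean squared error against the frozen target.
    loss = (predictor(seen) - target(seen)).pow(2).mean()
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

after = intrinsic_reward(seen).mean().item()
print(f"bonus on visited states: {before:.4f} -> {after:.4f}")
```

Training the predictor on visited states shrinks their bonus, so states that remain poorly predicted keep a comparatively high intrinsic reward.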

What's added (sota-implementations/rnd/)

  • rnd_mujoco.py — PPO + RND training loop. The RNDTransform writes an intrinsic_reward at collection time; the script mixes it into the extrinsic reward before GAE, and trains the RND predictor via RNDLoss alongside each PPO update. Modeled on sota-implementations/ppo/ppo_mujoco.py (Hydra, optional torch.compile/cudagraphs).
  • utils_mujoco.py — env, PPO actor/critic, and RND target/predictor network builders.
  • config_mujoco.yaml — Hydra config (env, collector, PPO loss, RND, optim, logger).
  • README index entry.
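
The "mix before GAE" step in `rnd_mujoco.py` can be pictured with the pure-Python sketch below. It is only an illustration under stated assumptions: the `intrinsic_coeff` weight, the helper names, and the single non-terminating trajectory segment are hypothetical and not taken from the script.

```python
def mix_rewards(extrinsic, intrinsic, intrinsic_coeff=0.5):
    # Weighted sum of task reward and curiosity bonus (coefficient is an assumption).
    return [r_e + intrinsic_coeff * r_i for r_e, r_i in zip(extrinsic, intrinsic)]


def gae(rewards, values, next_value, gamma=0.99, lmbda=0.95):
    # Generalized Advantage Estimation over one segment with no terminal states.
    advantages = [0.0] * len(rewards)
    running = 0.0
    for t in reversed(range(len(rewards))):
        v_next = next_value if t == len(rewards) - 1 else values[t + 1]
        delta = rewards[t] + gamma * v_next - values[t]  # TD residual
        running = delta + gamma * lmbda * running
        advantages[t] = running
    return advantages


extrinsic = [1.0, 0.0, 0.0]
intrinsic = [0.5, 2.0, 0.0]
values = [0.2, 0.1, 0.0]

mixed = mix_rewards(extrinsic, intrinsic)
advantages = gae(mixed, values, next_value=0.0)
print(mixed, advantages)
```

Because the bonus is added before advantage estimation, the same value function and advantages account for both reward sources in this sketch.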

Notes

  • Built against the current API (uses Collector, entropy_coeff/critic_coeff); the RNDTransform/RNDLoss signatures and the shared obs_rms match the merged #3889 ([Feature] RND Implementation).
  • Validated: compiles, all torchrl imports resolve, ruff + pre-commit clean. A full training run needs gymnasium[mujoco].

🤖 Generated with Claude Code

Adds sota-implementations/rnd: a PPO agent augmented with Random Network
Distillation (Burda et al., 2018) intrinsic rewards on MuJoCo continuous
control, a follow-up to the RND transform/loss (pytorch#3889).

- rnd_mujoco.py: PPO + RND training loop (mixes the RNDTransform's intrinsic
  reward into the extrinsic reward before GAE, and trains the RND predictor
  via RNDLoss alongside the PPO update).
- utils_mujoco.py: env, PPO actor/critic, and RND target/predictor builders.
- config_mujoco.yaml: Hydra config.
- README index entry.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

pytorch-bot Bot commented Jun 23, 2026


🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/rl/3905

Note: Links to docs will display an error until the docs builds have been completed.

❗ 1 Active SEV

There is 1 currently active SEV. If your PR is affected, please view it below:

❌ 2 New Failures

As of commit 99d5e80 with merge base 1f0b769:

NEW FAILURES - The following jobs have failed:

This comment was automatically generated by Dr. CI and updates every 15 minutes.

meta-cla Bot added the CLA Signed label Jun 23, 2026
github-actions Bot added the Feature and sota-implementations/ labels and removed the Feature label Jun 23, 2026
github-actions Bot added the Feature label Jun 23, 2026

github-actions Bot commented Jun 23, 2026

Contributor

Benchmark Results: PR 99d5e802 vs main 1f0b7691

Benchmark run: https://github.com/pytorch/rl/actions/runs/28065269639

Higher ops/sec is better. Tables are sorted by largest absolute change.
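
The Change column appears consistent with a relative difference against main, and the per-device summaries count entries beyond a 5% threshold. The sketch below assumes that formula; the function names and the "within threshold" label are hypothetical. Because the displayed ops values are rounded, recomputed percentages can differ from the table in the last digit.

```python
def percent_change(main_ops: float, pr_ops: float) -> float:
    # Relative change of the PR against main, in percent (assumed formula).
    return (pr_ops - main_ops) / main_ops * 100.0


def classify(main_ops: float, pr_ops: float, threshold: float = 5.0) -> str:
    # Higher ops/sec is better, so a large negative change is a regression.
    change = percent_change(main_ops, pr_ops)
    if change < -threshold:
        return "regression"
    if change > threshold:
        return "improvement"
    return "within threshold"


# Values taken from the CPU row for test_rb_populate[...ListStorage-RandomSampler-400].
print(f"{percent_change(194.71, 36.82):+.2f}%", classify(194.71, 36.82))
```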

CPU

Compared 216 benchmarks. Regressions over 5%: 5. Improvements over 5%: 14.

Benchmark main ops PR ops Change
benchmarks/test_replaybuffer_benchmark.py::test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] 396.79 1,873 +371.95%
benchmarks/test_replaybuffer_benchmark.py::test_rb_populate[TensorDictReplayBuffer-ListStorage-RandomSampler-400] 194.71 36.82 -81.09%
benchmarks/test_objectives_benchmarks.py::test_sac_speed[False-backward] 54.47 88.43 +62.33%
benchmarks/test_replaybuffer_benchmark.py::test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] 2,796 3,565 +27.51%
benchmarks/test_replaybuffer_benchmark.py::test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] 2,510 3,146 +25.35%
benchmarks/test_replaybuffer_benchmark.py::test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] 3,108 2,505 -19.40%
benchmarks/test_replaybuffer_benchmark.py::test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-400] 758.45 895.27 +18.04%
benchmarks/test_envs_benchmark.py::test_cat_frames_functional[4-same] 25.11 28.68 +14.20%
benchmarks/test_envs_benchmark.py::test_cat_frames_functional[16-same] 23.21 20.01 -13.78%
benchmarks/test_replaybuffer_benchmark.py::test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] 2,736 3,095 +13.11%
benchmarks/test_replaybuffer_benchmark.py::test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-400] 493.36 553.33 +12.15%
benchmarks/test_replaybuffer_benchmark.py::test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-400] 458.83 509.09 +10.95%
benchmarks/test_replaybuffer_benchmark.py::test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-400] 527.08 470.40 -10.75%
benchmarks/test_replaybuffer_benchmark.py::test_rb_populate[TensorDictPrioritizedReplayBuffer-ListStorage-None-400] 51.99 56.41 +8.50%
benchmarks/test_replaybuffer_benchmark.py::test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] 1,939 2,073 +6.88%
benchmarks/test_objectives_benchmarks.py::test_td3_speed[True-backward] 264.60 281.65 +6.44%
benchmarks/test_objectives_benchmarks.py::test_dqn_speed[True-None] 1,690 1,795 +6.24%
benchmarks/test_replaybuffer_benchmark.py::test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] 2,709 2,875 +6.10%
benchmarks/test_vla_preprocessing_benchmark.py::test_openvla_preprocessing_throughput[torchvision-256-256-64] 10.72 10.15 -5.36%
benchmarks/test_envs_benchmark.py::test_step_mdp_speed[True-True-False-True-False] 36,935 38,757 +4.93%
benchmarks/test_storage_write_benchmark.py::TestStorageWriteBenchmark::test_storage_write_contiguous[100-img_shape2-large_img] 563.92 536.10 -4.93%
benchmarks/test_storage_write_benchmark.py::TestStorageWriteBenchmark::test_storage_write_contiguous[200-img_shape3-large_batch] 779.38 743.35 -4.62%
benchmarks/test_compressed_storage_benchmark.py::TestCompressedStorageBenchmark::test_tensor_to_bytestream_speed[numpy] 375,238 357,940 -4.61%
benchmarks/test_replaybuffer_benchmark.py::test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-400] 1,038 1,084 +4.37%
benchmarks/test_replaybuffer_benchmark.py::TestPrioritizedReplayBufferBenchmark::test_sampler_sample_scale[1000000-cpu] 94.85 98.62 +3.98%
benchmarks/test_vla_preprocessing_benchmark.py::test_openvla_preprocessing_throughput[pil-224-224-1] 280.40 269.32 -3.95%
benchmarks/test_replaybuffer_benchmark.py::test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] 2,712 2,818 +3.94%
benchmarks/test_non_tensor_env_benchmark.py::test_non_tensor_env_rollout_speed[1000-single-True] 1.3634 1.3101 -3.91%
benchmarks/test_objectives_benchmarks.py::test_a2c_speed[True-None] 279.44 289.90 +3.74%
benchmarks/test_envs_benchmark.py::test_step_mdp_speed[False-False-True-True-False] 28,154 29,189 +3.68%
benchmarks/test_envs_benchmark.py::test_step_mdp_speed[False-True-True-False-True] 31,860 33,029 +3.67%
benchmarks/test_envs_benchmark.py::test_step_mdp_speed[False-False-False-True-False] 26,678 27,603 +3.46%
benchmarks/test_objectives_benchmarks.py::test_gae_speed[vec_generalized_advantage_estimate-False-1-512] 2,196 2,270 +3.39%
benchmarks/test_envs_benchmark.py::test_step_mdp_speed[False-False-False-True-True] 17,771 18,366 +3.34%
benchmarks/test_envs_benchmark.py::test_parallel 0.9768 0.9441 -3.34%
benchmarks/test_replaybuffer_benchmark.py::test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] 2,795 2,887 +3.26%
benchmarks/test_objectives_benchmarks.py::test_gae_speed[vec_generalized_advantage_estimate-True-1-512] 659.83 680.63 +3.15%
benchmarks/test_objectives_benchmarks.py::test_values[td0_return_estimate-False-False] 8,012 7,761 -3.12%
benchmarks/test_replaybuffer_benchmark.py::test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-1000000-10000-100-False] 47.28 48.74 +3.10%
benchmarks/test_storage_write_benchmark.py::TestStorageWriteBenchmark::test_collector_stack_then_write[100-img_shape2-large_img] 174.31 169.00 -3.05%
benchmarks/test_envs_benchmark.py::test_step_mdp_speed[False-False-True-False-False] 48,931 50,400 +3.00%
benchmarks/test_storage_write_benchmark.py::TestStorageWriteBenchmark::test_collector_stack_then_write[200-img_shape3-large_batch] 136.22 132.15 -2.99%
benchmarks/test_envs_benchmark.py::test_transformed 0.8857 0.9106 +2.82%
benchmarks/test_compressed_storage_benchmark.py::TestCompressedStorageBenchmark::test_tensor_to_bytestream_speed[untyped_storage] 8.7637 8.5168 -2.82%
benchmarks/test_vla_preprocessing_benchmark.py::test_openvla_preprocessing_throughput[torchvision-224-224-64] 12.41 12.76 +2.78%
benchmarks/test_envs_benchmark.py::test_step_mdp_speed[True-True-False-True-True] 21,840 22,415 +2.64%
benchmarks/test_rnn_reset_backends_benchmark.py::test_rnn_rollout_with_intermediate_resets[b256-t128-i32-h512-cudnn-False-0-gru] 1.3783 1.3427 -2.58%
benchmarks/test_envs_benchmark.py::test_step_mdp_speed[False-True-False-True-False] 30,809 31,598 +2.56%
benchmarks/test_envs_benchmark.py::test_step_mdp_speed[True-True-True-False-False] 75,741 77,659 +2.53%
benchmarks/test_rnn_reset_backends_benchmark.py::test_rnn_rollout_with_intermediate_resets[b256-t128-i32-h512-scan-True-0-lstm] 3.1354 3.0561 -2.53%
benchmarks/test_vla_preprocessing_benchmark.py::test_openvla_preprocessing_throughput[torchvision-256-256-16] 42.69 43.76 +2.51%
benchmarks/test_envs_benchmark.py::test_step_mdp_speed[True-False-False-True-False] 31,079 31,843 +2.46%
benchmarks/test_objectives_benchmarks.py::test_redq_deprec_speed[True-None] 278.93 272.08 -2.45%
benchmarks/test_replaybuffer_benchmark.py::test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-400] 1,038 1,063 +2.43%
benchmarks/test_envs_benchmark.py::test_step_mdp_speed[True-False-False-False-False] 53,603 54,886 +2.39%
benchmarks/test_objectives_benchmarks.py::test_iql_speed[False-None] 48.12 49.28 +2.39%
benchmarks/test_non_tensor_env_benchmark.py::test_non_tensor_env_rollout_speed[1000-parallel-buffers-False] 0.5915 0.6055 +2.37%
benchmarks/test_envs_benchmark.py::test_step_mdp_speed[False-True-True-True-True] 20,057 20,530 +2.36%
benchmarks/test_storage_write_benchmark.py::TestStorageWriteBenchmark::test_collector_lazystack_then_write[100-img_shape2-large_img] 383.92 392.79 +2.31%
benchmarks/test_envs_benchmark.py::test_step_mdp_speed[True-True-False-False-False] 63,133 64,587 +2.30%
benchmarks/test_objectives_benchmarks.py::test_cql_speed[False-None] 38.38 37.50 -2.29%
benchmarks/test_envs_benchmark.py::test_step_mdp_speed[True-False-True-False-False] 62,037 63,456 +2.29%
benchmarks/test_envs_benchmark.py::test_step_mdp_speed[False-True-True-True-False] 33,411 34,146 +2.20%
benchmarks/test_envs_benchmark.py::test_step_mdp_speed[True-True-True-False-True] 40,928 41,821 +2.18%
benchmarks/test_objectives_benchmarks.py::test_dqn_speed[False-backward] 509.42 520.47 +2.17%
benchmarks/test_objectives_benchmarks.py::test_ddpg_speed[True-backward] 407.86 416.61 +2.15%
benchmarks/test_envs_benchmark.py::test_step_mdp_speed[False-True-False-False-True] 29,666 30,295 +2.12%
benchmarks/test_storage_write_benchmark.py::TestStorageWriteBenchmark::test_collector_lazystack_then_write[100-img_shape1-atari] 638.39 624.89 -2.12%
benchmarks/test_replaybuffer_benchmark.py::test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-100000-10000-100-False] 51.96 53.06 +2.11%
benchmarks/test_replaybuffer_benchmark.py::test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-10000-10000-100-False] 52.61 53.72 +2.10%
benchmarks/test_compressed_storage_benchmark.py::TestCompressedStorageBenchmark::test_tensor_to_bytestream_speed[safetensors] 23,951 23,459 -2.05%
benchmarks/test_objectives_benchmarks.py::test_dqn_speed[reduce-overhead-None] 1,805 1,842 +2.02%
benchmarks/test_storage_write_benchmark.py::TestStorageWriteBenchmark::test_storage_write_contiguous[50-img_shape0-small] 7,160 7,304 +2.01%
benchmarks/test_objectives_benchmarks.py::test_cql_speed[True-backward] 58.98 57.81 -1.98%
benchmarks/test_envs_benchmark.py::test_step_mdp_speed[True-False-True-False-True] 37,604 36,868 -1.96%
benchmarks/test_objectives_benchmarks.py::test_redq_speed[False-backward] 55.11 54.06 -1.92%
benchmarks/test_vla_preprocessing_benchmark.py::test_openvla_preprocessing_throughput[torchvision-256-256-4] 164.88 161.73 -1.91%
benchmarks/test_rnn_reset_backends_benchmark.py::test_rnn_rollout_with_intermediate_resets[b256-t128-i32-h512-cudnn-False-0-lstm] 0.8729 0.8567 -1.86%
benchmarks/test_replaybuffer_benchmark.py::test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-sampler7-10000] 745.74 732.18 -1.82%
benchmarks/test_envs_benchmark.py::test_step_mdp_speed[False-True-False-False-False] 49,325 50,212 +1.80%
benchmarks/test_envs_benchmark.py::test_step_mdp_speed[False-False-True-False-True] 30,036 30,574 +1.79%
benchmarks/test_objectives_benchmarks.py::test_dqn_speed[True-backward] 962.91 979.52 +1.72%
benchmarks/test_objectives_benchmarks.py::test_cql_speed[False-backward] 27.45 27.92 +1.71%
benchmarks/test_objectives_benchmarks.py::test_ppo_speed[True-backward] 114.83 112.87 -1.71%
benchmarks/test_objectives_benchmarks.py::test_ppo_speed[True-None] 258.66 263.03 +1.69%
benchmarks/test_envs_benchmark.py::test_serial 0.5746 0.5841 +1.66%
benchmarks/test_replaybuffer_benchmark.py::test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-1000000-10000-100-True] 23.19 23.57 +1.63%
benchmarks/test_objectives_benchmarks.py::test_ddpg_speed[reduce-overhead-None] 702.05 713.42 +1.62%
benchmarks/test_storage_write_benchmark.py::TestStorageWriteBenchmark::test_storage_write_lazystack[200-img_shape3-large_batch] 330.73 325.49 -1.58%
benchmarks/test_replaybuffer_benchmark.py::test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-10000-10000-100-True] 25.73 25.32 -1.58%
benchmarks/test_objectives_benchmarks.py::test_dqn_speed[False-None] 693.90 704.55 +1.53%
benchmarks/test_objectives_benchmarks.py::test_values[generalized_advantage_estimate-True-True] 95.63 97.08 +1.52%
benchmarks/test_vla_preprocessing_benchmark.py::test_openvla_preprocessing_throughput[torchvision-224-224-16] 49.09 49.83 +1.51%
benchmarks/test_storage_write_benchmark.py::TestStorageWriteBenchmark::test_storage_write_lazystack[100-img_shape1-atari] 687.86 677.48 -1.51%
benchmarks/test_objectives_benchmarks.py::test_redq_deprec_speed[True-backward] 138.36 136.29 -1.49%
benchmarks/test_collectors_benchmark.py::test_sync 16.77 16.52 -1.49%
benchmarks/test_replaybuffer_benchmark.py::test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-sampler6-10000] 703.34 692.92 -1.48%
benchmarks/test_objectives_benchmarks.py::test_td3_speed[True-None] 546.70 554.66 +1.46%
benchmarks/test_envs_benchmark.py::test_step_mdp_speed[True-False-False-True-True] 19,505 19,223 -1.45%
benchmarks/test_rnn_reset_backends_benchmark.py::test_rnn_rollout_with_intermediate_resets[b256-t128-i32-h512-cudnn-True-0-gru] 1.4504 1.4305 -1.37%
benchmarks/test_envs_benchmark.py::test_step_mdp_speed[False-False-False-False-False] 44,143 44,746 +1.37%
benchmarks/test_replaybuffer_benchmark.py::test_rb_populate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-400] 194.65 197.25 +1.34%
benchmarks/test_objectives_benchmarks.py::test_gae_speed[generalized_advantage_estimate-False-1-512] 108.99 107.54 -1.33%
benchmarks/test_storage_write_benchmark.py::TestStorageWriteBenchmark::test_collector_lazystack_then_write[200-img_shape3-large_batch] 304.61 308.64 +1.32%
benchmarks/test_storage_write_benchmark.py::TestCollectorIntegrationBenchmark::test_collector_with_rb[100-img_shape0-atari] 25.97 26.30 +1.30%
benchmarks/test_replaybuffer_benchmark.py::test_rb_sample[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] 165.32 163.22 -1.27%
benchmarks/test_objectives_benchmarks.py::test_sac_speed[True-None] 475.35 469.46 -1.24%
benchmarks/test_objectives_benchmarks.py::test_reinforce_speed[True-None] 331.07 335.10 +1.22%
benchmarks/test_objectives_benchmarks.py::test_redq_speed[reduce-overhead-None] 228.02 225.25 -1.22%
benchmarks/test_replaybuffer_benchmark.py::test_rb_iterate[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] 168.50 166.46 -1.21%
benchmarks/test_objectives_benchmarks.py::test_reinforce_speed[True-backward] 126.17 124.65 -1.20%
benchmarks/test_objectives_benchmarks.py::test_redq_deprec_speed[reduce-overhead-None] 282.41 279.10 -1.17%
benchmarks/test_rnn_reset_backends_benchmark.py::test_rnn_rollout_with_intermediate_resets[b256-t128-i32-h512-scan-False-0-lstm] 2.0172 1.9935 -1.17%
benchmarks/test_vla_preprocessing_benchmark.py::test_openvla_preprocessing_throughput[pil-480-640-16] 4.9317 4.8751 -1.15%
benchmarks/test_objectives_benchmarks.py::test_iql_speed[reduce-overhead-None] 116.56 115.23 -1.14%
benchmarks/test_envs_benchmark.py::test_step_mdp_speed[False-False-False-False-True] 28,098 28,417 +1.13%
benchmarks/test_vla_preprocessing_benchmark.py::test_openvla_preprocessing_throughput[torchvision-224-224-1] 628.01 635.08 +1.13%
benchmarks/test_envs_benchmark.py::test_step_mdp_speed[True-True-True-True-False] 41,880 42,350 +1.12%
benchmarks/test_rnn_reset_backends_benchmark.py::test_rnn_rollout_with_intermediate_resets[b256-t128-i32-h512-scan-False-0-gru] 3.0282 2.9952 -1.09%
benchmarks/test_storage_write_benchmark.py::TestCollectorIntegrationBenchmark::test_collector_with_rb[200-img_shape1-large_batch] 13.24 13.38 +1.06%
Showing 120 of 216 comparisons, sorted by absolute change.

GPU

Compared 226 benchmarks. Regressions over 5%: 11. Improvements over 5%: 18.

Benchmark main ops PR ops Change
benchmarks/test_replaybuffer_benchmark.py::test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-400] 822.70 64.97 -92.10%
benchmarks/test_replaybuffer_benchmark.py::test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-100000-10000-100-False] 28.79 54.38 +88.84%
benchmarks/test_replaybuffer_benchmark.py::test_rb_populate[TensorDictReplayBuffer-ListStorage-RandomSampler-400] 186.62 38.92 -79.15%
benchmarks/test_replaybuffer_benchmark.py::test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] 2,449 3,651 +49.12%
benchmarks/test_replaybuffer_benchmark.py::test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] 2,345 3,171 +35.25%
benchmarks/test_replaybuffer_benchmark.py::test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] 1,885 2,311 +22.61%
benchmarks/test_objectives_benchmarks.py::test_redq_deprec_speed[reduce-overhead-None] 105.45 84.62 -19.76%
benchmarks/test_replaybuffer_benchmark.py::test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-400] 860.91 1,008 +17.13%
benchmarks/test_replaybuffer_benchmark.py::test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] 3,076 3,599 +17.00%
benchmarks/test_replaybuffer_benchmark.py::test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] 3,165 2,670 -15.62%
benchmarks/test_replaybuffer_benchmark.py::test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] 2,874 2,496 -13.14%
benchmarks/test_replaybuffer_benchmark.py::test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] 2,967 2,599 -12.39%
benchmarks/test_replaybuffer_benchmark.py::test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] 2,946 2,613 -11.29%
benchmarks/test_replaybuffer_benchmark.py::test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-400] 432.72 473.23 +9.36%
benchmarks/test_collectors_benchmark.py::test_sync_pixels 9.2737 10.14 +9.30%
benchmarks/test_replaybuffer_benchmark.py::test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] 2,180 1,995 -8.50%
benchmarks/test_objectives_benchmarks.py::test_redq_deprec_speed[True-backward] 257.28 278.97 +8.43%
benchmarks/test_objectives_benchmarks.py::test_values[vec_generalized_advantage_estimate-True-True] 285.80 309.49 +8.29%
benchmarks/test_replaybuffer_benchmark.py::test_rb_populate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-400] 182.98 197.66 +8.02%
benchmarks/test_replaybuffer_benchmark.py::test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] 3,022 3,249 +7.53%
benchmarks/test_objectives_benchmarks.py::test_td3_speed[True-backward] 377.75 405.74 +7.41%
benchmarks/test_replaybuffer_benchmark.py::test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-sampler6-10000] 734.30 686.17 -6.55%
benchmarks/test_replaybuffer_benchmark.py::test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] 2,037 1,908 -6.32%
benchmarks/test_storage_write_benchmark.py::TestStorageWriteBenchmark::test_collector_stack_then_write[200-img_shape3-large_batch] 142.55 133.65 -6.24%
benchmarks/test_compressed_storage_benchmark.py::TestCompressedStorageBenchmark::test_tensor_to_bytestream_speed[safetensors] 22,634 24,017 +6.11%
benchmarks/test_objectives_benchmarks.py::test_sac_speed[True-backward] 320.08 338.72 +5.82%
benchmarks/test_replaybuffer_benchmark.py::test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-1000000-10000-100-False] 48.24 50.77 +5.26%
benchmarks/test_replaybuffer_benchmark.py::test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-1000000-10000-100-True] 21.42 22.52 +5.15%
benchmarks/test_objectives_benchmarks.py::test_dqn_speed[reduce-overhead-None] 1,854 1,949 +5.13%
benchmarks/test_replaybuffer_benchmark.py::TestPrioritizedReplayBufferBenchmark::test_sample_mixed_devices[1000000-cuda_storage_cuda_samp... 1,470 1,542 +4.92%
benchmarks/test_objectives_benchmarks.py::test_gae_speed[generalized_advantage_estimate-False-1-512] 49.94 47.54 -4.81%
benchmarks/test_vla_preprocessing_benchmark.py::test_openvla_preprocessing_throughput[pil-480-640-1] 74.70 78.23 +4.73%
benchmarks/test_objectives_benchmarks.py::test_reinforce_speed[False-backward] 275.36 288.28 +4.69%
benchmarks/test_non_tensor_env_benchmark.py::test_non_tensor_env_rollout_speed[1000-single-True] 1.3101 1.3684 +4.45%
benchmarks/test_replaybuffer_benchmark.py::TestPrioritizedReplayBufferBenchmark::test_sample_mixed_devices[1000000-memmap_cpu_storage_cud... 967.79 1,009 +4.30%
benchmarks/test_envs_benchmark.py::test_step_mdp_speed[False-False-True-False-True] 30,095 28,831 -4.20%
benchmarks/test_non_tensor_env_benchmark.py::test_non_tensor_env_rollout_speed[1000-parallel-buffers-True] 0.5118 0.5331 +4.15%
benchmarks/test_objectives_benchmarks.py::test_gae_speed[vec_generalized_advantage_estimate-False-1-512] 1,386 1,331 -4.01%
benchmarks/test_replaybuffer_benchmark.py::TestPrioritizedReplayBufferBenchmark::test_sampler_sample_scale[1000000-cuda] 2,232 2,321 +3.97%
benchmarks/test_envs_benchmark.py::test_step_mdp_speed[False-False-False-True-True] 18,139 17,420 -3.96%
benchmarks/test_storage_write_benchmark.py::TestStorageWriteBenchmark::test_collector_stack_then_write[100-img_shape2-large_img] 177.76 170.74 -3.95%
benchmarks/test_replaybuffer_benchmark.py::test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-10000-10000-100-False] 53.60 55.67 +3.87%
benchmarks/test_vla_preprocessing_benchmark.py::test_openvla_preprocessing_throughput[pil-256-256-4] 46.64 48.43 +3.83%
benchmarks/test_storage_write_benchmark.py::TestStorageWriteBenchmark::test_storage_write_lazystack[100-img_shape2-large_img] 411.73 427.30 +3.78%
benchmarks/test_objectives_benchmarks.py::test_a2c_speed[True-None] 720.30 747.46 +3.77%
benchmarks/test_objectives_benchmarks.py::test_td3_speed[False-backward] 83.75 86.67 +3.49%
benchmarks/test_replaybuffer_benchmark.py::test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-100000-10000-100-True] 22.70 23.47 +3.37%
benchmarks/test_objectives_benchmarks.py::test_values[td_lambda_return_estimate-True-False] 12.56 12.13 -3.36%
benchmarks/test_replaybuffer_benchmark.py::test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] 1,819 1,880 +3.34%
benchmarks/test_objectives_benchmarks.py::test_iql_speed[True-backward] 244.54 252.68 +3.33%
benchmarks/test_objectives_benchmarks.py::test_cql_speed[False-None] 53.44 55.21 +3.31%
benchmarks/test_objectives_benchmarks.py::test_reinforce_speed[True-backward] 365.04 377.09 +3.30%
benchmarks/test_envs_benchmark.py::test_step_mdp_speed[True-False-False-True-False] 31,624 30,584 -3.29%
benchmarks/test_objectives_benchmarks.py::test_iql_speed[False-backward] 70.23 72.46 +3.18%
benchmarks/test_compressed_storage_benchmark.py::TestCompressedStorageBenchmark::test_tensor_to_bytestream_speed[pickle] 11,821 12,196 +3.17%
benchmarks/test_storage_write_benchmark.py::TestStorageWriteBenchmark::test_collector_lazystack_then_write[100-img_shape1-atari] 650.80 670.79 +3.07%
benchmarks/test_objectives_benchmarks.py::test_td3_speed[False-None] 112.72 116.08 +2.99%
benchmarks/test_storage_write_benchmark.py::TestCollectorIntegrationBenchmark::test_collector_with_rb_cuda[200-img_shape1-large_batch] 8.5402 8.2862 -2.97%
benchmarks/test_compressed_storage_benchmark.py::TestCompressedStorageBenchmark::test_tensor_to_bytestream_speed[torch.save] 7,049 7,247 +2.81%
benchmarks/test_envs_benchmark.py::test_step_mdp_speed[False-False-False-False-False] 44,601 43,383 -2.73%
benchmarks/test_replaybuffer_benchmark.py::TestPrioritizedReplayBufferBenchmark::test_sample_mixed_devices[1000000-cuda_storage_cpu_sampler] 87.46 89.84 +2.72%
benchmarks/test_envs_benchmark.py::test_step_mdp_speed[False-False-True-True-False] 28,839 28,065 -2.68%
benchmarks/test_storage_write_benchmark.py::TestStorageWriteBenchmark::test_collector_lazystack_then_write[100-img_shape2-large_img] 397.32 407.96 +2.68%
benchmarks/test_rnn_reset_backends_benchmark.py::test_rnn_rollout_with_intermediate_resets[b256-t128-i32-h512-scan-False-0-lstm] 21.31 21.88 +2.67%
benchmarks/test_objectives_benchmarks.py::test_ddpg_speed[False-backward] 234.95 241.18 +2.65%
benchmarks/test_storage_write_benchmark.py::TestStorageWriteBenchmark::test_storage_write_contiguous[200-img_shape3-large_batch] 747.39 727.81 -2.62%
benchmarks/test_compressed_storage_benchmark.py::TestCompressedStorageBenchmark::test_tensor_to_bytestream_speed[numpy] 363,286 372,466 +2.53%
benchmarks/test_vla_preprocessing_benchmark.py::test_openvla_preprocessing_throughput[torchvision-480-640-1] 472.58 484.44 +2.51%
benchmarks/test_envs_benchmark.py::test_step_mdp_speed[False-False-False-False-True] 28,031 27,334 -2.49%
benchmarks/test_objectives_benchmarks.py::test_iql_speed[True-None] 508.66 520.82 +2.39%
benchmarks/test_envs_benchmark.py::test_serial 0.4252 0.4353 +2.37%
benchmarks/test_objectives_benchmarks.py::test_ddpg_speed[reduce-overhead-None] 825.63 844.83 +2.33%
benchmarks/test_rnn_reset_backends_benchmark.py::test_rnn_rollout_with_intermediate_resets[b256-t128-i32-h512-scan-False-0-gru] 22.72 23.25 +2.31%
benchmarks/test_replaybuffer_benchmark.py::test_rb_iterate[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] 162.65 166.27 +2.23%
benchmarks/test_collectors_benchmark.py::test_sync_preempt 10.26 10.48 +2.15%
benchmarks/test_vla_preprocessing_benchmark.py::test_openvla_preprocessing_throughput[torchvision-224-224-16] 49.22 48.18 -2.11%
benchmarks/test_replaybuffer_benchmark.py::test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-10000-10000-100-True] 23.62 24.11 +2.11%
benchmarks/test_vla_preprocessing_benchmark.py::test_openvla_preprocessing_throughput[pil-256-256-1] 188.46 192.36 +2.07%
benchmarks/test_envs_benchmark.py::test_step_mdp_speed[True-False-True-False-False] 64,055 62,754 -2.03%
benchmarks/test_replaybuffer_benchmark.py::test_rb_sample[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] 159.72 162.95 +2.02%
benchmarks/test_collectors_benchmark.py::test_async 10.85 11.07 +2.01%
benchmarks/test_objectives_benchmarks.py::test_cql_speed[False-backward] 41.01 41.82 +1.98%
benchmarks/test_envs_benchmark.py::test_step_mdp_speed[False-True-True-False-False] 56,330 55,218 -1.97%
benchmarks/test_vla_preprocessing_benchmark.py::test_openvla_preprocessing_throughput[pil-224-224-4] 71.86 70.45 -1.96%
benchmarks/test_objectives_benchmarks.py::test_sac_speed[reduce-overhead-None] 105.62 103.60 -1.91%
benchmarks/test_objectives_benchmarks.py::test_cql_speed[reduce-overhead-None] 87.96 89.59 +1.86%
benchmarks/test_objectives_benchmarks.py::test_reinforce_speed[True-None] 778.73 764.53 -1.82%
benchmarks/test_envs_benchmark.py::test_step_mdp_speed[True-False-False-True-True] 19,749 19,389 -1.82%
benchmarks/test_storage_write_benchmark.py::TestStorageWriteBenchmark::test_collector_lazystack_then_write[50-img_shape0-small] 3,569 3,507 -1.74%
benchmarks/test_objectives_benchmarks.py::test_reinforce_speed[False-None] 401.01 394.03 -1.74%
benchmarks/test_replaybuffer_benchmark.py::test_rb_sample[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] 162.64 165.44 +1.72%
benchmarks/test_objectives_benchmarks.py::test_ddpg_speed[True-backward] 451.29 458.61 +1.62%
benchmarks/test_objectives_benchmarks.py::test_td3_speed[reduce-overhead-None] 43.06 43.75 +1.61%
benchmarks/test_objectives_benchmarks.py::test_dqn_speed[True-backward] 899.12 884.68 -1.61%
benchmarks/test_envs_benchmark.py::test_step_mdp_speed[False-True-False-True-True] 19,712 19,396 -1.60%
benchmarks/test_objectives_benchmarks.py::test_a2c_speed[reduce-overhead-None] 859.72 873.42 +1.59%
benchmarks/test_envs_benchmark.py::test_step_mdp_speed[False-True-True-False-True] 32,722 32,208 -1.57%
benchmarks/test_objectives_benchmarks.py::test_redq_deprec_speed[False-backward] 72.34 73.47 +1.56%
benchmarks/test_non_tensor_env_benchmark.py::test_non_tensor_env_rollout_speed[1000-serial-buffers-True] 0.5148 0.5227 +1.54%
benchmarks/test_vla_preprocessing_benchmark.py::test_openvla_preprocessing_throughput[torchvision-256-256-1] 520.51 512.53 -1.53%
benchmarks/test_objectives_benchmarks.py::test_ppo_speed[reduce-overhead-None] 804.94 817.22 +1.53%
benchmarks/test_objectives_benchmarks.py::test_values[td1_return_estimate-False-False] 21.27 20.95 -1.50%
benchmarks/test_envs_benchmark.py::test_step_mdp_speed[True-False-False-False-True] 34,250 33,747 -1.47%
benchmarks/test_envs_benchmark.py::test_step_mdp_speed[True-False-False-False-False] 52,184 52,933 +1.44%
benchmarks/test_envs_benchmark.py::test_step_mdp_speed[False-True-False-False-False] 48,683 49,381 +1.43%
benchmarks/test_objectives_benchmarks.py::test_ppo_speed[False-backward] 133.72 135.64 +1.43%
benchmarks/test_objectives_benchmarks.py::test_ddpg_speed[True-None] 828.31 840.14 +1.43%
benchmarks/test_replaybuffer_benchmark.py::test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-400] 458.20 464.73 +1.43%
benchmarks/test_objectives_benchmarks.py::test_values[generalized_advantage_estimate-True-True] 50.42 49.71 -1.41%
benchmarks/test_objectives_benchmarks.py::test_values[td0_return_estimate-False-False] 12,025 12,193 +1.40%
benchmarks/test_objectives_benchmarks.py::test_dqn_speed[False-None] 644.67 653.54 +1.38%
benchmarks/test_non_tensor_env_benchmark.py::test_non_tensor_env_rollout_speed[1000-parallel-no-buffers-True] 0.2128 0.2157 +1.36%
benchmarks/test_vla_preprocessing_benchmark.py::test_openvla_preprocessing_throughput[torchvision-256-256-64] 10.65 10.80 +1.36%
benchmarks/test_envs_benchmark.py::test_step_mdp_speed[True-True-True-False-False] 75,915 74,883 -1.36%
benchmarks/test_storage_write_benchmark.py::TestCollectorIntegrationBenchmark::test_collector_with_rb[100-img_shape0-atari] 26.65 26.29 -1.35%
benchmarks/test_objectives_benchmarks.py::test_cql_speed[True-backward] 227.21 230.28 +1.35%
benchmarks/test_objectives_benchmarks.py::test_sac_speed[True-None] 618.72 626.80 +1.31%
benchmarks/test_storage_write_benchmark.py::TestCollectorIntegrationBenchmark::test_collector_with_rb_cuda[100-img_shape0-atari] 16.91 16.69 -1.30%
benchmarks/test_replaybuffer_benchmark.py::test_rb_iterate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] 167.21 169.36 +1.29%
benchmarks/test_objectives_benchmarks.py::test_ddpg_speed[False-None] 345.95 350.34 +1.27%
Showing 120 of 226 comparisons, sorted by absolute change.

Labels

CLA Signed, Feature, sota-implementations/


2 participants