Add Remote GPU Rental Training rule (rented/spot instance ops)#318
Add Remote GPU Rental Training rule (rented/spot instance ops)#318Hanyuyuan6 wants to merge 2 commits into
Conversation
|
No actionable comments were generated in the recent review. 🎉 ℹ️ Recent review info⚙️ Run configurationConfiguration used: defaults Review profile: CHILL Plan: Pro Run ID: 📒 Files selected for processing (1)
✅ Files skipped from review due to trivial changes (1)
📝 WalkthroughWalkthroughAdds a new ChangesRemote GPU Rental Training Rule
Estimated code review effort🎯 1 (Trivial) | ⏱️ ~3 minutes Poem
🚥 Pre-merge checks | ✅ 5✅ Passed checks (5 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
🧹 Nitpick comments (1)
README.md (1)
261-261: 💤 Low valueMinor: Refine README description wording for clarity.
The README description says "spot-preemption resumable checkpointing," but this awkwardly combines two separate ideas. The rule file correctly lists them as "spot-preemption resilience, and resumable checkpointing." Consider updating the README to match:
"Run long GPU jobs on rented/remote instances (AutoDL, RunPod, vast.ai, Lambda, Slurm) with billing-safe teardown, spot-preemption resilience, resumable checkpointing, and disk/inode budgeting."
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@README.md` at line 261, The README description for the Remote GPU Rental Training link awkwardly combines "spot-preemption resumable checkpointing" as a single concept. Update this phrase to separate the two distinct ideas by changing it to "spot-preemption resilience, resumable checkpointing" so it matches the correct wording in the actual rule file and improves clarity.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Nitpick comments:
In `@README.md`:
- Line 261: The README description for the Remote GPU Rental Training link
awkwardly combines "spot-preemption resumable checkpointing" as a single
concept. Update this phrase to separate the two distinct ideas by changing it to
"spot-preemption resilience, resumable checkpointing" so it matches the correct
wording in the actual rule file and improves clarity.
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: eb0353a8-b709-4855-a6b6-732efb9ce01c
📒 Files selected for processing (2)
README.mdrules/remote-gpu-rental-training.mdc
…heckpointing (review nitpick)
Adds
rules/remote-gpu-rental-training.mdcunder Build Tools and Development.The rule gives Cursor operating discipline for running long GPU jobs on rented/remote instances you don't own (AutoDL, RunPod, vast.ai, Lambda, Paperspace, Chinese platforms, bare SSH / Slurm / K8s) — the metered-tenant concerns the existing CUDA/Kubernetes/containerization rules don't cover: stop-vs-terminate billing safety, spot-preemption resilience, checkpoint-to-durable + idempotent resume, disk/inode budgeting, and a teardown gate that blocks
terminate/destroyuntil results are pulled and verified.It is a condensed, self-contained distillation of the open-source
remote-gpu-trainerAgent Skill (MIT, https://github.com/Hanyuyuan6/remote-gpu-trainer), authored by me — not a copy of the source docs. Frontmatter follows the README ## Contributing format (description/globs/alwaysApply: false); README entry added alphabetically in the correct category.Summary by CodeRabbit
Documentation