Add cross-platform device support (Apple Silicon / MPS, CPU) #23
Open
MerrittLegacy wants to merge 1 commit into
Make the core pipeline run on Apple Silicon (MPS) and CPU in addition to CUDA, without regressing CUDA. Previously every torch.cuda.* call ran unconditionally and the device was effectively hardcoded, so the package failed to import/run on a Mac.

- wrapper.py: resolve the requested device against availability (cuda -> mps -> cpu); a config that still says "cuda" transparently uses MPS on a Mac. Guard cuda.empty_cache/synchronize/ipc_collect and memory queries behind is_available() with MPS/CPU branches. Restore self.use_denoising_batch to honor the parameter (forcing it False broke img2img with "iteration over a 0-d tensor"). Skip xformers on Mac.
- pipeline.py: fall back to time.perf_counter() when CUDA events are unavailable (MPS/CPU) instead of crashing on torch.cuda.Event.
- preprocessing/base_orchestrator.py: guard cuda.synchronize() in cleanup.
- setup.py: allow install on Mac (MPS/CPU torch); drop cuda-python and TensorRT extras on darwin; relax the CUDA-only hard requirement.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
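For readers who want a concrete picture of the cuda -> mps -> cpu resolution described for wrapper.py, here is a minimal sketch. It is not the code from this commit: `resolve_device` and its boolean parameters are hypothetical names chosen for illustration, and the rule that a request for an unavailable "mps" falls back to "cpu" is an assumption.

```python
def resolve_device(requested: str, cuda_available: bool, mps_available: bool) -> str:
    """Hypothetical helper: return the requested device if usable, else the next best one."""
    # Assumed preference order, taken from the "cuda -> mps -> cpu" wording above.
    order = ["cuda", "mps", "cpu"]
    kind = requested.split(":", 1)[0]  # "cuda:1" -> "cuda"
    if kind not in order:
        raise ValueError(f"unsupported device: {requested!r}")

    available = {"cuda": cuda_available, "mps": mps_available, "cpu": True}
    if available[kind]:
        # Keep the caller's string untouched, including an index such as "cuda:1".
        return requested

    # Walk down the preference order from the requested backend.
    for candidate in order[order.index(kind) + 1:]:
        if available[candidate]:
            return candidate
    return "cpu"  # not reached in practice, since "cpu" is always available
```

In a real wrapper the two flags would presumably come from `torch.cuda.is_available()` and `torch.backends.mps.is_available()`; they are passed in here so the logic can be exercised on any machine, for example `resolve_device("cuda", False, True)` for a Mac.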
What
Makes the core StreamDiffusion pipeline run on Apple Silicon (MPS) and CPU in addition to CUDA, without regressing CUDA. Previously the package failed to import/run on a Mac because `torch.cuda.*` calls were executed unconditionally and the device was effectively hardcoded.

Why
On macOS there is no CUDA, so `torch.cuda.Event`, `torch.cuda.synchronize()`, `empty_cache()`, `get_device_properties()`, etc. raise or are unavailable, crashing the pipeline before any inference can run. Mac users (e.g. TouchDesigner / StreamDiffusionTD on Apple Silicon) had no working path.

Changes
- `wrapper.py`
  - Resolve the requested `device` against availability (cuda → mps → cpu); a config that still says `"cuda"` transparently uses MPS on a Mac, while CUDA machines are unaffected. (Previous code forced `cpu` on any non-MPS box.)
  - Guard `cuda.empty_cache`/`synchronize`/`ipc_collect` and memory queries behind `is_available()`, with MPS branches (`torch.mps.*`).
  - Restore `self.use_denoising_batch` to honor the constructor parameter; it had been hardcoded to `False`, which breaks img2img with `iteration over a 0-d tensor`.
  - Skip `xformers` on Mac (unsupported).
- `pipeline.py`: fall back to `time.perf_counter()` for timing when CUDA events are unavailable (MPS/CPU) instead of constructing `torch.cuda.Event`.
- `preprocessing/base_orchestrator.py`: guard `cuda.synchronize()` in cleanup.
- `setup.py`: allow install on macOS (MPS/CPU torch); drop `cuda-python` and TensorRT extras on `darwin`; relax the CUDA-only hard requirement.

Compatibility / risk
- `device="cuda"` on a CUDA host still selects CUDA and all `cuda.*` calls run exactly as before.
- … none).

Notes for reviewers
Scoped to the core library only. TouchDesigner-integration changes (shared-memory transport on Mac, etc.) live outside this package and are not included here.
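To make the guarded synchronize and the timing fallback easier to review, the sketch below shows one way the two could fit together. It is an illustration under assumptions, not the code in this PR: `synchronize` and `StepTimer` are hypothetical names, and the optional `torch` import exists only so the snippet also runs where PyTorch is not installed.

```python
import time

try:
    import torch
except ImportError:  # assumption: lets the sketch run without PyTorch
    torch = None


def synchronize(device_type: str) -> None:
    """Hypothetical guard: wait for pending device work only where the backend exists."""
    if torch is None:
        return
    if device_type == "cuda" and torch.cuda.is_available():
        torch.cuda.synchronize()
    elif device_type == "mps" and hasattr(torch, "mps") and torch.backends.mps.is_available():
        torch.mps.synchronize()
    # CPU work is synchronous, so there is nothing to wait for.


class StepTimer:
    """Hypothetical timer: CUDA events when available, time.perf_counter() otherwise."""

    def __init__(self, device_type: str):
        self.device_type = device_type
        self.use_cuda_events = (
            torch is not None and device_type == "cuda" and torch.cuda.is_available()
        )
        self.elapsed_ms = 0.0

    def __enter__(self):
        if self.use_cuda_events:
            self._start = torch.cuda.Event(enable_timing=True)
            self._end = torch.cuda.Event(enable_timing=True)
            self._start.record()
        else:
            synchronize(self.device_type)  # drain queued work before starting the clock
            self._t0 = time.perf_counter()
        return self

    def __exit__(self, exc_type, exc, tb):
        if self.use_cuda_events:
            self._end.record()
            self._end.synchronize()
            self.elapsed_ms = self._start.elapsed_time(self._end)  # milliseconds
        else:
            synchronize(self.device_type)  # include asynchronous MPS work in the measurement
            self.elapsed_ms = (time.perf_counter() - self._t0) * 1000.0
        return False
```

Usage would look like `with StepTimer("mps") as t: run_step()`, with `run_step` standing in for whatever is being measured, after which `t.elapsed_ms` holds the wall-clock time. Synchronizing around the `perf_counter()` reads is a design choice of this sketch; whether the PR does the same is not stated above.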
🤖 Generated with Claude Code