Imagine your city block USDZ is broken into many individually registered entities — building_A, building_B, streetlamp_01, car_01, etc. Each has a StreamingComponent that stores:
- `assetFilename` / `assetExtension`: where the mesh lives on disk
- `streamingRadius`: how close the camera must be to load it
- `unloadRadius`: how far away the camera must be before it gets unloaded
- `priority`: which buildings load first when slots are contested
- `state`: `.unloaded`, `.loading`, `.loaded`, or `.unloading`
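As a rough illustration of the fields listed above, the component could be modeled like this. It is a minimal sketch, not the engine's actual declaration: the field types, the default state, and the `hasHysteresis` helper are assumptions made for the example.

```swift
// Hypothetical sketch; the real StreamingComponent may be declared differently.
enum StreamingState {
    case unloaded, loading, loaded, unloading
}

struct StreamingComponent {
    var assetFilename: String
    var assetExtension: String
    var streamingRadius: Float   // load when the camera is at or inside this distance
    var unloadRadius: Float      // unload when the camera is beyond this distance
    var priority: Int            // assumed: larger value loads first
    var state: StreamingState = .unloaded

    // Assumed helper: an unloadRadius larger than streamingRadius leaves a
    // gap, so an entity near the boundary does not flip between states.
    var hasHysteresis: Bool { unloadRadius > streamingRadius }
}

let buildingA = StreamingComponent(
    assetFilename: "building_A", assetExtension: "usdz",
    streamingRadius: 150, unloadRadius: 200, priority: 1)
```

The radii used for `buildingA` are illustrative values only.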
The engine calls the streaming system's update() every frame. Here's what happens:
The system normally does real work every 0.1 seconds (updateInterval). Between ticks, it's a no-op. This prevents wasting CPU every single frame.
When lastPendingLoadBacklog > 0 (candidates are queued but all slots are busy), the effective interval drops to burstTickInterval (default 16 ms). This prevents a 100 ms stall between slot pickups during active loading. The tick rate returns to 100 ms once the backlog drains.
OS pressure bypass: if a pendingPressureRelief flag is set (fired by the OS pressure callback on a background queue), the throttle check is bypassed entirely for that call. This guarantees eviction runs within one frame (about 11 ms at 90 fps) rather than waiting up to 100 ms for the next normal tick. Without this, a .critical signal arriving right after a tick would sit unprocessed for the full throttle interval, which can be longer than visionOS's kill window.
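The three throttle rules (steady-state interval, burst interval during a backlog, and the pressure bypass) can be sketched as one small gate. The `TickThrottle` type and its `shouldRun` method are hypothetical names for this example; only `updateInterval`, `burstTickInterval`, and the two inputs come from the text.

```swift
// Hypothetical throttle gate; a sketch of the rules above, not the engine's code.
struct TickThrottle {
    var updateInterval: Double = 0.1        // steady-state interval (100 ms)
    var burstTickInterval: Double = 0.016   // interval while a load backlog exists
    var timeSinceLastTick: Double = 0

    // Returns true when this frame should do real streaming work.
    mutating func shouldRun(deltaTime: Double,
                            pendingLoadBacklog: Int,
                            pendingPressureRelief: Bool) -> Bool {
        timeSinceLastTick += deltaTime
        // OS pressure bypasses the throttle for this call.
        if pendingPressureRelief {
            timeSinceLastTick = 0
            return true
        }
        // A non-empty backlog shortens the interval so free slots are picked up quickly.
        let interval = pendingLoadBacklog > 0 ? burstTickInterval : updateInterval
        guard timeSinceLastTick >= interval else { return false }
        timeSinceLastTick = 0
        return true
    }
}
```

With a frame time of roughly 11 ms, an idle system would run about one tick in nine frames, while a backlog would bring that to roughly every second frame.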
Instead of checking all 500 city entities, it asks the OctreeSystem:
"Give me every entity within 500m of the camera."
This is the key performance trick — only nearby entities are evaluated.
For each entity the octree returns, the system calculates the distance from camera to the entity's bounding box center, then:
| State | Condition | Action |
|---|---|---|
| `.unloaded` | distance ≤ `streamingRadius` | → add to load candidates |
| `.loaded` | distance > `unloadRadius` | → add to unload candidates |
| `.loaded` | still in range | → stamp `lastVisibleFrame` (keep alive) |
| `.loading` / `.unloading` | any | skip, already in progress |
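The table maps directly onto a switch over the state. The sketch below uses a hypothetical `classify` function and `StreamingAction` enum; treating an `.unloaded` entity outside `streamingRadius` as "skip" is an assumption, since the table does not list that case.

```swift
enum StreamingState { case unloaded, loading, loaded, unloading }

// Hypothetical result type for this example.
enum StreamingAction: Equatable {
    case load, unload, keepAlive, skip
}

func classify(state: StreamingState,
              distance: Float,
              streamingRadius: Float,
              unloadRadius: Float) -> StreamingAction {
    switch state {
    case .unloaded:
        // Assumed: an unloaded entity that is still too far away needs no action.
        return distance <= streamingRadius ? .load : .skip
    case .loaded:
        // In range means "stamp lastVisibleFrame" in the real system.
        return distance > unloadRadius ? .unload : .keepAlive
    case .loading, .unloading:
        return .skip   // a transition is already in progress
    }
}
```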
The octree query only covers nearby space. But what if building_Z was loaded and the player sprinted far away — it might not be in the octree result anymore. So the system also checks its loadedStreamingEntities tracking set for any loaded entity not in the octree result, and adds those to unload candidates if they're too far.
Unload candidates are sorted farthest-first (most wasteful memory first). Up to maxUnloadsPerUpdate = 12 are processed per tick to avoid frame spikes.
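The out-of-range sweep and the capped, farthest-first ordering could look roughly like this. `LoadedEntity`, `missedByOctree`, and `selectUnloads` are hypothetical names, and distances are assumed to be precomputed by the caller.

```swift
// Hypothetical record for an entity in the loaded tracking set.
struct LoadedEntity {
    let id: Int
    let distance: Float      // camera distance, computed by the caller
    let unloadRadius: Float
}

// Loaded entities the octree query did not return and that are now too far away.
func missedByOctree(tracked: [LoadedEntity], octreeHits: Set<Int>) -> [LoadedEntity] {
    tracked.filter { !octreeHits.contains($0.id) && $0.distance > $0.unloadRadius }
}

// Farthest first, capped per tick to avoid frame spikes.
func selectUnloads(_ candidates: [LoadedEntity],
                   maxUnloadsPerUpdate: Int = 12) -> [LoadedEntity] {
    Array(candidates.sorted { $0.distance > $1.distance }.prefix(maxUnloadsPerUpdate))
}
```

Candidates beyond the cap are not lost. They remain loaded and are picked up again on a later tick.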
`unloadMesh()` does:
- Sets state → `.unloading`
- Notifies `BatchingSystem` the entity is retiring
- Cancels any in-flight load task
- Calls `MeshResourceManager.shared.release(entityId:)`, which decrements the reference count on the cached mesh
- Clears `render.mesh = []`; the GPU buffers are not destroyed (the cache still owns them)
- Clears LOD level meshes if applicable
- Unregisters from `MemoryBudgetManager`
- Sets state → `.unloaded`
- Fires an `AssetResidencyChangedEvent(isResident: false)`
Load candidates are sorted by priority then distance (high priority + closest first). Only maxConcurrentLoads = 3 can be active simultaneously.
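A minimal sketch of that ordering and the slot limit follows. `LoadCandidate` and `selectLoads` are hypothetical names, and the example assumes a larger `priority` value means "load sooner".

```swift
// Hypothetical candidate record for this example.
struct LoadCandidate {
    let entityId: Int
    let priority: Int      // assumed: larger value loads first
    let distance: Float
}

func selectLoads(from candidates: [LoadCandidate],
                 activeLoadCount: Int,
                 maxConcurrentLoads: Int = 3) -> [LoadCandidate] {
    // Priority first, then nearest first among equal priorities.
    let ordered = candidates.sorted {
        if $0.priority != $1.priority { return $0.priority > $1.priority }
        return $0.distance < $1.distance
    }
    // Only as many candidates as there are free slots are dispatched.
    let freeSlots = max(0, maxConcurrentLoads - activeLoadCount)
    return Array(ordered.prefix(freeSlots))
}
```

Whatever does not fit stays queued and counts toward the backlog that shortens the tick interval.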
Before dispatching, the scheduler applies three guards in order:
- CPU-entry readiness: OOC entities whose `CPUMeshEntry` is not yet stored in `ProgressiveAssetLoader` are skipped. This prevents pre-streaming stubs from holding slots while registration is still running.
- Prewarm-active deferral: entities for roots whose background texture prewarm is still running are skipped. Dispatching while the prewarm holds the per-asset texture lock would block all concurrent slots for the remaining prewarm duration. Slots stay free until `isPrewarmActive` returns `false`.
- Per-candidate geometry budget check: if the candidate's estimated GPU footprint would exceed the geometry budget, `evictLRU` is called first.
When all near-band candidates share one assetRootEntityId, the near-band concurrency limit expands from nearBandMaxConcurrentLoads to maxConcurrentLoads. All sub-meshes of one USDZ are treated as a single burst rather than being serialized one-at-a-time.
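The single-root check reduces to "do all candidates share one root id?". The sketch below uses a hypothetical `NearBandCandidate` type and `nearBandLimit` function; the text does not state a default for `nearBandMaxConcurrentLoads`, so both limits are passed in.

```swift
// Hypothetical candidate record for this example.
struct NearBandCandidate {
    let entityId: Int
    let assetRootEntityId: Int
}

func nearBandLimit(for candidates: [NearBandCandidate],
                   nearBandMaxConcurrentLoads: Int,
                   maxConcurrentLoads: Int) -> Int {
    // No candidates: keep the normal near-band limit.
    guard let firstRoot = candidates.first?.assetRootEntityId else {
        return nearBandMaxConcurrentLoads
    }
    // All sub-meshes of one asset: allow the full global cap as a burst.
    let singleRoot = candidates.allSatisfy { $0.assetRootEntityId == firstRoot }
    return singleRoot ? maxConcurrentLoads : nearBandMaxConcurrentLoads
}
```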
`loadMesh()` does:
- Reserves a slot in `activeLoads` (thread-safe via `NSLock`)
- Sets state → `.loading`
- Notifies `BatchingSystem` streaming started
- Spawns a Swift `Task` (runs off the main thread)
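The slot reservation step can be illustrated with a small lock-guarded set. This is a sketch under the assumption that `activeLoads` tracks entity ids; the `ActiveLoads` class and its method names are hypothetical.

```swift
import Foundation

// Hypothetical lock-guarded slot table; not the engine's actual type.
final class ActiveLoads {
    private let lock = NSLock()
    private var ids = Set<Int>()
    let maxConcurrentLoads: Int

    init(maxConcurrentLoads: Int = 3) {
        self.maxConcurrentLoads = maxConcurrentLoads
    }

    // Returns false if no slot is free or the entity already holds one.
    func reserve(_ entityId: Int) -> Bool {
        lock.lock()
        defer { lock.unlock() }
        guard ids.count < maxConcurrentLoads, !ids.contains(entityId) else { return false }
        ids.insert(entityId)
        return true
    }

    // Called when the load finishes or is cancelled.
    func release(_ entityId: Int) {
        lock.lock()
        defer { lock.unlock() }
        ids.remove(entityId)
    }
}
```

Checking the count and inserting under the same lock is what keeps two concurrent callers from both taking the last slot.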
Inside the async task:
- If the entity has a `LODComponent` → calls `reloadLODEntity()`, which loads all LOD levels
- Otherwise → calls `loadMeshAsync()`, which goes to `MeshResourceManager` (cache-first, file fallback)
- After loading, back on the main thread via `withWorldMutationGate`:
  - Assigns `render.mesh` with fresh copies of uniform buffers (critical: prevents entities sharing GPU state from overwriting each other)
  - Sets state → `.loaded`
  - Fires `AssetResidencyChangedEvent(isResident: true)`
  - Records load in `MemoryBudgetManager`
For LOD entities (e.g., a skyscraper with 3 detail levels), `reloadLODEntity()`:
- Loads all LOD levels concurrently from cache/disk
- Calculates current camera-to-entity distance
- Picks the appropriate LOD level (highest detail that fits distance)
- Sets `renderComponent.mesh` to that LOD's mesh data
- Marks `lodComponent.currentLOD`
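The "highest detail that fits distance" rule might be sketched as follows. The `LODLevel` type, its `maxDistance` field, the convention that index 0 is the most detailed level, and the fallback to the coarsest level are all assumptions for this example.

```swift
// Hypothetical LOD description for this example.
struct LODLevel {
    let index: Int          // assumed: 0 is the highest detail
    let maxDistance: Float  // level is usable up to this camera distance
}

// Levels are assumed to be sorted from highest to lowest detail.
func pickLOD(levels: [LODLevel], distance: Float) -> Int? {
    if let match = levels.first(where: { distance <= $0.maxDistance }) {
        return match.index
    }
    // Beyond every threshold: fall back to the coarsest level, if any.
    return levels.last?.index
}
```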
The engine uses two independent memory pressure signals and responds to them in priority order:
| Pressure signal | Method | Meaning |
|---|---|---|
| Combined | `shouldEvict()` | Geometry pool ≥ 85% of `geometryBudget` OR texture pool ≥ 85% of `textureBudget` |
| Geometry only | `shouldEvictGeometry()` | Mesh allocations alone ≥ 85% of `geometryBudget` |
Why two signals? TextureStreamingSystem upgrades visible textures to higher resolutions after meshes load. Those upgrades increase totalTextureMemory in MemoryBudgetManager. If the load gate used the combined signal, texture upgrades on already-loaded meshes would silently prevent new mesh loads — even when the geometry-only footprint is well within budget. The split pools (geometryBudget + textureBudget) ensure each domain has an independent ceiling so neither can starve the other.
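To make the difference between the two signals concrete, here is a minimal sketch of both checks over split pools. The `MemoryPools` struct and its stored fields are hypothetical; only the method names, the budgets, and the 85% threshold come from the text.

```swift
// Hypothetical pool bookkeeping; a sketch of the two signals, not MemoryBudgetManager itself.
struct MemoryPools {
    var geometryUsed: Int
    var textureUsed: Int
    var geometryBudget: Int
    var textureBudget: Int
    var highWaterMark: Double = 0.85

    // Geometry-only signal: used as the load gate and the eviction loop condition.
    func shouldEvictGeometry() -> Bool {
        Double(geometryUsed) >= highWaterMark * Double(geometryBudget)
    }

    // Combined signal: either pool at or above its own high-water mark.
    func shouldEvict() -> Bool {
        shouldEvictGeometry()
            || Double(textureUsed) >= highWaterMark * Double(textureBudget)
    }
}
```

A texture-heavy state (for example 40% geometry, 90% texture) trips only the combined signal, so mesh loads keep flowing while textures are shed.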
Before considering geometry eviction, the system sheds texture quality on the farthest loaded entities:
if combined pressure is high AND geometry pressure is NOT high:
TextureStreamingSystem.shedTextureMemory(maxEntities: 4)
→ no geometry eviction; texture relief only
shedTextureMemory forces the farthest entities in the upgradedEntities set to minimumTextureDimension immediately, bypassing the normal distance-band schedule. A distant wall dropping from 1024 px to 256 px is far less noticeable than a missing mesh.
Geometry eviction is only triggered when geometry memory itself hits the high-water mark:
if geometry pressure is high:
TextureStreamingSystem.shedTextureMemory(maxEntities: 8) ← try texture relief first
evictLRU(cameraPosition:) ← then fall back to geometry eviction
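The two tiers above can be folded into one decision function. The `PressureResponse` enum and `respond` function are hypothetical; the `maxEntities` values of 4 and 8 are the ones shown in the pseudocode.

```swift
// Hypothetical description of what the tick should do about memory pressure.
enum PressureResponse: Equatable {
    case noAction
    case textureShed(maxEntities: Int)
    case textureShedThenEvictGeometry(maxEntities: Int)
}

func respond(combinedPressure: Bool, geometryPressure: Bool) -> PressureResponse {
    // Geometry pressure: shed textures first, then fall back to geometry eviction.
    if geometryPressure { return .textureShedThenEvictGeometry(maxEntities: 8) }
    // Combined pressure without geometry pressure: texture relief only.
    if combinedPressure { return .textureShed(maxEntities: 4) }
    return .noAction
}
```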
`evictLRU`:
- First evicts unused cached meshes (`MeshResourceManager.evictUnused()`)
- Collects all loaded streaming entities
- Sorts by value score (far + large = first to go; see value-score eviction in the out-of-core walkthrough)
- Unloads them one by one until geometry-only pressure clears (the loop exits once `shouldEvictGeometry()` returns false; the combined signal is not consulted)
- Skips entities that are both visible and within `visibleEvictionProtectionRadius` (30 m default)
- Accepts an optional `maxEvictions` cap (default `Int.max`). The OS pressure path passes `16` per call, which bounds single-frame work during a burst. Any remaining candidates spill to subsequent ticks.
The sizeFactor in the eviction score is normalized against geometryBudget (not the combined budget), so a mesh consuming 80% of the geometry pool scores correctly rather than appearing to consume only ~48% of a combined total.
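The eviction loop can be sketched as below. The exact value-score formula is not given here, so the equal-weight sum of a distance factor and a size factor is purely an assumption, as are the `Resident` type, the `maxDistance` normalizer, and the hard-coded 85% threshold.

```swift
// Hypothetical record for a loaded streaming entity.
struct Resident {
    let id: Int
    let distance: Float
    let bytes: Int
    let isVisible: Bool
}

// Highest score (far + large) first; protected entities are filtered out.
func evictionOrder(_ residents: [Resident],
                   geometryBudget: Int,
                   maxDistance: Float,
                   protectionRadius: Float = 30) -> [Resident] {
    func score(_ r: Resident) -> Float {
        let distanceFactor = min(r.distance / maxDistance, 1)
        // Normalized against the geometry budget, not a combined budget.
        let sizeFactor = Float(r.bytes) / Float(geometryBudget)
        return distanceFactor + sizeFactor   // assumed equal weighting
    }
    return residents
        .filter { !($0.isVisible && $0.distance <= protectionRadius) }
        .sorted { score($0) > score($1) }
}

// Evicts until geometry pressure clears or the per-call cap is reached.
func evictLRU(_ residents: [Resident],
              used: inout Int,
              geometryBudget: Int,
              maxDistance: Float,
              maxEvictions: Int = Int.max) -> [Int] {
    var evicted: [Int] = []
    for r in evictionOrder(residents, geometryBudget: geometryBudget, maxDistance: maxDistance) {
        guard evicted.count < maxEvictions,
              Double(used) >= 0.85 * Double(geometryBudget) else { break }
        used -= r.bytes
        evicted.append(r.id)
    }
    return evicted
}
```

Because the loop re-checks pressure before each eviction, it stops as soon as the geometry pool drops below the high-water mark rather than draining every candidate.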
In addition to the per-tick budget checks above, MemoryBudgetManager subscribes to OS memory pressure events via DispatchSource.makeMemoryPressureSource:
| OS signal | Response | `maxEntities` |
|---|---|---|
| `.warning` | Texture shed | 8 |
| `.critical` | Texture shed + double geometry eviction pass (capped at 16 per pass) + CPU heap release | 20 |
The OS callback fires on a background queue and sets a pendingPressureRelief flag on GeometryStreamingSystem. The flag is drained at the start of the next update() tick on the main thread, so all eviction work stays on the same thread as the rest of the streaming system. Responding to .warning promptly is meant to keep the OS from silently escalating to .critical and terminating the process; on visionOS in particular, the window between .warning and process kill can be under a second.
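The hand-off between the background callback and the main-thread tick can be sketched as a lock-guarded pending value. The `PendingPressureRelief` class, the `PressureLevel` enum, and the rule that a pending critical signal is never downgraded by a later warning are assumptions for this example; the real flag may be a simple Boolean.

```swift
import Foundation

// Hypothetical severity type for this example.
enum PressureLevel { case warning, critical }

final class PendingPressureRelief {
    private let lock = NSLock()
    private var pending: PressureLevel?

    // Called from the OS pressure callback on a background queue.
    func signal(_ level: PressureLevel) {
        lock.lock()
        defer { lock.unlock() }
        // Assumed rule: a later .warning never overwrites a pending .critical.
        if pending == nil || level == .critical { pending = level }
    }

    // Called at the start of update() on the main thread; clears the flag.
    func drain() -> PressureLevel? {
        lock.lock()
        defer { lock.unlock() }
        let level = pending
        pending = nil
        return level
    }
}
```

The callback does nothing except record the signal, so all texture shedding and eviction still runs on the thread that owns the streaming state.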
CPU heap release on critical pressure — evictLRU only frees GPU Metal buffers tracked by MemoryBudgetManager. The OS measures total process memory, which includes ProgressiveAssetLoader.rootAssetRefs (the live MDLAsset tree and all child CPUMeshEntry vertex/index buffers). For a 500-building scene this CPU heap can reach hundreds of megabytes. On .critical, after the two geometry eviction passes, GeometryStreamingSystem calls ProgressiveAssetLoader.shared.releaseWarmAsset(rootEntityId:) on every warm root. This frees the CPU heap immediately. The rehydration context (asset URL + loading policy) is retained, so a cold re-stream from disk is transparent when the camera re-approaches.
Player spawns at corner of city block
│
├─ Frame 1 tick: Octree finds 8 nearby buildings
│ ├─ 5 are unloaded + within streamingRadius → load candidates
│ └─ 3 are loading already → skip
│
├─ Up to 3 async loads fire simultaneously
│ ├─ building_A: cache miss → read from USDZ file
│ ├─ building_B: cache hit → instant
│ └─ building_C: cache miss → read from USDZ file
│
├─ Player walks forward → building_K enters range
│ └─ Queued in load candidates (backlog until a slot frees)
│
├─ Player runs past old buildings → building_A now > unloadRadius
│ └─ render.mesh cleared, reference released, memory freed
│
└─ Memory pressure → LRU eviction kicks in
└─ building_E (not visible, oldest lastVisibleFrame) → evicted
The key design decisions here are:
- Octree spatial query prevents O(n) entity iteration every tick
- Concurrency cap (3) prevents GPU/IO saturation during fast movement
- Adaptive tick rate: 16 ms during backlog, 100 ms steady-state; prevents stalls between slot pickups without wasting CPU when idle
- Single-root burst detection: when all near-band candidates are sub-meshes of one asset, concurrency expands to the global cap so the asset loads in parallel rather than one mesh at a time
- Background texture prewarm: `loadTextures()` runs at registration time so the first-upload path is a no-op and lock wait ≈ 0
- Prewarm-active deferral: dispatch is held until the prewarm releases the texture lock, keeping all slots free for the burst
- Narrowed texture lock scope: the per-asset lock covers only `ensureTexturesLoaded`; `makeMeshesFromCPUBuffers` runs outside the lock so all slots upload in parallel
- CPU-entry readiness guard: stubs registered before their CPU data is ready are skipped rather than wasting a slot
- Unload-before-load ordering ensures you free memory before consuming more
- Cache ownership means unloading just clears references; actual GPU memory is reused if the same mesh comes back into range
- Geometry-only load gate prevents texture upgrades from blocking mesh loads; each domain is budgeted independently
- Texture relief before geometry eviction means a drop in distant texture resolution is always preferred over a missing mesh
- Split geometry/texture pools (`geometryBudget` + `textureBudget`) give each domain an independent ceiling and high-water mark; a texture-heavy scene cannot crowd out geometry loads and vice versa
- Runtime device budget probing: `geometryBudget` and `textureBudget` are derived at init from `MTLDevice.recommendedMaxWorkingSetSize` (macOS) or `os_proc_available_memory()` (visionOS/iOS) rather than hardcoded platform defaults; budgets adapt to actual device headroom
- SceneRootTransform consistency: all distance calculations (GeometryStreamingSystem, LODSystem, inline LOD upload helpers) pass camera position through `SceneRootTransform.shared.effectiveCameraPosition()` so XR physical-head movement and scene-root translations are applied uniformly; raw `cameraComponent.localPosition` is never used directly for distance math
- Camera sync always runs: `syncStreamingCameraPosition()` executes every frame regardless of the `loading` flag; decoupling it from the loading guard prevents the streaming camera from freezing while an asset load is in flight
- OS memory pressure subscription: `DispatchSource.makeMemoryPressureSource` fires proactive texture shedding and geometry eviction before the OS escalates to process termination; the response runs on the next `update()` tick to stay single-threaded
- evictLRU per-call cap: the `maxEvictions` parameter (default `Int.max`) bounds single-frame eviction work; the OS pressure path uses 16 per pass so a `.critical` burst doesn't spike one frame; remaining candidates spill to subsequent ticks
- CPU heap release on critical pressure: on `.critical`, after geometry eviction, `ProgressiveAssetLoader.releaseWarmAsset()` is called for every warm root, freeing the MDLAsset CPU heap the OS measures; rehydration context survives so cold re-stream from disk is transparent