Skip to content

feat: add typed service locator to SimulationContext#5672

Merged
pv-nvidia merged 2 commits into
isaac-sim:developfrom
pv-nvidia:pv/service-locator
May 20, 2026
Merged

feat: add typed service locator to SimulationContext#5672
pv-nvidia merged 2 commits into
isaac-sim:developfrom
pv-nvidia:pv/service-locator

Conversation

@pv-nvidia
Copy link
Copy Markdown
Contributor

@pv-nvidia pv-nvidia commented May 18, 2026

Motivation

SimulationContext is the natural lifecycle owner for backend-specific caches (e.g. UsdRT stage handles, Fabric hierarchy data). Currently these either live as class-level globals (no lifecycle, leak across stages) or get baked directly into SimulationContext (pollutes it with backend imports).

Solution

Add a lightweight typed ServiceLocator exposed via SimulationContext.services. Backends register their own singletons using subscript syntax:

sim.services[FabricStageCache] = FabricStageCache(stage)
cache = sim.services[FabricStageCache]
del sim.services[FabricStageCache]  # closes and removes

All registered services are closed when clear_instance() is called. Exceptions during close are collected and raised after full teardown completes.

Design

  • Keyed by service class (typed retrieval)
  • Services with a close() method are automatically closed on deletion or teardown
  • close_all(caught_exceptions) always collects — no silent failures
  • Purely additive — no existing behavior changes

Downstream

This is used by the Fabric stage cache PR (#5676) to manage IFabricHierarchy handles per stage.

Copy link
Copy Markdown

@isaaclab-review-bot isaaclab-review-bot Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🤖 Isaac Lab Review Bot

Reviewed: 3ede0f91 (updated from fa15c6b8)

What Changed Since Last Review

Major refactor: ServiceLocator extracted to standalone module with improved API.

File structure:

  • service_locator.py (NEW): Standalone ServiceLocator class with subscript API
  • simulation_context.py: Now exposes services property instead of get_service()/set_service() methods
  • test_service_locator.py (NEW): Comprehensive unit tests (160 lines)
  • changelog.d/service-locator.rst (NEW): Changelog fragment

API changes:

# Old (method-based)
sim.get_service(FabricStageCache)
sim.set_service(FabricStageCache, cache)

# New (subscript-based via property)
sim.services[FabricStageCache]
sim.services[FabricStageCache] = cache
del sim.services[FabricStageCache]  # closes and removes

Error handling improved: clear_instance() now collects service close errors and raises aggregated RuntimeError at the end (after full cleanup completes).


Review Summary

Service Locator (service_locator.py):

  • Clean typed API with __getitem__, __setitem__, __delitem__, __contains__, pop()
  • _try_close() helper safely handles missing/non-callable close attributes
  • close_all() always collects exceptions (mandatory list arg)

Tests (test_service_locator.py):

  • Excellent coverage: registration, retrieval, deletion, pop, close_all
  • Edge cases: non-callable close property, missing keys, exception collection
  • Base class key test demonstrates interface-based registration

Integration (simulation_context.py):

  • services property exposes the locator cleanly
  • clear_instance() properly handles service errors with deferred raise

Changelog:

  • Good description of the feature and usage

Assessment

LGTM ✅ — Well-structured refactor. Standalone module is cleaner, subscript API is Pythonic, tests are comprehensive. Ready to merge.


Automated review by Isaac Lab Review Bot • Guidelines

@pv-nvidia pv-nvidia force-pushed the pv/service-locator branch 8 times, most recently from cd15f3a to 7606eb5 Compare May 18, 2026 14:16
@pv-nvidia pv-nvidia marked this pull request as ready for review May 18, 2026 17:04
@greptile-apps
Copy link
Copy Markdown
Contributor

greptile-apps Bot commented May 18, 2026

Greptile Summary

This PR adds a lightweight typed service locator (ServiceLocator) to SimulationContext, giving backends a clean way to register and retrieve lifecycle-managed singletons (e.g., FabricStageCache) without polluting the core context with backend-specific imports. The close_all path in clear_instance() is exception-safe and correctly raises after gc.collect() and the diagnostic log.

  • ServiceLocator provides subscript-based typed access (services[Cls] = instance, services[Cls], del services[Cls]) and a close_all(caught_exceptions) method that iterates a snapshot so all services are visited even when one raises.
  • SimulationContext.clear_instance() now drains the service registry before closing the stage, collecting errors and re-raising after teardown completes.

Confidence Score: 4/5

Safe to merge after addressing the replacement-without-closing gap; all other teardown paths are correct.

The setitem method silently drops the previous service without calling close() when a key is overwritten. Any backend that replaces a registered service mid-lifecycle will permanently leak the old instance - it is removed from _services before close_all() runs at teardown, so there is no recovery path. The PR description explicitly promises auto-close on replacement, making this an easy trap for consumers of the new API.

source/isaaclab/isaaclab/sim/service_locator.py - the setitem replacement behavior needs attention before this API is adopted by backends.

Important Files Changed

Filename Overview
source/isaaclab/isaaclab/sim/service_locator.py New typed service registry — close_all is now exception-safe and clears before iterating, but setitem silently drops the old service without calling close() on replacement, which leaks resources for any backend that replaces a service mid-lifecycle.
source/isaaclab/isaaclab/sim/simulation_context.py Adds _services: ServiceLocator field and services property; close_all is called before close_stage with errors collected, and the RuntimeError raise now correctly follows gc.collect() and the log message.
source/isaaclab/test/sim/test_service_locator.py Good coverage of get/set/del/pop/close_all paths, including exception-collection and non-callable close attributes; has a duplicate assertion on line 137 and is missing a test for the replacement-without-closing behavior.
source/isaaclab/changelog.d/service-locator.rst Changelog entry correctly documents the new ServiceLocator class and services property.

Sequence Diagram

sequenceDiagram
    participant Backend
    participant SimCtx as SimulationContext
    participant SL as ServiceLocator

    Backend->>SL: "services[FabricStageCache] = cache"
    Note over SL: _services[FabricStageCache] = cache

    Backend->>SL: services[FabricStageCache]
    SL-->>Backend: cache (or None)

    Note over SimCtx: clear_instance() called
    SimCtx->>SimCtx: physics_manager.close()
    SimCtx->>SimCtx: visualizers closed
    SimCtx->>SL: "close_all(caught_exceptions=[])"
    SL->>SL: "snapshot = list(_services.values())"
    SL->>SL: _services.clear()
    loop each service
        SL->>Backend: service.close()
    end
    SimCtx->>SimCtx: stage_utils.close_stage()
    SimCtx->>SimCtx: "_instance = None"
    SimCtx->>SimCtx: gc.collect()
    SimCtx->>SimCtx: logger.info(SimulationContext cleared)
    alt service_errors present
        SimCtx-->>Backend: raise RuntimeError
    end
Loading

Reviews (3): Last reviewed commit: "test: add service locator unit tests and..." | Re-trigger Greptile

Comment thread source/isaaclab/isaaclab/sim/service_locator.py Outdated
Comment thread source/isaaclab/isaaclab/sim/service_locator.py Outdated
@pv-nvidia pv-nvidia force-pushed the pv/service-locator branch 5 times, most recently from 6c1a84e to 94bef08 Compare May 18, 2026 17:45
@github-actions github-actions Bot added the isaac-mimic Related to Isaac Mimic team label May 18, 2026
@pv-nvidia pv-nvidia force-pushed the pv/service-locator branch from a5603f7 to 94bef08 Compare May 18, 2026 17:54
@pv-nvidia pv-nvidia self-assigned this May 18, 2026
@pv-nvidia pv-nvidia force-pushed the pv/service-locator branch 3 times, most recently from 92025fc to 7cadb41 Compare May 18, 2026 18:07
Comment thread source/isaaclab/isaaclab/sim/service_locator.py Outdated
Comment thread source/isaaclab/isaaclab/sim/simulation_context.py Outdated
@pv-nvidia pv-nvidia force-pushed the pv/service-locator branch 2 times, most recently from aa7507d to cebed66 Compare May 18, 2026 18:17
Comment thread source/isaaclab/isaaclab/sim/service_locator.py
@pv-nvidia pv-nvidia force-pushed the pv/service-locator branch from cebed66 to 3ede0f9 Compare May 18, 2026 22:59
Copy link
Copy Markdown

@isaaclab-review-bot isaaclab-review-bot Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Review Summary

Well-designed addition of a typed service locator to SimulationContext. The implementation follows a clean pattern with proper lifecycle management and exception-safe teardown. The test coverage is comprehensive.

Strengths

  • Exception-safe teardown: close_all clears the registry first, then iterates over a copy, ensuring all services are visited regardless of failures
  • Clean API: Subscript syntax (services[cls] = instance) is intuitive and type-safe
  • Flexible keying: Allowing registration by interface or abstract base class enables dependency injection patterns
  • Comprehensive tests: 13 test cases covering all API methods and edge cases (non-callable close attributes, exceptions during close, etc.)

Minor Suggestions

1. Consider logging service close failures before raising

When clear_instance() encounters service close errors, it raises after teardown completes. Adding a warning log before the raise would help with debugging without requiring exception handling:

if service_errors:
    for err in service_errors:
        logger.warning(f"Service close failed: {err}")
    msg = f"SimulationContext.clear_instance(): {len(service_errors)} service(s) failed to close"
    raise RuntimeError(msg) from service_errors[0]

2. Type hint for __contains__ return

The __contains__ method could benefit from a return type hint for completeness:

def __contains__(self, cls: type) -> bool:

(Already present — disregard if this was added in a later commit)

Overall

This is a clean, well-tested implementation that solves a real problem (backend-specific caches leaking across stage lifecycles). The design choice to make replacement not auto-close the old service is reasonable — explicit lifecycle control is safer when callers may hold references.

Verdict: Looks good to merge ✅


Update (2523e18): All previous review concerns have been addressed:

  • Exception-safe teardown: close_all now clears the registry first, then iterates a copy with try/except around each service close — all services are visited regardless of failures
  • Docstring clarity: __setitem__ docstring now correctly states pop removes without closing
  • Fail-fast mode removed: No longer has a fail-fast path that could orphan services; always collects exceptions
  • GC and logging preserved: gc.collect() and "SimulationContext cleared" log now run before service error check, ensuring cleanup is logged even on errors
  • Explicit lifecycle by design: Replacement not auto-closing is intentional per maintainer discussion

No new issues introduced in the incremental diff. This looks good to merge. ✅

Copy link
Copy Markdown
Collaborator

@ooctipus ooctipus left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

pv-nvidia added 2 commits May 20, 2026 11:50
Add get_service(cls) / set_service(cls, instance) — a lightweight typed
singleton registry on SimulationContext, keyed by service class.

This lets backend-specific caches (e.g. Fabric hierarchy handles) register
themselves without polluting SimulationContext with backend-specific fields
or imports.  Services with a close() method are automatically closed:
- On replacement via set_service()
- On teardown via clear_instance()

No existing behavior changes — this is purely additive.
- 7 unit tests covering get_service, set_service, replacement lifecycle,
  close-on-clear_instance, multiple service types, and idempotent re-registration
- Changelog entry for the new service locator API
@pv-nvidia pv-nvidia force-pushed the pv/service-locator branch from 1b33f5e to 2523e18 Compare May 20, 2026 11:53
@pv-nvidia pv-nvidia merged commit 8b63997 into isaac-sim:develop May 20, 2026
62 of 65 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

enhancement New feature or request infrastructure isaac-lab Related to Isaac Lab team isaac-mimic Related to Isaac Mimic team

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants