perf(core): Lazily allocate AutoClosableReentrantLock (JAVA-588)#5643
Merged
Conversation
📲 Install BuildsAndroid
|
Contributor
Performance metrics 🚀
|
| Revision | Plain | With Sentry | Diff |
|---|---|---|---|
| abfcc92 | 304.04 ms | 370.33 ms | 66.29 ms |
| 4e3e79d | 365.83 ms | 477.62 ms | 111.79 ms |
| d8912da | 329.94 ms | 389.68 ms | 59.74 ms |
| d15471f | 294.13 ms | 399.49 ms | 105.36 ms |
| abf451a | 332.82 ms | 403.67 ms | 70.85 ms |
| 604a261 | 380.65 ms | 451.27 ms | 70.62 ms |
| 806307f | 357.85 ms | 424.64 ms | 66.79 ms |
| 22f4345 | 307.87 ms | 354.51 ms | 46.64 ms |
| d217708 | 375.27 ms | 415.68 ms | 40.41 ms |
| c3ee041 | 310.64 ms | 361.90 ms | 51.26 ms |
App size
| Revision | Plain | With Sentry | Diff |
|---|---|---|---|
| abfcc92 | 1.58 MiB | 2.13 MiB | 557.31 KiB |
| 4e3e79d | 0 B | 0 B | 0 B |
| d8912da | 0 B | 0 B | 0 B |
| d15471f | 1.58 MiB | 2.13 MiB | 559.54 KiB |
| abf451a | 1.58 MiB | 2.20 MiB | 635.29 KiB |
| 604a261 | 1.58 MiB | 2.10 MiB | 533.42 KiB |
| 806307f | 1.58 MiB | 2.10 MiB | 533.42 KiB |
| 22f4345 | 1.58 MiB | 2.29 MiB | 719.83 KiB |
| d217708 | 1.58 MiB | 2.10 MiB | 532.97 KiB |
| c3ee041 | 0 B | 0 B | 0 B |
Previous results on branch: no/java-588-lazy-lock-allocation
Startup times
| Revision | Plain | With Sentry | Diff |
|---|---|---|---|
| e8285db | 356.59 ms | 441.60 ms | 85.01 ms |
| 76a23f0 | 324.81 ms | 408.44 ms | 83.63 ms |
| 9c7fc8a | 330.67 ms | 380.76 ms | 50.09 ms |
App size
| Revision | Plain | With Sentry | Diff |
|---|---|---|---|
| e8285db | 0 B | 0 B | 0 B |
| 76a23f0 | 0 B | 0 B | 0 B |
| 9c7fc8a | 0 B | 0 B | 0 B |
6ee6062 to
04091dd
Compare
04091dd to
10f2b12
Compare
AutoClosableReentrantLock extended ReentrantLock, so every SDK object holding one allocated a ReentrantLock (and its AbstractQueuedSynchronizer) eagerly in its field initializer. A customer Perfetto trace showed ~81 such allocations on the main thread during SentryAndroid.init, many for locks that are never acquired during init. Hold the ReentrantLock internally and create it lazily on first acquire(), using an AtomicReferenceFieldUpdater CAS so creation stays atomic and Loom-friendly (no synchronized, preserving #3715). Every call site uses acquire() only, so dropping the ReentrantLock superclass touches no caller. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…internal (JAVA-588) Replace the unreachable candidate fallback after a failed CAS with an explicit non-null check, so a broken invariant fails loudly instead of handing two threads different locks. Mark the class @ApiStatus.Internal and make the lazy-allocation test assert the lock field directly. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
10f2b12 to
f7075cb
Compare
adinauer
approved these changes
Jul 2, 2026
adinauer
left a comment
Member
There was a problem hiding this comment.
While the changes LGTM, we should consider changing this in the next major since it's a breaking change for anyone using the AutoClosableReentrantLock. However the majority of users will likely prefer to just have this improvement and even if someone's using it they'll likely be using the same APIs that still exist that we use so approving.
Every acquire() allocated a fresh lifecycle token, which is per-use garbage on every lock acquisition forever, not just at init. The token was stateless apart from its lock reference, so AutoClosableReentrantLock now implements ISentryLifecycleToken itself and acquire() returns this, making the steady-state acquire/close path allocation-free. Semantics are unchanged: try-with-resources closes once per acquire, so reentrant acquires stay balanced, and unlocking without holding the lock still throws IllegalMonitorStateException. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
📜 Description
io.sentry.util.AutoClosableReentrantLockextendedjava.util.concurrent.locks.ReentrantLock, and ~57 SDK classes create one eagerly in a field initializer. Constructing the SDK object graph therefore allocated aReentrantLock(plus itsAbstractQueuedSynchronizerSync) per object — even for objects whose lock is never acquired.This change holds the
ReentrantLockinternally and creates it lazily on the firstacquire(), using anAtomicReferenceFieldUpdaterCAS so the lazy creation is atomic and stays Loom-friendly (nosynchronized, preserving the intent of #3715). If the CAS loses the first-acquire race, the caller uses the winner's lock; that invariant is enforced with an explicit non-null check so a broken invariant fails loudly instead of silently handing two threads different locks.On top of the lazy lock,
AutoClosableReentrantLocknow implementsISentryLifecycleTokenitself andacquire()returnsthis, eliminating the per-acquire lifecycle-token allocation entirely. That allocation happened on every lock use forever, not just at init, so the steady-state acquire/close path is now allocation-free. Semantics are unchanged: try-with-resources closes once per acquire (reentrant acquires stay balanced) and unlocking without holding the lock still throwsIllegalMonitorStateException.💡 Motivation and Context
Part of the Reduce SDK init time [Android] effort (JAVA-588). A customer-provided Perfetto trace showed ~81
ReentrantLockallocations on the main thread underSentryAndroid.init, contributing GC pressure and main-thread CPU. With lazy allocation, only locks actually acquired during init allocate; the many never-contended locks no longer allocate at construction.Compatibility note: this is technically a binary-compatibility change —
AutoClosableReentrantLockno longer extendsReentrantLock, so the inheritedReentrantLockpublic methods leave its API surface (reflected insentry.api). A usage audit confirmed every call site uses onlyacquire()(try-with-resources); nothing callslock()/unlock()/tryLock()/isHeldByCurrentThread()directly or types a field/param asReentrantLock, so no caller is affected. The class is now also marked@ApiStatus.Internalto make explicit that it carries no compatibility guarantees for external consumers.An on-device A/B benchmark on a Pixel 3 (run locally, not included in this PR) quantified the allocation reduction and confirmed
acquire()hot-path throughput is unchanged — see results below.📈 Benchmark results (Pixel 3, Android 12)
Measured on a release build using ART instrumented method tracing, with slices counted in Perfetto
trace_processor(btrace app-tracing is broken on this device). The A/B was done by swapping onlyAutoClosableReentrantLockbetween the old (extends ReentrantLock) and new (lazy) implementations and rebuilding; everything else is identical.Real
SentryAndroid.init(cold) — lock allocationsReentrantLock(+AbstractQueuedSynchronizer) allocatedacquire()calls70 fewer
ReentrantLockallocations per init (140 fewer heap objects: 70 locks + 70Sync), a ~65% reduction. Counts were identical across 3 runs of each build. Both builds make the same 40acquire()calls, so the init code path is unchanged — only never-contended locks stop allocating (37 distinct locks are acquired during init vs 107 created). (This benchmark app allocates 107; the customer trace above showed ~81.)Construction cost (per lock): old ~73–78 ns/alloc, new ~60–66 ns/alloc → ~11.5 ns saved per allocation. Across the 70 avoided allocations that is ≈0.8 µs of direct allocation work per init — below the cold-start noise floor. The benefit is therefore reduced allocation/GC pressure, not a measurable init-latency win.
acquire()hot path: ~161 ns/op foracquire()+close()when the lock already exists — unchanged; the lazy path adds only a volatile read + null check after the first acquire.💚 How did you test it?
Unit tests in
AutoClosableReentrantLockTest(lazy creation asserted directly against the underlying lock field via a@TestOnlyaccessor, token identity —acquire()returns the lock itself, reentrancy, and an 8-thread × 1000-iteration mutual-exclusion stress test); full:sentry:testsuite green.📝 Checklist
sendDefaultPIIis enabled.🔮 Next steps
Biggest remaining main-thread init cost in the same trace is Manifest parsing (~147ms) — tracked separately (SDK-1322 / JAVA-531, compile-time injection).
Fixes JAVA-588