fix: don't close shared session store in runtime.Close#2879
Conversation
docker-agent
left a comment
There was a problem hiding this comment.
Assessment: 🟢 APPROVE
The fix correctly addresses the shared session store lifecycle issue. The pattern for lazy initialization in is a sound approach, is properly deferred in , and the contract update is clearly documented. No resource leaks, double-close, or use-after-close issues were identified in the changed code.
docker-agent
left a comment
There was a problem hiding this comment.
Assessment: 🟡 NEEDS ATTENTION
The fix correctly moves session-store ownership to the backend layer, resolving the use-after-close bug (#2872). One potential data race was found in the new localBackend.Close() method.
| } | ||
|
|
||
| func (b *localBackend) Close() error { | ||
| if b.store == nil { |
There was a problem hiding this comment.
[MEDIUM] Potential data race: Close() reads b.store outside sync.Once
localBackend.Close() reads b.store directly (line 130) without any synchronization, while sessionStore() writes b.store inside storeOnce.Do(...). The Go memory model only guarantees that the write to b.store inside the Do closure is visible after Do returns — it does not provide a happens-before guarantee for concurrent reads of b.store that happen outside of Do.
If Close() races with an in-progress storeOnce.Do() call (e.g., a session is still being initialized when shutdown begins), the Go race detector would flag this.
Suggested fix — either use storeOnce.Do to query the store in Close() as well, or protect b.store with a mutex:
func (b *localBackend) Close() error {
b.storeOnce.Do(func() {}) // ensure the write, if any, has completed
if b.store == nil {
return nil
}
return b.store.Close()
}Or, more robustly, add a sync.Mutex to guard access to b.store in both sessionStore and Close.
aheritier
left a comment
There was a problem hiding this comment.
Clean fix for the shared-store double-close bug (#2872). Ownership transfer to localBackend is the right architectural move, defer ordering in runOrExec is correct (LIFO ensures cleanup() runs before b.Close()), and the regression test in close_test.go directly exercises the fixed invariant. CI is green.
Two minor non-blocking observations (inline):
- Sticky
sync.Oncefailure — ifsessionStore()errors on first call theOnceis consumed; retries permanently get the cached error. Not a real problem given the current single-call graph, but worth noting for future callers. b.store == nilguard inClose()is technically unreachable after a successfulDo(), but it is correct and defensively handles the no-CreateSession-called path.
Pre-existing gap (out of scope): spawned runtimes' cleanup closure does not call localRt.Close(), so bgAgents.StopAll() is not invoked for spawned runtimes. Not introduced here; worth a follow-up issue.
| // first use. The store is shared by the initial runtime and by every | ||
| // runtime spawned by [localBackend.Spawner]. | ||
| func (b *localBackend) sessionStore(req runtime.CreateSessionRequest) (session.Store, error) { | ||
| b.storeOnce.Do(func() { |
There was a problem hiding this comment.
Sticky sync.Once failure (low severity): if ExpandHomeDir or NewSQLiteSessionStore errors here, storeOnce is permanently consumed and every subsequent call to sessionStore() will return (nil, storeErr) — even if the underlying problem has been corrected (e.g., a retry with a different req). In the current call graph CreateSession is called exactly once, so this is not a practical issue. Flagging in case future callers add retry logic.
| } | ||
|
|
||
| func (b *localBackend) Close() error { | ||
| if b.store == nil { |
There was a problem hiding this comment.
Nil guard is correct and defensive. b.store can only be nil here if CreateSession was never called (e.g., an early-exit code path) or if storeOnce recorded an error without setting b.store. The check is not racy: b.Close() is invoked from the defer registered after selectBackend(), by which point any concurrent Do() invocations have completed. Technically redundant on the happy path, but harmless and good defensive practice.
The TUI was failing to open
/sessionsafter switching to a session with a different working directory, with the error "failed to load sessions: sql: database is closed". This happened because when the TUI swapped in a new runtime for the different working directory, it canceled the old runtime's context and ran cleanup, which calledrt.Close()→r.sessionStore.Close(). However, the spawned new runtime was given the same SQLite store (viacreateSessionSpawnerreusingrt.SessionStore()), so closing it from the old runtime broke every subsequent read against that store.The fix moves session store ownership to the backend layer, which now owns its lifecycle independently. The
Runtime.Close()contract is updated to make explicit that the session store is owned by the embedder and may be shared.localBackendnow lazily opens the SQLite store on firstCreateSessioncall (viasync.Once) and closes it once in a newbackend.Close()method. The runtime no longer closes the store it was given, and the TUI defersb.Close()to clean up resources when the entire session ends.Fixes #2872