Fix:unbounded sql pools by gurudatta-patil · Pull Request #9891 · temporalio/temporal

gurudatta-patil · 2026-04-09T16:25:30Z

What changed?

DatabaseHandle.reconnect() in db_handle.go had two closely related bugs that caused an unbounded accumulation of *sql.DB pools during sustained DB unavailability:

1. Pool destroyed before the throttle check. The old pool was cleared (h.db.Store(nil)) before the throttle check ran. When the call was throttled, no new pool was created, so h.db stayed nil for the remainder of the 1-second throttle window. Every caller in that window got DatabaseUnavailableError, which triggered another ConvertError → reconnect(true) → destroy pool → throttled → nil cycle, and this loop repeated for the entire outage.
2. New pool created before the old one was closed. On each un-throttled reconnect, a fresh *sql.DB was opened while the previous one was closed asynchronously. During a 2-3 minute outage (~150 throttle windows), ~150 generations of pools accumulated. On recovery, all of them raced to open connections simultaneously, exceeding maxConns by a factor of ~150.
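
To make the interaction of the two bugs concrete, here is a minimal Go sketch of the pre-fix ordering. It is a model, not the code from db_handle.go: fakePool, oldHandle, and simulateOldOutage are hypothetical names, fakePool stands in for *sql.DB so no driver is needed, and the 1-second throttle interval and ten callers per second are assumptions chosen to match the numbers above.

```go
package main

import (
	"sync/atomic"
	"time"
)

// fakePool stands in for *sql.DB so the sketch needs no database driver.
type fakePool struct{ id int }

// oldHandle models the pre-fix ordering. All names are hypothetical.
type oldHandle struct {
	db          atomic.Pointer[fakePool]
	lastAttempt time.Time
	minInterval time.Duration
	opened      int // pool generations created so far
}

// reconnect drops the current pool first and only then checks the throttle.
func (h *oldHandle) reconnect(now time.Time) *fakePool {
	h.db.Store(nil) // bug 1: the pool is gone before we know a new one is coming
	if now.Sub(h.lastAttempt) < h.minInterval {
		return nil // throttled: callers observe "database unavailable"
	}
	h.lastAttempt = now
	h.opened++ // bug 2: a new generation on every un-throttled attempt
	p := &fakePool{id: h.opened}
	h.db.Store(p)
	return p
}

// simulateOldOutage calls reconnect at a fixed rate for the given number of
// seconds and reports how many pools were opened and how many calls got nil.
func simulateOldOutage(seconds, callsPerSecond int) (opened, nilReturns int) {
	h := &oldHandle{minInterval: time.Second}
	base := time.Unix(0, 0)
	step := time.Second / time.Duration(callsPerSecond)
	for i := 0; i < seconds*callsPerSecond; i++ {
		if h.reconnect(base.Add(time.Duration(i)*step)) == nil {
			nilReturns++
		}
	}
	return h.opened, nilReturns
}
```

In this model, simulateOldOutage(150, 10) reports 150 pools opened and 1350 nil returns: one new generation per throttle window, with every other call in the window seeing an empty handle. The model only counts generations; in the real handle it is the asynchronous close of each previous *sql.DB that lets those generations overlap.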

The fix: clear the old pool and call go prevConn.Close() only after a new connection has been successfully established, and return the existing pool when throttled instead of returning nil.
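
The reordered logic can be sketched as follows. This is an illustration under stated assumptions rather than the change in this PR: pool, handle, the connect factory, and simulateFixedOutage are hypothetical names, pool stands in for *sql.DB, and the old pool is closed synchronously here for simplicity where the description above closes it in a goroutine.

```go
package main

import (
	"errors"
	"sync"
	"sync/atomic"
	"time"
)

// pool stands in for *sql.DB; closed records whether Close was called.
type pool struct {
	id     int
	closed bool
}

func (p *pool) Close() { p.closed = true }

// handle models the post-fix ordering. All names are hypothetical.
type handle struct {
	mu          sync.Mutex
	db          atomic.Pointer[pool]
	lastAttempt time.Time
	minInterval time.Duration
	connect     func() (*pool, error) // assumed factory for a new pool
}

func (h *handle) reconnect(now time.Time) *pool {
	h.mu.Lock()
	defer h.mu.Unlock()
	if now.Sub(h.lastAttempt) < h.minInterval {
		return h.db.Load() // throttled: keep serving the existing pool
	}
	h.lastAttempt = now
	next, err := h.connect()
	if err != nil {
		return h.db.Load() // attempt failed: the old pool stays in place
	}
	// Only now is the old pool replaced and closed.
	if prev := h.db.Swap(next); prev != nil {
		prev.Close()
	}
	return next
}

// simulateFixedOutage keeps the database down for the given duration, then
// lets one reconnect succeed. It reports pools opened, nil returns seen
// during the outage, and whether the original pool was eventually closed.
func simulateFixedOutage(seconds, callsPerSecond int) (opened, nilReturns int, oldClosed bool) {
	up := false
	h := &handle{minInterval: time.Second}
	h.connect = func() (*pool, error) {
		if !up {
			return nil, errors.New("database unavailable")
		}
		opened++
		return &pool{id: opened}, nil
	}
	initial := &pool{}
	h.db.Store(initial)
	base := time.Unix(0, 0)
	step := time.Second / time.Duration(callsPerSecond)
	total := seconds * callsPerSecond
	for i := 0; i < total; i++ {
		if h.reconnect(base.Add(time.Duration(i)*step)) == nil {
			nilReturns++
		}
	}
	up = true // database recovers
	h.reconnect(base.Add(time.Duration(total) * step))
	return opened, nilReturns, initial.closed
}
```

With the same assumed load as before, simulateFixedOutage(150, 10) opens a single new pool, never hands a caller nil, and closes the original pool only once its replacement exists.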

Why?

Fixes #9747

How did you test it?

built
added new unit tests: TestReconnectPoolAccumulationDuringOutage and TestReconnectNilPoolOnThrottle in db_handle_test.go directly reproduce both failure modes

Potential risks

The old pool is now kept alive until a successful reconnect, so callers may briefly continue to use a pool whose connections are failing rather than getting DatabaseUnavailableError immediately. This is intentional: ConvertError still detects individual connection errors and retriggers reconnect(true), and the throttle ensures at most one reconnect attempt per second. The overall behaviour is strictly better under sustained outages.

gurudatta-patil added 2 commits April 8, 2026 14:16

unbounded sql pools

6053e89

Merge branch 'main' into fix/unbounded-sql-pools

83ff3bf

gurudatta-patil requested review from a team as code owners April 9, 2026 16:25

Merge branch 'main' into fix/unbounded-sql-pools

43af0a2

gurudatta-patil wants to merge 3 commits into temporalio:main from gurudatta-patil:fix/unbounded-sql-pools
