Test | Fix Transient Fault handling and other flaky unit tests#4080
Merged
Test | Fix Transient Fault handling and other flaky unit tests#4080
Conversation
Contributor
There was a problem hiding this comment.
Pull request overview
Updates the simulated TDS server test infrastructure and transient-fault unit tests to avoid flaky assertions on exact PreLogin counts on .NET Framework where internal TNIR/interval-timer retries can introduce extra pre-login attempts.
Changes:
- Add
Login7Counttracking toGenericTdsServerand adjustTransientTdsErrorTdsServerto increment it when bypassingbase.OnLogin7Request. - Introduce
AbandonedPreLoginCount(PreLogins not followed by Login7) for more deterministic test assertions. - Remove
flakytraits from transient-fault simulated server tests and update assertions to exclude “abandoned” prelogins.
Reviewed changes
Copilot reviewed 6 out of 6 changed files in this pull request and generated 5 comments.
Show a summary per file
| File | Description |
|---|---|
| src/Microsoft.Data.SqlClient/tests/tools/TDS/TDS.Servers/TransientTdsErrorTdsServer.cs | Increments Login7 counter on error path where base handler is skipped. |
| src/Microsoft.Data.SqlClient/tests/tools/TDS/TDS.Servers/GenericTdsServer.cs | Adds Login7 and abandoned-prelogin counting to support deterministic login-attempt assertions. |
| src/Microsoft.Data.SqlClient/tests/UnitTests/SimulatedServerTests/ConnectionTests.cs | Updates transient-fault assertions to use adjusted login-attempt counting; removes flaky traits. |
| src/Microsoft.Data.SqlClient/tests/UnitTests/SimulatedServerTests/ConnectionRoutingTestsAzure.cs | Updates routing transient-fault assertions to use adjusted login-attempt counting; removes flaky trait. |
| src/Microsoft.Data.SqlClient/tests/UnitTests/SimulatedServerTests/ConnectionRoutingTests.cs | Updates routing transient-fault assertions to use adjusted login-attempt counting; removes flaky trait. |
| src/Microsoft.Data.SqlClient/tests/UnitTests/SimulatedServerTests/ConnectionFailoverTests.cs | Updates transient-fault login-attempt assertion to use adjusted counting; removes flaky trait. |
src/Microsoft.Data.SqlClient/tests/tools/TDS/TDS.Servers/GenericTdsServer.cs
Show resolved
Hide resolved
src/Microsoft.Data.SqlClient/tests/UnitTests/SimulatedServerTests/ConnectionTests.cs
Show resolved
Hide resolved
src/Microsoft.Data.SqlClient/tests/UnitTests/SimulatedServerTests/ConnectionRoutingTests.cs
Show resolved
Hide resolved
...Microsoft.Data.SqlClient/tests/UnitTests/SimulatedServerTests/ConnectionRoutingTestsAzure.cs
Show resolved
Hide resolved
src/Microsoft.Data.SqlClient/tests/UnitTests/SimulatedServerTests/ConnectionFailoverTests.cs
Show resolved
Hide resolved
src/Microsoft.Data.SqlClient/tests/UnitTests/Microsoft/Data/Common/MultipartIdentifierTests.cs
Show resolved
Hide resolved
src/Microsoft.Data.SqlClient/tests/tools/TDS/TDS.Servers/GenericTdsServer.cs
Show resolved
Hide resolved
src/Microsoft.Data.SqlClient/tests/UnitTests/SimulatedServerTests/ConnectionFailoverTests.cs
Show resolved
Hide resolved
paulmedynski
previously approved these changes
Mar 24, 2026
Codecov Report✅ All modified and coverable lines are covered by tests. Additional details and impacted files@@ Coverage Diff @@
## main #4080 +/- ##
==========================================
- Coverage 73.22% 68.42% -4.81%
==========================================
Files 280 275 -5
Lines 43000 66924 +23924
==========================================
+ Hits 31486 45791 +14305
- Misses 11514 21133 +9619
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
mdaigle
previously approved these changes
Mar 24, 2026
e7dd518
src/Microsoft.Data.SqlClient/tests/UnitTests/SimulatedServerTests/ConnectionFailoverTests.cs
Show resolved
Hide resolved
paulmedynski
approved these changes
Mar 25, 2026
mdaigle
approved these changes
Mar 25, 2026
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
[Attempted fix with Copilot]
The Problem
When
SqlConnection.Open()runs on .NET Framework (net462), the internalLoginNoFailovermethod enters parallel/interval-timer mode by default because TNIR (Transparent Network IP Resolution) is enabled.LoginWithFailoveralways uses interval timers on all platforms.These interval timers give each login attempt only a small fraction of the total
ConnectTimeout:LoginWithFailoverFailoverTimeoutStep)LoginNoFailover+ TNIRFailoverTimeoutStepForTnir)On a busy Windows CI machine, when one of these interval timers expires mid-login, the client disconnects and retries internally inside the login loop — before the outer transient-fault retry loop ever sees the error. This produces an extra
PreLoginon the TDS server that never gets a correspondingLogin7(the client abandoned the connection).Normal flow (2 PreLogins):
Busy CI flow (3 PreLogins):
The tests asserted
Assert.Equal(2, server.PreLoginCount)— an exact count that doesn't account for these platform-specific internal retries. On .NET (net8.0/net9.0), TNIR doesn't exist andLoginNoFailoveronly uses interval timers whenMultiSubnetFailover=true, so the tests pass reliably there.The fix adds a
Login7Countcounter to the test TDS server so we can computeAbandonedPreLoginCount(PreLogins with no Login7), then asserts onPreLoginCount - AbandonedPreLoginCount— giving an exact, deterministic count regardless of how many internal timeout retries occurred.Additional fixes
[xUnit.net 00:00:03.53] UnitTests: Skipping test case with duplicate ID '287fad04d43fa7e5d02c3f11743f0138e5e6db40' ('Microsoft.Data.Common.UnitTests.MultipartIdentifierTests.SinglePartIdentifierWithBracketsAndWhiteSpace_ParsesSuccessfully(partIdentifier: "word1", expected: ["word1"])' and 'Microsoft.Data.Common.UnitTests.MultipartIdentifierTests.SinglePartIdentifierWithBracketsAndWhiteSpace_ParsesSuccessfully(partIdentifier: "word1", expected: ["word1"])')Fixes failover port test failure, by ensuring connection pools are cleared before connecting to failover partner.
Fixes SqlTypeWorkaroundsTests discovery by setting
DisableDiscoveryEnumerationto true.