Protect first C# diagnostic computation from cancellation to fix bimodal allocation in speedometer#13011
Draft
Protect first C# diagnostic computation from cancellation to fix bimodal allocation in speedometer#13011
Conversation
…dal allocation in speedometer The RazorEditingTests.CompletionInCohostingForComponents speedometer test exhibits bimodal CLR_BytesAllocated_NonDevenv behavior for the html completion scenario. This behavior has existed since the creation of the test. Most speedometer sessions don't experience any bad runs, but sometimes 1 or 2 bad runs occur where there is a marked increase in non-devenv allocations, and seemingly sdk insertions hit this hard, with all five runs experiencing the poor allocation behavior. Inspection of the traces leads to a large number of operation cancelled exceptions in the bad traces, leading to partially completed diagnostic requests to be tossed. AI indicated that Roslyn's IncrementalMemberEditAnalyzer has a heavy performance cost until it's completed once for a file. With the test hammering away typing and committing completions, the diagnostic requests get cancelled before any can complete. *** AI's description of this change *** The root cause is a bistable race condition in Roslyn's IncrementalMemberEditAnalyzer. This analyzer caches the last successfully analyzed document in a single WeakReference<Document?> field. When cached, subsequent analyses are incremental (~50ms). When not cached, every analysis is a full recomputation (~1-3s). The problem: full analyses are slow enough to be cancelled by the next incoming diagnostic request, preventing the cache from ever being set — a self-reinforcing failure mode. Razor bypasses Roslyn's pull diagnostics infrastructure (which has its own protections) and calls DiagnosticAnalyzerService.GetDiagnosticsForSpanAsync directly with the caller's cancellation token. The VS LSP client cancels in-flight diagnostic requests whenever a new request is issued, creating the rapid cancellation cycle. The fix uses AsyncLazy to ensure the first diagnostic computation per document runs with CancellationToken.None, allowing it to complete and bootstrap the analyzer. Subsequent requests pass through with normal cancellation semantics since the analyzer is now fast. A double-checked locking pattern with a volatile field guards the single-entry bootstrap cache. The initiator returns the bootstrap result directly; concurrent callers await the shared computation then make their own fresh call. On cancellation, the bootstrap is preserved (the factory continues running). On fault, it is cleared under a ReferenceEquals identity guard to avoid clobbering a different document's bootstrap.
Contributor
Author
|
/azp run |
|
Azure Pipelines successfully started running 1 pipeline(s). |
Contributor
Author
|
/azp run |
|
Azure Pipelines successfully started running 1 pipeline(s). |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
*** In draft mode until I get several speedometer runs indicating this addresses the issue ***
The RazorEditingTests.CompletionInCohostingForComponents speedometer test exhibits bimodal CLR_BytesAllocated_NonDevenv behavior for the html completion scenario. This behavior has existed since the creation of the test. Most speedometer sessions don't experience any bad runs, but sometimes 1 or 2 bad runs occur where there is a marked increase in non-devenv allocations SDK insertions seemingly hit this hard, with all five runs experiencing the poor allocation behavior.
Inspection of the traces leads to a large number of operation cancelled exceptions in the bad traces, leading to partially completed diagnostic requests to be tossed. AI indicated that Roslyn's IncrementalMemberEditAnalyzer has a heavy performance cost until it's completed once for a file. With the test hammering away typing and committing completions, the diagnostic requests get cancelled before any can complete.
*** AI's description of this change ***
The root cause is a bistable race condition in Roslyn's IncrementalMemberEditAnalyzer. This analyzer caches the last successfully analyzed document in a single WeakReference<Document?> field. When cached, subsequent analyses are incremental (~50ms). When not cached, every analysis is a full recomputation (~1-3s). The problem: full analyses are slow enough to be cancelled by the next incoming diagnostic request, preventing the cache from ever being set — a self-reinforcing failure mode.
Razor bypasses Roslyn's pull diagnostics infrastructure (which has its own protections) and calls DiagnosticAnalyzerService.GetDiagnosticsForSpanAsync directly with the caller's cancellation token. The VS LSP client cancels in-flight diagnostic requests whenever a new request is issued, creating the rapid cancellation cycle.
The fix uses AsyncLazy to ensure the first diagnostic computation per document runs with CancellationToken.None, allowing it to complete and bootstrap the analyzer. Subsequent requests pass through with normal cancellation semantics since the analyzer is now fast. A double-checked locking pattern with a volatile field guards the single-entry bootstrap cache. The initiator returns the bootstrap result directly; concurrent callers await the shared computation then make their own fresh call. On cancellation, the bootstrap is preserved (the factory continues running). On fault, it is cleared under a ReferenceEquals identity guard to avoid clobbering a different document's bootstrap.
*** Speedometer graph from the last several days of runs ***
