Fix stress pipeline: rename variables to avoid env var injection, simplify runner#4329
Fix stress pipeline: rename variables to avoid env var injection, simplify runner#4329paulmedynski wants to merge 10 commits into
Conversation
… warnOnTestFailure - Move STRESS_CONFIG_FILE from dotnet run -e arg to task env: property to fix DotNetCoreCLI@2 leaking the env var value as an app argument, which hit the default switch case and set RunMode.Help. - Rename failOnTestFailure to warnOnTestFailure (default false) with inverted semantics: false = failures fail the run, true = warnings only. All test steps always execute regardless of this setting.
There was a problem hiding this comment.
Pull request overview
This PR fixes Azure DevOps stress test runs accidentally executing in “Help” mode by moving STRESS_CONFIG_FILE injection from dotnet run -e ... into the supported env: block for DotNetCoreCLI@2. It also renames the failure-handling parameter to warnOnTestFailure and aligns continueOnError semantics with that name.
Changes:
- Set
STRESS_CONFIG_FILEviaenv:on theDotNetCoreCLI@2runtasks (instead ofdotnet run -e ...) to prevent argument leakage into the runner. - Replace
failOnTestFailurewithwarnOnTestFailure(defaultfalse) and updatecontinueOnErroraccordingly. - Add
--no-buildto sharedrunArgumentssodotnet runconsistently uses the already-built outputs.
Reviewed changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated no comments.
| File | Description |
|---|---|
| eng/pipelines/stress/stress-tests-stage.yml | Renames and threads the new warnOnTestFailure parameter into the stage’s OS-specific jobs. |
| eng/pipelines/stress/stress-tests-pipeline.yml | Exposes warnOnTestFailure as a pipeline parameter and passes it to the stress test stage. |
| eng/pipelines/stress/stress-tests-job.yml | Applies warnOnTestFailure to continueOnError and sets STRESS_CONFIG_FILE via task env: for each runtime run. |
- Remove RunMode enum, RunVerify, ExitWithError, and Help mode - Init() returns exit code directly; Main() short-circuits on non-zero - Rename RunStress() to Run() (only remaining path) - Fix GetMdsVersion() to use correct namespace (Microsoft.Data.SqlClient.ThisAssembly) - PrintHelp() dumps received args with quoting for diagnostics
The task's 'command: run' mode puts all 'arguments' content after '--' as application args, ignoring the user-specified '--' separator. This caused dotnet-run flags (--verbosity, --configuration, --no-build, -f) to be passed to the stress runner app instead of to dotnet-run itself. Switch to 'command: custom' + 'custom: run' which passes arguments literally, giving us full control over placement.
1b955b5 to
09a1d25
Compare
…gression dotnet run in SDK 10.0.300 incorrectly forwards its own options (--verbosity, --configuration, --no-build) as application arguments on the CI agent (Ubuntu 22.04). This cannot be reproduced locally with the same SDK version but manifests consistently on hosted agents. Since we build separately anyway (--no-build), switch to dotnet exec which directly invokes the built DLL without any argument parsing ambiguity.
- Remove stale -all and -verify options from PrintHelp (no run modes remain) - Clarify netTestRuntimes comment to note net10.0 is tested but not shipped - Use CmdLine@2 for .NET Framework test loop (dotnet exec cannot run net462 exe)
The dotnet CLI's System.CommandLine reads environment variables named
{COMMAND}ARGUMENTS (e.g. RUNARGUMENTS, BUILDARGUMENTS, TESTARGUMENTS)
and silently injects their content into parsed arguments. Since ADO
exposes all pipeline variables as uppercased env vars, variables named
'runArguments', 'buildArguments', or 'testArguments' caused the stress
runner to receive SDK options in its args[], triggering Help mode.
Changes:
- Rename buildArguments -> dotnetBuildOpts
- Rename testArguments -> stressTestOpts
- Add dotnetRunOpts variable for shared dotnet run options
- Revert dotnet exec back to dotnet run (the real fix is the rename)
- Use ${{ variables.* }} for compile-time expansion
- Document the gotcha in ado-pipelines.instructions.md
- Upgraded the Runner to use System.CommandLine instead of trying to parse arguments by hand.
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 53 out of 53 changed files in this pull request and generated 1 comment.
Comments suppressed due to low confidence (1)
src/Microsoft.Data.SqlClient/tests/StressTests/SqlClient.Stress.Runner/Monitor/MonitorLoadUtils.cs:22
mainAssembly.GetType(classname)can return null (and in the current layout it will, sincemainAssemblyis the Monitoring assembly that containsIMonitorLoader, but it does not defineMicrosoft.Data.SqlClient.Test.Stress.MonitorLoader). The subsequentt.GetInterfaces()will throw aNullReferenceException, and enabling monitoring will fail with an unhelpful crash. Add an explicit null check with a clear error so failures are actionable (and ensure the expectedIMonitorLoaderimplementation type exists in the target assembly).
|
|
||
| ## Variable Naming — Avoid `{COMMAND}ARGUMENTS` Names | ||
|
|
||
| The dotnet CLI (via System.CommandLine) reads environment variables named `{COMMAND}ARGUMENTS` and silently injects their content into the parsed arguments for that subcommand. Because Azure DevOps automatically exposes all pipeline variables as uppercased environment variables, a pipeline variable named `runArguments` becomes `RUNARGUMENTS`, which `dotnet run` reads and injects into the application's `args[]` — bypassing the `--` separator. |
There was a problem hiding this comment.
This took many hours to figure out.
| # True to enable debugging steps. | ||
| - name: debug | ||
| type: boolean | ||
| default: false |
There was a problem hiding this comment.
Avoiding template parameter default values except at the top level.
| # warnings but do not fail the overall pipeline run. | ||
| - name: failOnTestFailure | ||
| displayName: Fail pipeline on test failure | ||
| # When true, test failures produce warnings (SucceededWithIssues) but do not fail the pipeline. |
There was a problem hiding this comment.
This flips the default behaviour - now the entire pipeline run will fail if any tests steps fail.
| command: build | ||
| projects: $(project) | ||
| arguments: $(buildArguments) | ||
| arguments: ${{ variables.dotnetBuildOpts }} |
There was a problem hiding this comment.
Prefer to expand static variables at pipeline expansion time, rather than runtime.
| condition: and(succeededOrFailed(), eq(variables['buildSucceeded'], 'true')) | ||
| continueOnError: ${{ not(parameters.failOnTestFailure) }} | ||
| continueOnError: ${{ parameters.warnOnTestFailure }} | ||
| env: |
There was a problem hiding this comment.
This is a more robust way of injecting environment variables, rather than using the -e argument.
| # Triggering pipeline: MDS Main CI (branch internal/main only) | ||
| # Pipeline name: sqlclient-stress | ||
| # Pipeline URL: TODO: add URL when pipeline is created | ||
| # Pipeline URL: https://dev.azure.com/SqlClientDrivers/ADO.Net/_build?definitionId=2284 |
There was a problem hiding this comment.
I created a sqlclient-stress pipeline in the ADO.Net project, so this will now trigger on successful completion of the MDS Main CI pipeline runs in that project.
| - name: saPassword | ||
| value: $[stageDependencies.secrets_stage.secrets_job.outputs['SaPassword.Value']] | ||
|
|
||
| # The 1ES pool name, determined automatically by ADO project: |
There was a problem hiding this comment.
This is a superior approach to choosing pool name. We should plan to update our other cross-project pipelines, and remove the variable group $(ci_var_poolName) approach. Our pipelines need to explicitly support all target Azure DevOps projects anyway, so no harm in hardcoding their names and pools.
There was a problem hiding this comment.
Hardcoding 1ES pool or image names can be tedious to manage and the more pipelines we get - the name of images can get more widespread and lost in the files.
I would suggest we use a structured variable file (single 1ES artifact) that contains only 1ES pool and image names so if you need to identify and list all the 1ES pools and images being used in this repo, you can gather them from 1 location, or if any one image needs to replaced by something new you don't need to hunt down all pipelines but can update it from one location.
There was a problem hiding this comment.
Captured as a future work item: https://sqlclientdrivers.visualstudio.com/ADO.Net/_workitems/edit/45495
| using System.Collections.Generic; | ||
|
|
||
| namespace Monitoring | ||
| namespace Microsoft.Data.SqlClient.Test.Stress |
There was a problem hiding this comment.
I gave all of these classes a proper namespace.
| public static void Init(string[] args) | ||
| static int Main(string[] args) | ||
| { | ||
| for (int i = 0; i < args.Length; i++) |
There was a problem hiding this comment.
The manual command-line parsing contributed to problems diagnosing the root cause, so I modernized it all with System.CommandLine.
Codecov Report✅ All modified and coverable lines are covered by tests. Additional details and impacted files@@ Coverage Diff @@
## main #4329 +/- ##
==========================================
- Coverage 66.69% 64.94% -1.75%
==========================================
Files 284 279 -5
Lines 43238 66069 +22831
==========================================
+ Hits 28836 42908 +14072
- Misses 14402 23161 +8759
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
Summary
Fixes stress tests running in Help mode on CI instead of executing tests. The root cause was the dotnet CLI's System.CommandLine silently injecting pipeline variable content into application arguments.
Root Cause
The dotnet CLI reads environment variables named
{COMMAND}ARGUMENTS(e.g.RUNARGUMENTS,BUILDARGUMENTS,TESTARGUMENTS) and prepends their content to the parsed arguments for that subcommand. Azure DevOps automatically exposes all pipeline variables as uppercased environment variables, so a pipeline variable namedbuildArgumentsbecameBUILDARGUMENTS, andtestArgumentsbecameTESTARGUMENTS— causing their content to leak into the application'sargs[]and triggering Help mode.This affects all .NET SDK versions (8.0+) and is invisible in
[command]log lines.You can see the pipeline executing the tests now:
They aren't passing, but that's a problem for another PR.
Changes
Pipeline Default Behaviour
Pipeline variable rename (
stress-tests-job.yml)buildArguments→dotnetBuildOptstestArguments→stressTestOptsdotnetRunOptsvariable for shareddotnet runoptions${{ variables.* }}compile-time expansion for all static variables{COMMAND}ARGUMENTSenv var footgunPool name auto-selection (
stress-tests-stage.yml)ADO.Net→ADO-1ES-Poolpublic→ADO-CI-1ES-Pool${{ if eq(variables['System.TeamProject'], 'ADO.Net') }}at compile timeStress runner simplification (
Program.cs)RunMode.RunOne,RunMode.Profile, XML config)PlatformAttribute(unused)RunAllandHelpmodesPipeline structure (
stress-tests-pipeline.yml,stress-tests-stage.yml)netFrameworkTestRuntimes: []for Linux/macOS jobsInstructions update (
.github/instructions/ado-pipelines.instructions.md){COMMAND}ARGUMENTSenv var injection gotchaVerification
env:property correctly setsSTRESS_CONFIG_FILEfor the spawned process