chore(deps): update nvidia-dcgm (patch) #8659
force-pushed from 1c34d5c to 51e8bdd
AgentBaker Linux PR gate — E2E failure (mixed: 3 leaves shared infra; 1 ACL leaf likely on main)
Group A — shared infra/test-fixture issue, NOT this PR (3 leaves):
Group B — ACL FIPS TL leaf, very likely existing main regression (2 leaves):
Confidence: HIGH that this PR is not the cause of either failure group. Recommended next action:
Strongest alternative (less likely): transient ACR-private-endpoint outage for Group A plus intermittent ACL firewall rule timing for Group B. Refuted because each pattern now reproduces deterministically on every recent PR build.
Posted by Clawpilot AgentBaker gate detective.
force-pushed from 51e8bdd to f4fe0e0
force-pushed from f4fe0e0 to 6310c4d
force-pushed from 6310c4d to d445cbf
AgentBaker Linux PR gate — 236-failure run: shared cluster fleet outage continues (test-infra, NOT this PR)
Same shared cluster fleet outage affecting every concurrent PR in this window: 123×
Cross-PR pattern this morning: PR #8652 build 167419663, PR #8679 build 167421198, PR #8294 build 167422687, and concurrent PRs all hit the identical 236-fail / cluster-not-ready signature.
Build-vs-test: test-infra (shared cluster fleet outage), NOT product, NOT PR-caused.
Recommended next action / owner:
Posted by Clawpilot AgentBaker gate detective.
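The cross-PR reasoning in this comment (the same failure signature on several unrelated PRs points at shared test infrastructure rather than at any one PR) can be illustrated with a small sketch. It is not the gate detective's actual logic: the function name `shared_signatures`, the `min_prs` threshold, and the signature string are hypothetical. Only the PR and build numbers come from the comment.

```python
from collections import defaultdict

def shared_signatures(runs, min_prs=2):
    """Return failure signatures seen on at least `min_prs` distinct PRs.

    `runs` is an iterable of (pr_number, build_id, signature) tuples.
    The threshold is an illustrative assumption, not a documented rule.
    """
    prs_by_signature = defaultdict(set)
    for pr_number, _build_id, signature in runs:
        prs_by_signature[signature].add(pr_number)
    return {
        signature: sorted(prs)
        for signature, prs in prs_by_signature.items()
        if len(prs) >= min_prs
    }

# PR and build numbers as quoted in the comment; the signature label is a
# hypothetical shorthand for "236-fail / cluster-not-ready".
runs = [
    (8652, 167419663, "236-fail/cluster-not-ready"),
    (8679, 167421198, "236-fail/cluster-not-ready"),
    (8294, 167422687, "236-fail/cluster-not-ready"),
]

print(shared_signatures(runs))
```

A signature that shows up on only one PR would be left out of the result, which is the case where a PR-specific cause deserves a closer look.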
force-pushed from d445cbf to 313d630
AgentBaker Linux PR gate — 3 distinct E2E failures, all test-infra / shared-cluster (NOT this PR)
Detective summary: two independent signatures
(1) The validator asserts that the node's iptables rule blocks egress to
(2) The CSE retries
Classification: Test infrastructure / shared-cluster issues. Neither failure is reachable from changes in PR #8659 (renovate nvidia-dcgm patch; does not touch wireserver/iptables, CSE, apt sources, or the proxy fixture).
Confidence: High for both (multiple subtests, identical signatures, no PR-relevant changed files, all VHD builds passed).
Strongest alternative theory: Recent change to
Recommended next action / owner:
Evidence used: failed task log (5
force-pushed from 551d5fc to 17cf3b4
force-pushed from 17cf3b4 to 9d38eb2
AgentBaker Linux gate detective
Run: https://msazure.visualstudio.com/CloudNativeCompute/_build/results?buildId=168659199
RCA: This is the recurring NetworkIsolated shared-cluster cleanup flake. E2E found the existing cluster abe2e-azure-networkisolated-v3-d6cc9 already in Failed/Deleting state and failed before PR-specific scenario validation. Prior evidence for this same cluster shows ResourceGroupDeletionBlocked / InUseNetworkSecurityGroupCannotBeDeleted on abe2e-networkisolated-securityGroup, which is still attached to the shared VNET subnet.
Confidence: High. Corroborated by timeline/status, Run AgentBaker E2E log 538, unrelated PRs hitting the same cluster/signature, and the existing repair item #38506740.
Strongest alternative: the nvidia-dcgm Renovate payload caused the E2E provisioning failure; less likely because the failure occurs while acquiring a pre-existing shared NetworkIsolated cluster that is already in a bad lifecycle state.
Recommended owner/action: AgentBaker E2E test-infra owner: clean/quarantine abe2e-azure-networkisolated-v3-d6cc9, remove the stuck NSG/subnet association, and harden shared-cluster cleanup/retry.
Wiki signature:
AgentBaker Linux gate detective
Run: https://msazure.visualstudio.com/CloudNativeCompute/_build/results?buildId=168575141
RCA: E2E infra failure: multiple shared clusters were already Failed/Deleting and cleanup was blocked by ResourceGroupDeletionBlocked / InUseRouteTableCannotBeDeleted on route table abe2e-fw-rt. This is not caused by the nvidia-dcgm Renovate change.
Confidence: High for the primary signature. Corroborated by timeline/status, focused failed logs, associated changes, and the flakiness wiki before publishing.
Strongest alternative: nvidia-dcgm package update caused E2E failure; less likely because the failure is pre-validation shared-cluster lifecycle state across unrelated clusters.
Recommended owner/action: AgentBaker E2E test-infra: clean stale shared clusters and unblock route table dependencies.
Wiki signature: $(System.Collections.Hashtable.sig) (source of truth)
force-pushed from 9d38eb2 to 616997f
AgentBaker Linux gate detective
Run: https://msazure.visualstudio.com/CloudNativeCompute/_build/results?buildId=168775762
AgentBaker Linux gate detective
Run: https://msazure.visualstudio.com/CloudNativeCompute/_build/results?buildId=168796845
force-pushed from 616997f to d56e0a5
force-pushed from d56e0a5 to f8308ed
force-pushed from f8308ed to a1fb73b
force-pushed from a1fb73b to 650c82e
AgentBaker Linux gate detective
Run: https://msazure.visualstudio.com/CloudNativeCompute/_build/results?buildId=168928859
AgentBaker Linux gate detective
Run: https://msazure.visualstudio.com/CloudNativeCompute/_build/results?buildId=168921966
AgentBaker Linux gate detective
Run: https://msazure.visualstudio.com/CloudNativeCompute/_build/results?buildId=168889354
AgentBaker Linux gate detective
Run: https://msazure.visualstudio.com/CloudNativeCompute/_build/results?buildId=168868981
force-pushed from 650c82e to ca9b5cc
@C:\Users\SYLVAI~1\AppData\Local\Temp\ab-gate-pr8659-169034678-comment.md
force-pushed from ca9b5cc to cbbdcc7
@C:\Users\SYLVAI~1\AppData\Local\Temp\ab-gate-pr8659-169061016-comment.md
This PR contains the following updates:
4.8.2-1.azl3 → 4.8.2-3.azl3
4.8.2-ubuntu24.04u1 → 4.8.2-ubuntu24.04u3
4.8.2-ubuntu22.04u1 → 4.8.2-ubuntu22.04u3
Warning
Some dependencies could not be looked up. Check the Dependency Dashboard for more information.
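For reviewers who want to confirm that the three version changes in this description keep the upstream version (4.8.2) and bump only the packaging revision, a minimal Python sketch follows. The helper names `split_version` and `revision_only` are hypothetical, and splitting at the last hyphen is a simplifying assumption; it is not how dpkg, rpm, or Renovate formally compare versions.

```python
def split_version(version):
    """Split 'upstream-revision' at the last hyphen (simplifying assumption)."""
    upstream, sep, revision = version.rpartition("-")
    if not sep:
        # No hyphen: treat the whole string as the upstream version.
        return version, ""
    return upstream, revision

def revision_only(old, new):
    """True when the upstream part is unchanged and only the revision differs."""
    old_upstream, old_revision = split_version(old)
    new_upstream, new_revision = split_version(new)
    return old_upstream == new_upstream and old_revision != new_revision

# Version pairs copied from the PR description.
updates = [
    ("4.8.2-1.azl3", "4.8.2-3.azl3"),
    ("4.8.2-ubuntu24.04u1", "4.8.2-ubuntu24.04u3"),
    ("4.8.2-ubuntu22.04u1", "4.8.2-ubuntu22.04u3"),
]

for old, new in updates:
    print(f"{old} -> {new}: revision-only change = {revision_only(old, new)}")
```

A revision-only bump is consistent with the gate comments above, which attribute the E2E failures to shared test infrastructure rather than to the package payload, though it does not by itself rule out a packaging regression.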
Configuration
📅 Schedule: (UTC)
🚦 Automerge: Disabled by config. Please merge this manually once you are satisfied.
♻ Rebasing: Whenever PR becomes conflicted, or you tick the rebase/retry checkbox.
🔕 Ignore: Close this PR and you won't be reminded about these updates again.
This PR was generated by Mend Renovate. View the repository job log.