CORS-3997: azure: update default instance types from v3 to v5#10565
CORS-3997: azure: update default instance types from v3 to v5#10565sdodson wants to merge 1 commit into
Conversation
|
No actionable comments were generated in the recent review. 🎉 ℹ️ Recent review info⚙️ Run configurationConfiguration used: Repository: openshift/coderabbit/.coderabbit.yaml Review profile: CHILL Plan: Enterprise Run ID: 📒 Files selected for processing (4)
WalkthroughDefaults for Azure x86 VM SKUs were bumped from Dv3 to Dv5: control plane default now D8s_v5, compute default now D4s_v5. The change also updates UPI ARM template defaults and the Azure user limits documentation; ARM64 and StackCloud overrides unchanged. ChangesAzure Default Instance Types
🎯 2 (Simple) | ⏱️ ~10 minutes 🚥 Pre-merge checks | ✅ 12✅ Passed checks (12 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Warning There were issues while running some tools. Please review the errors and either fix the tool's configuration or disable the tool if it's a critical failure. 🔧 golangci-lint (2.12.2)Error: can't load config: unsupported version of the configuration: "" See https://golangci-lint.run/docs/product/migration-guide for migration instructions Comment |
|
@sdodson: This pull request references CORS-3997 which is a valid jira issue. Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target the "5.0.0" version, but no target version was set. DetailsIn response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
There was a problem hiding this comment.
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (2)
pkg/types/azure/defaults/machines.go (2)
29-41:⚠️ Potential issue | 🟠 Major | ⚡ Quick winUpdate ARM templates and documentation to use Standard_D4s_v5 to match the code change.
While
Standard_D4s_v5exists with the specified 4 vCPU and 16 GB RAM, the default instance type change is incomplete. The following files still hardcodeStandard_D4s_v3:
upi/azure/06_workers.json(line 43)upi/azure/04_bootstrap.json(line 35)docs/user/azure/limits.md(lines 47, 61)pkg/asset/installconfig/azure/validation_test.go(lines 44, 75, 77, 1501, 1525)These must be updated to maintain consistency with the new default.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@pkg/types/azure/defaults/machines.go` around lines 29 - 41, The default compute size was changed in ComputeInstanceType (size "D4s_v5" / "D4ps_v5" for ARM64 and fallback to "DS3_v2" for azure.StackCloud) but JSON templates and docs still reference Standard_D4s_v3; update every occurrence of Standard_D4s_v3 to Standard_D4s_v5 (or Standard_D4ps_v5 for ARM64 where applicable) in ARM templates, bootstrap/worker templates, documentation, and validation tests so they match the logic used by ComputeInstanceType and the instanceType/getInstanceClass behavior; run/update the affected validation_test.go expectations to reflect the new default.
12-24:⚠️ Potential issue | 🟠 Major | ⚡ Quick winUpdate hardcoded references to maintain consistency across templates, tests, and documentation.
Standard_D8s_v5 exists and has the claimed specifications (8 vCPUs, 32 GB RAM). However, hardcoded references to the old Standard_D8s_v3 type remain in non-vendor files that should be updated to match the new defaults:
upi/azure/05_masters.json(line 50): defaultValue still references "Standard_D8s_v3"pkg/asset/installconfig/azure/validation_test.go(multiple lines): test fixtures reference the old typedocs/user/azure/limits.md(lines 48, 53): documentation mentions Standard_D8s_v3 master nodes🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@pkg/types/azure/defaults/machines.go` around lines 12 - 24, The templates, tests, and docs still reference the old Standard_D8s_v3 while ControlPlaneInstanceType now sets size to "D8s_v5" (and "D8ps_v5" for ARM64, "DS4_v2" for StackCloud); update all hardcoded references to the old instance type to match the new default (use Standard_D8s_v5 or the equivalent "D8s_v5" naming used by ControlPlaneInstanceType) so UPI templates (upi/azure/05_masters.json), test fixtures (pkg/asset/installconfig/azure/validation_test.go), and docs (docs/user/azure/limits.md) consistently reference the same instance type and variants for ARM64/StackCloud.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Outside diff comments:
In `@pkg/types/azure/defaults/machines.go`:
- Around line 29-41: The default compute size was changed in ComputeInstanceType
(size "D4s_v5" / "D4ps_v5" for ARM64 and fallback to "DS3_v2" for
azure.StackCloud) but JSON templates and docs still reference Standard_D4s_v3;
update every occurrence of Standard_D4s_v3 to Standard_D4s_v5 (or
Standard_D4ps_v5 for ARM64 where applicable) in ARM templates, bootstrap/worker
templates, documentation, and validation tests so they match the logic used by
ComputeInstanceType and the instanceType/getInstanceClass behavior; run/update
the affected validation_test.go expectations to reflect the new default.
- Around line 12-24: The templates, tests, and docs still reference the old
Standard_D8s_v3 while ControlPlaneInstanceType now sets size to "D8s_v5" (and
"D8ps_v5" for ARM64, "DS4_v2" for StackCloud); update all hardcoded references
to the old instance type to match the new default (use Standard_D8s_v5 or the
equivalent "D8s_v5" naming used by ControlPlaneInstanceType) so UPI templates
(upi/azure/05_masters.json), test fixtures
(pkg/asset/installconfig/azure/validation_test.go), and docs
(docs/user/azure/limits.md) consistently reference the same instance type and
variants for ARM64/StackCloud.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository: openshift/coderabbit/.coderabbit.yaml
Review profile: CHILL
Plan: Enterprise
Run ID: c91042e9-eb4e-4523-8080-16384d1dcbf4
📒 Files selected for processing (1)
pkg/types/azure/defaults/machines.go
The Dv3 series is an older generation; Dv5 instances offer the same specs at comparable pricing and are more widely available. Update control plane default from D8s_v3 to D8s_v5 and compute default from D4s_v3 to D4s_v5. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
3e91dce to
a60dccd
Compare
|
[APPROVALNOTIFIER] This PR is NOT APPROVED This pull-request has been approved by: The full list of commands accepted by this bot can be found here. DetailsNeeds approval from an approver in each of these files:Approvers can indicate their approval by writing |
|
/retest-required |
| func ControlPlaneInstanceType(cloud azure.CloudEnvironment, region string, arch types.Architecture) string { | ||
| instanceClass := getInstanceClass(region) | ||
| size := "D8s_v3" | ||
| size := "D8s_v5" |
There was a problem hiding this comment.
In conversation, we said our CI quota was for Dasv5 machines, but this is Dsv5, so it should be:
| size := "D8s_v5" | |
| size := "D8as_v5" |
and similarly D4s_v5 -> D4as_v5
The difference is Dsv5 is Intel CPUs, and Dasv5 is AMD. Dasv5 is generally preferred unless you require intel for some reason (in which case users should specify).
There was a problem hiding this comment.
Dasv5 is generally preferred unless you require intel for some reason (in which case users should specify).
@patrickdillon You mean in general or based on our CI specific quota config?
I dug into this more and have found that if we use v6 instances we should achieve basically the same I/O performance on 4 vCPU instances as we do 8 vCPU v3 which would bring Azure inline with the defaults on GCP and AWS. We were oversizing control plane defaults to achieve I/O performance not for CPU / Memory reasons AFAIK.
openshift/release#75999 includes full details, we may or may not have to request quota changes given we'll be splitting workers and control plane across quota classes and we'll be cutting control plane in half.
This probably warrants catching up synchronously.
There was a problem hiding this comment.
@sdodson Oh yes, to clarify I was only thinking about AMD vs Intel; and my point was that AMD seemed to be what we have available in CI (but this PR is using Intel) AND that AMD is generally preferred because it provides better price to performance.
If Dasv6 series is available in enough supply, as openshift/release#75999 reports I agree this is a great choice.
|
@sdodson: The following tests failed, say
Full PR test history. Your PR dashboard. DetailsInstructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here. |
Summary
Standard_D8s_v3toStandard_D8s_v5Standard_D4s_v3toStandard_D4s_v5Fixes: https://redhat.atlassian.net/browse/CORS-3997
Test plan
ControlPlaneInstanceType()returnsStandard_D8s_v5for x64 in public cloud regionsComputeInstanceType()returnsStandard_D4s_v5for x64 in public cloud regionsD8ps_v5,D4ps_v5) are unchangedDS4_v2,DS3_v2) are unchangedhack/go-test.shto verify no test regressions🤖 Generated with Claude Code
Summary by CodeRabbit