CORS-3997: azure: update default instance types from v3 to v5#10565

Open
sdodson wants to merge 1 commit into
openshift:mainfrom
sdodson:CORS-3997-azure-default-instance-v5

@sdodson sdodson commented May 20, 2026

Summary

  • Updates Azure default control plane instance type from Standard_D8s_v3 to Standard_D8s_v5
  • Updates Azure default compute instance type from Standard_D4s_v3 to Standard_D4s_v5
  • Dv5 instances offer the same specs at comparable pricing with wider availability

Fixes: https://redhat.atlassian.net/browse/CORS-3997

Test plan

  • Verify ControlPlaneInstanceType() returns Standard_D8s_v5 for x64 in public cloud regions
  • Verify ComputeInstanceType() returns Standard_D4s_v5 for x64 in public cloud regions
  • Confirm ARM64 defaults (D8ps_v5, D4ps_v5) are unchanged
  • Confirm Azure Stack Hub defaults (DS4_v2, DS3_v2) are unchanged
  • Run hack/go-test.sh to verify no test regressions
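The checks above can be sketched as a small table of expected defaults. This is a minimal, self-contained illustration, not the installer's code: the `CloudEnvironment` and `Architecture` types, their constant values, and the `controlPlaneSize`/`computeSize` helpers are hypothetical stand-ins for `ControlPlaneInstanceType()` and `ComputeInstanceType()`. It also assumes that the Azure Stack Hub override takes precedence over the architecture, and it hard-codes the `Standard_` prefix instead of deriving an instance class from the region.

```go
package main

import "fmt"

// Hypothetical stand-ins for the installer's cloud and architecture types.
type CloudEnvironment string
type Architecture string

const (
	PublicCloud CloudEnvironment = "AzurePublicCloud"
	StackCloud  CloudEnvironment = "AzureStackCloud"

	ArchitectureAMD64 Architecture = "amd64"
	ArchitectureARM64 Architecture = "arm64"
)

// pickSize returns the default size name for one machine role.
// Assumption: the Azure Stack Hub override wins over the ARM64 override.
func pickSize(cloud CloudEnvironment, arch Architecture, x64, arm64, stack string) string {
	size := x64
	if arch == ArchitectureARM64 {
		size = arm64
	}
	if cloud == StackCloud {
		size = stack
	}
	// The "Standard" instance class is hard-coded here for simplicity.
	return "Standard_" + size
}

// controlPlaneSize mirrors the control plane defaults listed in the test plan.
func controlPlaneSize(cloud CloudEnvironment, arch Architecture) string {
	return pickSize(cloud, arch, "D8s_v5", "D8ps_v5", "DS4_v2")
}

// computeSize mirrors the compute defaults listed in the test plan.
func computeSize(cloud CloudEnvironment, arch Architecture) string {
	return pickSize(cloud, arch, "D4s_v5", "D4ps_v5", "DS3_v2")
}

func main() {
	fmt.Println(controlPlaneSize(PublicCloud, ArchitectureAMD64))
	fmt.Println(computeSize(PublicCloud, ArchitectureAMD64))
	fmt.Println(controlPlaneSize(PublicCloud, ArchitectureARM64))
	fmt.Println(computeSize(StackCloud, ArchitectureAMD64))
}
```

A real test would call the exported functions in `pkg/types/azure/defaults` directly rather than re-implementing the selection logic.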

🤖 Generated with Claude Code

Summary by CodeRabbit

  • Chores
    • Updated default Azure VM instance sizes for control plane and compute to newer-generation Dv5 SKUs; ARM64 and StackCloud defaults unchanged.
  • Documentation
    • Updated Azure limits and instance sizing docs to reflect Dv5 SKUs and their resource/throughput specifications.
  • Chores
    • Updated Azure templates to use Dv5 SKUs as new defaults for bootstrap and worker VMs.

coderabbitai Bot commented May 20, 2026

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Repository: openshift/coderabbit/.coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: 515aaa92-db34-41e0-b4e3-61410e9c7994

📥 Commits

Reviewing files that changed from the base of the PR and between 3e91dce and a60dccd.

📒 Files selected for processing (4)
  • docs/user/azure/limits.md
  • pkg/types/azure/defaults/machines.go
  • upi/azure/04_bootstrap.json
  • upi/azure/06_workers.json

Walkthrough

Defaults for Azure x86 VM SKUs were bumped from Dv3 to Dv5: control plane default now D8s_v5, compute default now D4s_v5. The change also updates UPI ARM template defaults and the Azure user limits documentation; ARM64 and StackCloud overrides unchanged.

Changes

Azure Default Instance Types

| Layer / File(s) | Summary |
| --- | --- |
| Control plane and compute defaults + UPI templates + docs: pkg/types/azure/defaults/machines.go, upi/azure/04_bootstrap.json, upi/azure/06_workers.json, docs/user/azure/limits.md | ControlPlaneInstanceType now defaults to D8s_v5 (was D8s_v3); ComputeInstanceType now defaults to D4s_v5 (was D4s_v3). Updated UPI template parameters (bootstrapVMSize, nodeVMSize) and documentation entries to use Dv5-series SKUs. |

🎯 2 (Simple) | ⏱️ ~10 minutes

🚥 Pre-merge checks | ✅ 12
✅ Passed checks (12 passed)
| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title clearly and concisely summarizes the main change: updating Azure default instance types from v3 to v5 series across multiple components. |
| Docstring Coverage | ✅ Passed | Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Stable And Deterministic Test Names | ✅ Passed | This PR contains no test files. The custom check for stable test names applies only to Ginkgo tests. The PR modifies Go source code, documentation, and JSON ARM templates only. |
| Test Structure And Quality | ✅ Passed | PR does not include Ginkgo test code. Codebase does not use Ginkgo framework (not in go.mod, no Ginkgo syntax found). Check is not applicable to this PR. |
| Microshift Test Compatibility | ✅ Passed | No new Ginkgo e2e tests were added in this PR. The check is not applicable as it specifically targets Ginkgo test additions. The PR updates Azure default instance types and documentation only. |
| Single Node Openshift (Sno) Test Compatibility | ✅ Passed | No Ginkgo e2e tests added. PR modifies Azure default VM instance types, documentation, and ARM templates only. |
| Topology-Aware Scheduling Compatibility | ✅ Passed | PR updates only Azure VM instance types (Dv3→Dv5 SKUs) in infrastructure configuration. No pod scheduling constraints, affinity rules, or topology-dependent logic introduced. |
| Ote Binary Stdout Contract | ✅ Passed | PR updates only Azure configuration defaults (D8s_v3→D8s_v5, D4s_v3→D4s_v5) in library code, docs, and ARM templates with no process-level entry points or stdout writes. |
| Ipv6 And Disconnected Network Test Compatibility | ✅ Passed | PR updates Azure VM instance defaults (v3 to v5) in configuration files and documentation only. No new Ginkgo e2e tests are added, so the IPv6/disconnected network compatibility check does not apply. |

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests

Warning

There were issues while running some tools. Please review the errors and either fix the tool's configuration or disable the tool if it's a critical failure.

🔧 golangci-lint (2.12.2)

Error: can't load config: unsupported version of the configuration: "" See https://golangci-lint.run/docs/product/migration-guide for migration instructions
The command is terminated due to an error: can't load config: unsupported version of the configuration: "" See https://golangci-lint.run/docs/product/migration-guide for migration instructions


Comment @coderabbitai help to get the list of available commands and usage tips.

@sdodson sdodson changed the title from "azure: update default instance types from v3 to v5" to "CORS-3997: azure: update default instance types from v3 to v5" May 20, 2026
@openshift-ci-robot openshift-ci-robot added the jira/valid-reference Indicates that this PR references a valid Jira ticket of any type. label May 20, 2026

openshift-ci-robot commented May 20, 2026

@sdodson: This pull request references CORS-3997 which is a valid jira issue.

Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target the "5.0.0" version, but no target version was set.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

@openshift-ci openshift-ci Bot requested review from rna-afk and sadasu May 20, 2026 13:57

@coderabbitai coderabbitai Bot left a comment

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)
pkg/types/azure/defaults/machines.go (2)

29-41: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Update ARM templates and documentation to use Standard_D4s_v5 to match the code change.

While Standard_D4s_v5 exists with the specified 4 vCPU and 16 GB RAM, the default instance type change is incomplete. The following files still hardcode Standard_D4s_v3:

  • upi/azure/06_workers.json (line 43)
  • upi/azure/04_bootstrap.json (line 35)
  • docs/user/azure/limits.md (lines 47, 61)
  • pkg/asset/installconfig/azure/validation_test.go (lines 44, 75, 77, 1501, 1525)

These must be updated to maintain consistency with the new default.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@pkg/types/azure/defaults/machines.go` around lines 29 - 41, The default
compute size was changed in ComputeInstanceType (size "D4s_v5" / "D4ps_v5" for
ARM64 and fallback to "DS3_v2" for azure.StackCloud) but JSON templates and docs
still reference Standard_D4s_v3; update every occurrence of Standard_D4s_v3 to
Standard_D4s_v5 (or Standard_D4ps_v5 for ARM64 where applicable) in ARM
templates, bootstrap/worker templates, documentation, and validation tests so
they match the logic used by ComputeInstanceType and the
instanceType/getInstanceClass behavior; run/update the affected
validation_test.go expectations to reflect the new default.

12-24: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Update hardcoded references to maintain consistency across templates, tests, and documentation.

Standard_D8s_v5 exists and has the claimed specifications (8 vCPUs, 32 GB RAM). However, hardcoded references to the old Standard_D8s_v3 type remain in non-vendor files that should be updated to match the new defaults:

  • upi/azure/05_masters.json (line 50): defaultValue still references "Standard_D8s_v3"
  • pkg/asset/installconfig/azure/validation_test.go (multiple lines): test fixtures reference the old type
  • docs/user/azure/limits.md (lines 48, 53): documentation mentions Standard_D8s_v3 master nodes
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@pkg/types/azure/defaults/machines.go` around lines 12 - 24, The templates,
tests, and docs still reference the old Standard_D8s_v3 while
ControlPlaneInstanceType now sets size to "D8s_v5" (and "D8ps_v5" for ARM64,
"DS4_v2" for StackCloud); update all hardcoded references to the old instance
type to match the new default (use Standard_D8s_v5 or the equivalent "D8s_v5"
naming used by ControlPlaneInstanceType) so UPI templates
(upi/azure/05_masters.json), test fixtures
(pkg/asset/installconfig/azure/validation_test.go), and docs
(docs/user/azure/limits.md) consistently reference the same instance type and
variants for ARM64/StackCloud.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Outside diff comments:
In `@pkg/types/azure/defaults/machines.go`:
- Around line 29-41: The default compute size was changed in ComputeInstanceType
(size "D4s_v5" / "D4ps_v5" for ARM64 and fallback to "DS3_v2" for
azure.StackCloud) but JSON templates and docs still reference Standard_D4s_v3;
update every occurrence of Standard_D4s_v3 to Standard_D4s_v5 (or
Standard_D4ps_v5 for ARM64 where applicable) in ARM templates, bootstrap/worker
templates, documentation, and validation tests so they match the logic used by
ComputeInstanceType and the instanceType/getInstanceClass behavior; run/update
the affected validation_test.go expectations to reflect the new default.
- Around line 12-24: The templates, tests, and docs still reference the old
Standard_D8s_v3 while ControlPlaneInstanceType now sets size to "D8s_v5" (and
"D8ps_v5" for ARM64, "DS4_v2" for StackCloud); update all hardcoded references
to the old instance type to match the new default (use Standard_D8s_v5 or the
equivalent "D8s_v5" naming used by ControlPlaneInstanceType) so UPI templates
(upi/azure/05_masters.json), test fixtures
(pkg/asset/installconfig/azure/validation_test.go), and docs
(docs/user/azure/limits.md) consistently reference the same instance type and
variants for ARM64/StackCloud.
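The kind of consistency check this finding asks for can be sketched as a scan for leftover references to the old SKU names. This is a hedged illustration only: `staleRefs` is a hypothetical helper, and the sample template text in `main` is made up for the example rather than copied from the UPI templates.

```go
package main

import (
	"fmt"
	"strings"
)

// staleRefs returns the 1-based line numbers in content that mention
// any of the given old SKU names. Each line is reported at most once.
func staleRefs(content string, oldSKUs []string) []int {
	var hits []int
	for i, line := range strings.Split(content, "\n") {
		for _, sku := range oldSKUs {
			if strings.Contains(line, sku) {
				hits = append(hits, i+1)
				break
			}
		}
	}
	return hits
}

func main() {
	// Illustrative snippet only; not the actual contents of any template.
	sample := "\"bootstrapVMSize\": {\n  \"defaultValue\": \"Standard_D4s_v3\"\n}"
	old := []string{"Standard_D8s_v3", "Standard_D4s_v3"}
	fmt.Println(staleRefs(sample, old))
}
```

Running such a scan over the templates, docs, and test fixtures named in the finding would show which references still need updating after the default changes.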

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository: openshift/coderabbit/.coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: c91042e9-eb4e-4523-8080-16384d1dcbf4

📥 Commits

Reviewing files that changed from the base of the PR and between 9c39356 and 3e91dce.

📒 Files selected for processing (1)
  • pkg/types/azure/defaults/machines.go

The Dv3 series is an older generation; Dv5 instances offer the same
specs at comparable pricing and are more widely available. Update
control plane default from D8s_v3 to D8s_v5 and compute default
from D4s_v3 to D4s_v5.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@sdodson sdodson force-pushed the CORS-3997-azure-default-instance-v5 branch from 3e91dce to a60dccd Compare May 20, 2026 16:39

openshift-ci Bot commented May 20, 2026

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
Once this PR has been reviewed and has the lgtm label, please assign rna-afk for approval. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

rna-afk commented May 20, 2026

/retest-required

  func ControlPlaneInstanceType(cloud azure.CloudEnvironment, region string, arch types.Architecture) string {
  	instanceClass := getInstanceClass(region)
- 	size := "D8s_v3"
+ 	size := "D8s_v5"
Contributor

In conversation, we said our CI quota was for Dasv5 machines, but this is Dsv5, so it should be:

Suggested change
- size := "D8s_v5"
+ size := "D8as_v5"

and similarly D4s_v5 -> D4as_v5

The difference is that Dsv5 uses Intel CPUs and Dasv5 uses AMD CPUs. Dasv5 is generally preferred unless you require Intel for some reason (in which case users should specify the instance type themselves).
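The naming distinction can be sketched as a small parser over the size name. This is a minimal illustration, assuming the Azure convention that a lowercase `a` feature letter marks an AMD-based size and `p` marks an ARM-based size; treating every other size as Intel is a simplification that does not hold for all families. The `cpuVendor` function and `skuPattern` are hypothetical names introduced here, not part of the installer.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// skuPattern matches names such as "D8as_v5" or "Standard_D8as_v5":
// family letters, vCPU count, lowercase feature letters, then the version.
var skuPattern = regexp.MustCompile(`^(?:Standard_)?([A-Z]+)(\d+)([a-z]*)_v(\d+)$`)

// cpuVendor guesses the CPU vendor from the feature letters of a size name.
// Assumption: no "a" or "p" feature letter means an Intel-based size.
func cpuVendor(sku string) (string, error) {
	m := skuPattern.FindStringSubmatch(sku)
	if m == nil {
		return "", fmt.Errorf("unrecognized size name %q", sku)
	}
	features := m[3]
	switch {
	case strings.Contains(features, "a"):
		return "AMD", nil
	case strings.Contains(features, "p"):
		return "ARM", nil
	default:
		return "Intel", nil
	}
}

func main() {
	for _, sku := range []string{"D8s_v5", "D8as_v5", "D8ps_v5"} {
		vendor, err := cpuVendor(sku)
		if err != nil {
			fmt.Println(err)
			continue
		}
		fmt.Printf("%s: %s\n", sku, vendor)
	}
}
```

Under these assumptions, the PR's `D8s_v5` maps to Intel, while the suggested `D8as_v5` maps to AMD.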

Member Author

> Dasv5 is generally preferred unless you require intel for some reason (in which case users should specify).

@patrickdillon Do you mean in general, or based on our CI-specific quota config?

I dug into this more and found that with v6 instances we should achieve basically the same I/O performance on 4 vCPU instances as we do on 8 vCPU v3 instances, which would bring Azure in line with the defaults on GCP and AWS. We were oversizing control plane defaults to achieve I/O performance, not for CPU or memory reasons, AFAIK.

openshift/release#75999 includes full details. We may or may not have to request quota changes, given that we'll be splitting workers and control plane across quota classes and cutting the control plane size in half.

This probably warrants catching up synchronously.

Contributor

@sdodson Oh yes, to clarify: I was only thinking about AMD vs. Intel. My point was that AMD seemed to be what we have available in CI (but this PR is using Intel), and that AMD is generally preferred because it provides better price-to-performance.

If the Dasv6 series is available in enough supply, as openshift/release#75999 reports, I agree this is a great choice.


openshift-ci Bot commented May 20, 2026

@sdodson: The following tests failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

| Test name | Commit | Details | Required | Rerun command |
| --- | --- | --- | --- | --- |
| ci/prow/e2e-azurestack-upi | a60dccd | link | false | /test e2e-azurestack-upi |
| ci/prow/e2e-azurestack | a60dccd | link | false | /test e2e-azurestack |
| ci/prow/e2e-azure-ovn-upi | a60dccd | link | false | /test e2e-azure-ovn-upi |
| ci/prow/e2e-azure-ovn-shared-vpc | a60dccd | link | false | /test e2e-azure-ovn-shared-vpc |
| ci/prow/e2e-azure-default-config | a60dccd | link | false | /test e2e-azure-default-config |

Full PR test history. Your PR dashboard.

Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.

Labels

jira/valid-reference Indicates that this PR references a valid Jira ticket of any type.

4 participants