Skip to content

Fix domain parsing for GPU & add Display controller in the supported PCI class#12981

Open
vishesh92 wants to merge 6 commits intoapache:4.22from
shapeblue:fix-domain-in-gpu
Open

Fix domain parsing for GPU & add Display controller in the supported PCI class#12981
vishesh92 wants to merge 6 commits intoapache:4.22from
shapeblue:fix-domain-in-gpu

Conversation

@vishesh92
Copy link
Copy Markdown
Member

@vishesh92 vishesh92 commented Apr 8, 2026

Description

This PR fixes #12960 & #12957

Types of changes

  • Breaking change (fix or feature that would cause existing functionality to change)
  • New feature (non-breaking change which adds functionality)
  • Bug fix (non-breaking change which fixes an issue)
  • Enhancement (improves an existing feature and functionality)
  • Cleanup (Code refactoring and cleanup, that may add test cases)
  • Build/CI
  • Test (unit or integration test code)

Feature/Enhancement Scale or Bug Severity

Feature/Enhancement Scale

  • Major
  • Minor

Bug Severity

  • BLOCKER
  • Critical
  • Major
  • Minor
  • Trivial

Screenshots (if appropriate):

How Has This Been Tested?

How did you try to break this feature and the system with this change?

Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR addresses CloudStack issue #12960 where GPU passthrough VM creation can fail on KVM due to incorrect parsing of PCI addresses (notably when the domain portion is present/expanded), leading to invalid libvirt PCI slot values.

Changes:

  • Update KVM GPU discovery script to consistently normalize/handle PCI addresses with and without an explicit domain and to key lookups using the domain-qualified form.
  • Update LibvirtGpuDef PCI XML generation to accept both bb:ss.f and dddd:bb:ss.f bus address formats.
  • Add unit tests covering full PCI addresses (domain 0, non-zero domain) and legacy short BDF behavior.

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 2 comments.

File Description
scripts/vm/hypervisor/kvm/gpudiscovery.sh Normalizes PCI address handling (domain-aware) across sysfs access, cache keys, and VM usage mapping to avoid wrong BDF parsing.
plugins/hypervisors/kvm/src/main/java/com/cloud/hypervisor/kvm/resource/LibvirtGpuDef.java Extends PCI address parsing to support domain:bus:slot.func input, preventing slot mis-parsing when a domain is present.
plugins/hypervisors/kvm/src/test/java/com/cloud/hypervisor/kvm/resource/LibvirtGpuDefTest.java Adds regression/unit tests for full PCI addresses and backward compatibility for short BDF inputs.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

@codecov
Copy link
Copy Markdown

codecov bot commented Apr 8, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 17.60%. Comparing base (abdf926) to head (4e7a834).
⚠️ Report is 1 commits behind head on 4.22.

Additional details and impacted files
@@            Coverage Diff            @@
##               4.22   #12981   +/-   ##
=========================================
  Coverage     17.60%   17.60%           
- Complexity    15673    15680    +7     
=========================================
  Files          5918     5918           
  Lines        531681   531739   +58     
  Branches      65005    65016   +11     
=========================================
+ Hits          93589    93636   +47     
- Misses       427539   427545    +6     
- Partials      10553    10558    +5     
Flag Coverage Δ
uitests 3.70% <ø> (ø)
unittests 18.68% <100.00%> (+<0.01%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

vishesh92 and others added 3 commits April 8, 2026 14:32
this adds support for the amd instinct mi2xx accelorator crards in the discovery script.
@vishesh92 vishesh92 changed the title Fix domain parsing for GPU Fix domain parsing for GPU & add Display controller in the supported PCI class Apr 8, 2026
@vishesh92 vishesh92 requested a review from Copilot April 8, 2026 09:13
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 3 out of 3 changed files in this pull request and generated 3 comments.


💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

@vishesh92 vishesh92 requested a review from Copilot April 8, 2026 10:12
@vishesh92
Copy link
Copy Markdown
Member Author

@blueorangutan package

@blueorangutan
Copy link
Copy Markdown

@vishesh92 a [SL] Jenkins job has been kicked to build packages. It will be bundled with KVM, XenServer and VMware SystemVM templates. I'll keep you posted as I make progress.

Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 3 out of 3 changed files in this pull request and generated 3 comments.


💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

@blueorangutan
Copy link
Copy Markdown

Packaging result [SF]: ✔️ el8 ✔️ el9 ✔️ el10 ✔️ debian ✔️ suse15. SL-JID 17398

@vishesh92 vishesh92 requested a review from Copilot April 8, 2026 11:40
@vishesh92
Copy link
Copy Markdown
Member Author

@blueorangutan package

@blueorangutan
Copy link
Copy Markdown

@vishesh92 a [SL] Jenkins job has been kicked to build packages. It will be bundled with KVM, XenServer and VMware SystemVM templates. I'll keep you posted as I make progress.

Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 3 out of 3 changed files in this pull request and generated 8 comments.


💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Copy link
Copy Markdown
Contributor

@sureshanaparti sureshanaparti left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

clgtm

@blueorangutan
Copy link
Copy Markdown

Packaging result [SF]: ✔️ el8 ✔️ el9 ✔️ el10 ✔️ debian ✔️ suse15. SL-JID 17399

@vishesh92
Copy link
Copy Markdown
Member Author

@blueorangutan test

@blueorangutan
Copy link
Copy Markdown

@vishesh92 a [SL] Trillian-Jenkins test job (ol8 mgmt + kvm-ol8) has been kicked to run smoke tests

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

Status: In Progress

Development

Successfully merging this pull request may close these issues.

CloudStack 4.22 – GPU passthrough VM creation fails due to invalid PCI slot mapping GPU Discovery not detecting AMD Mi2xx GPUs

6 participants