kvm: fix wrong CheckVirtualMachineAnswer when vm does not exist #12928
sureshanaparti merged 3 commits into apache:4.22
@blueorangutan package
@weizhouapache a [SL] Jenkins job has been kicked to build packages. It will be bundled with KVM, XenServer and VMware SystemVM templates. I'll keep you posted as I make progress.
Codecov Report
✅ All modified and coverable lines are covered by tests.
@@             Coverage Diff              @@
##               4.22    #12928     +/-   ##
============================================
- Coverage     17.60%    17.60%   -0.01%
- Complexity    15676     15678       +2
============================================
  Files          5918      5918
  Lines        531667    531667
  Branches      65001     65001
============================================
- Hits          93617     93611       -6
- Misses       427491    427501      +10
+ Partials      10559     10555       -4
Packaging result [SF]: ✔️ el8 ✔️ el9 ✔️ el10 ✔️ debian ✔️ suse15. SL-JID 17307
@blueorangutan test
@weizhouapache a [SL] Trillian-Jenkins test job (ol8 mgmt + kvm-ol8) has been kicked to run smoke tests |
.../com/cloud/hypervisor/kvm/resource/wrapper/LibvirtCheckVirtualMachineCommandWrapperTest.java
…/resource/wrapper/LibvirtCheckVirtualMachineCommandWrapperTest.java Co-authored-by: dahn <daan.hoogland@gmail.com>
[SF] Trillian test result (tid-15787)
LGTM
Tested manually
- Deployed a VM with an HA offering on KVM host1
[root@ref-trl-11549-k-Mol8-kiran-chavala-kvm1 ~]# virsh list
Id Name State
--------------------------
2 i-2-3-VM running
- virsh destroy removes the libvirt domain.
- CloudStack detects the VM as missing.
- After the grace period expires, CloudStack updates the VM power report to PowerReportMissing.
- The VM is restarted automatically on an eligible host and its state is shown as running.
[root@ref-trl-11549-k-Mol8-kiran-chavala-kvm1 ~]# virsh list
Id Name State
--------------------------
3 i-2-3-VM running
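The behavior exercised by the steps above can be sketched roughly as follows. This is a minimal, self-contained illustration and not the code changed in this PR: `DomainLookup`, `DomainNotFoundException`, `CheckAnswer`, and `checkVm` are hypothetical names, and an in-memory map stands in for libvirt. The assumption is that a domain which no longer exists is reported as a successful check with state `PowerOff`, which matches the answer visible in the logs of this comment.

```java
import java.util.HashMap;
import java.util.Map;

public class CheckVmSketch {
    /** Power states used in this sketch (names mirror those seen in the logs). */
    enum PowerState { PowerOn, PowerOff }

    /** Hypothetical stand-in for "the hypervisor has no domain with this name". */
    static class DomainNotFoundException extends Exception {
        DomainNotFoundException(String vmName) {
            super("no domain named " + vmName);
        }
    }

    /** Hypothetical lookup abstraction; the real wrapper would query libvirt instead. */
    interface DomainLookup {
        PowerState stateOf(String vmName) throws DomainNotFoundException;
    }

    /** Simplified answer carrying only the two fields discussed here. */
    static final class CheckAnswer {
        final boolean result;
        final PowerState state;

        CheckAnswer(boolean result, PowerState state) {
            this.result = result;
            this.state = state;
        }

        @Override
        public String toString() {
            return "CheckAnswer{state=" + state + ", result=" + result + "}";
        }
    }

    /** Checks a VM; a missing domain is treated as a definite PowerOff, not as an error. */
    static CheckAnswer checkVm(DomainLookup lookup, String vmName) {
        try {
            return new CheckAnswer(true, lookup.stateOf(vmName));
        } catch (DomainNotFoundException e) {
            // Assumption: "domain does not exist" means the VM is not running on this host.
            return new CheckAnswer(true, PowerState.PowerOff);
        }
    }

    /** In-memory lookup used only for the demonstration below. */
    static DomainLookup fromMap(Map<String, PowerState> domains) {
        return vmName -> {
            PowerState state = domains.get(vmName);
            if (state == null) {
                throw new DomainNotFoundException(vmName);
            }
            return state;
        };
    }

    public static void main(String[] args) {
        Map<String, PowerState> domains = new HashMap<>();
        domains.put("i-2-3-VM", PowerState.PowerOn);
        DomainLookup lookup = fromMap(domains);

        System.out.println(checkVm(lookup, "i-2-3-VM"));
        domains.remove("i-2-3-VM"); // simulates the domain disappearing after "virsh destroy"
        System.out.println(checkVm(lookup, "i-2-3-VM"));
    }
}
```

Treating the missing domain as a normal answer rather than a failure lets the caller act on a definite power state, which is the outcome the manual test relies on.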
Logs:
2026-04-10 10:09:28,721 DEBUG [c.c.v.VirtualMachinePowerStateSyncImpl] (AgentManager-Handler-12:[]) (logid:) Detected missing VM. host: 1, vm id: 3(68a0af3d-5f59-4480-95dd-4af1456d6d2f), power state: PowerReportMissing, last state update: 2026-04-10T10:07:28+0000
2026-04-10 10:09:28,731 DEBUG [c.c.v.VirtualMachinePowerStateSyncImpl] (AgentManager-Handler-12:[]) (logid:) VM state report is updated. Host {"id":1,"name":"ref-trl-11549-k-Mol8-kiran-chavala-kvm1","type":"Routing","uuid":"f65d0b2f-e1f2-47f6-887d-31f26b257e1a"}, VM instance {"id":3,"instanceName":"i-2-3-VM","state":"Running","type":"User","uuid":"68a0af3d-5f59-4480-95dd-4af1456d6d2f"}, power state: PowerReportMissing
2026-04-10 10:09:28,741 INFO [c.c.v.ClusteredVirtualMachineManagerImpl] (AgentManager-Handler-12:[ctx-5542e17c]) (logid:) VM VM instance {"id":3,"instanceName":"i-2-3-VM","state":"Running","type":"User","uuid":"68a0af3d-5f59-4480-95dd-4af1456d6d2f"} is at Running and we received a PowerReportMissing report while there is no pending jobs on it
2026-04-10 10:09:28,742 INFO [c.c.v.ClusteredVirtualMachineManagerImpl] (AgentManager-Handler-12:[ctx-5542e17c]) (logid:) Detected out-of-band stop of a HA enabled VM VM instance {"id":3,"instanceName":"i-2-3-VM","state":"Running","type":"User","uuid":"68a0af3d-5f59-4480-95dd-4af1456d6d2f"}, will schedule restart.
2026-04-10 10:09:28,744 DEBUG [c.c.h.HighAvailabilityManagerExtImpl] (AgentManager-Handler-12:[ctx-5542e17c]) (logid:) HA schedule restart
2026-04-10 10:09:28,750 INFO [c.c.h.HighAvailabilityManagerExtImpl] (AgentManager-Handler-12:[ctx-5542e17c]) (logid:) Schedule vm for HA: VM instance {"id":3,"instanceName":"i-2-3-VM","state":"Running","type":"User","uuid":"68a0af3d-5f59-4480-95dd-4af1456d6d2f"}
2026-04-10 10:09:28,750 DEBUG [c.c.h.HighAvailabilityManagerExtImpl] (AgentManager-Handler-12:[ctx-5542e17c]) (logid:) Wakeup workers HA
2026-04-10 10:09:28,750 DEBUG [c.c.v.VirtualMachinePowerStateSyncImpl] (AgentManager-Handler-12:[ctx-5542e17c]) (logid:) Done with process of VM state report. host: Host {"id":1,"name":"ref-trl-11549-k-Mol8-kiran-chavala-kvm1","type":"Routing","uuid":"f65d0b2f-e1f2-47f6-887d-31f26b257e1a"}
2026-04-10 10:09:28,759 INFO [c.c.h.HighAvailabilityManagerExtImpl] (HA-Worker-2:[ctx-e2886928, work-1]) (logid:fcb78e90) Processing work HAWork[1-HA-3-Running-Investigating]
2026-04-10 10:09:28,768 DEBUG [c.c.h.HighAvailabilityManagerExtImpl] (HA-Worker-2:[ctx-e2886928, work-1]) (logid:fcb78e90) RESTART with HAWORK
2026-04-10 10:09:28,772 INFO [c.c.h.HighAvailabilityManagerExtImpl] (HA-Worker-2:[ctx-e2886928, work-1]) (logid:fcb78e90) HA on VM instance {"id":3,"instanceName":"i-2-3-VM","state":"Running","type":"User","uuid":"68a0af3d-5f59-4480-95dd-4af1456d6d2f"}
2026-04-10 10:09:28,775 DEBUG [c.c.a.m.ClusteredAgentManagerImpl] (HA-Worker-2:[ctx-e2886928, work-1]) (logid:fcb78e90) Wait time setting on com.cloud.agent.api.CheckVirtualMachineCommand is 20 seconds
2026-04-10 10:09:28,776 DEBUG [c.c.a.m.ClusteredAgentAttache] (HA-Worker-2:[ctx-e2886928, work-1]) (logid:fcb78e90) Seq 4-8020347986393432246: Routed from 32985566938254
2026-04-10 10:09:28,776 DEBUG [c.c.a.t.Request] (HA-Worker-2:[ctx-e2886928, work-1]) (logid:fcb78e90) Seq 1-8020347986393432246: Sending { Cmd , MgmtId: 32985566938254, via: 1(ref-trl-11549-k-Mol8-kiran-chavala-kvm1), Ver: v1, Flags: 100011, [{"com.cloud.agent.api.CheckVirtualMachineCommand":{"vmName":"i-2-3-VM","wait":"20","bypassHostMaintenance":"false"}}] }
2026-04-10 10:09:28,785 DEBUG [c.c.a.t.Request] (AgentManager-Handler-13:[]) (logid:) Seq 1-8020347986393432246: Processing: { Ans: , MgmtId: 32985566938254, via: 1, Ver: v1, Flags: 10, [{"com.cloud.agent.api.CheckVirtualMachineAnswer":{"state":"PowerOff","result":"true","wait":"0","bypassHostMaintenance":"false"}}] }
2026-04-10 10:09:28,785 DEBUG [c.c.a.t.Request] (HA-Worker-2:[ctx-e2886928, work-1]) (logid:fcb78e90) Seq 1-8020347986393432246: Received: { Ans: , MgmtId: 32985566938254, via: 1(ref-trl-11549-k-Mol8-kiran-chavala-kvm1), Ver: v1, Flags: 10, { CheckVirtualMachineAnswer } }
2026-04-10 10:09:36,141 INFO [c.c.h.HighAvailabilityManagerExtImpl] (HA-Worker-2:[ctx-e2886928, work-1]) (logid:fcb78e90) HA is now restarting VM instance {"id":3,"instanceName":"i-2-3-VM","state":"Running","type":"User","uuid":"68a0af3d-5f59-4480-95dd-4af1456d6d2f"} on Host {"id":1,"name":"ref-trl-11549-k-Mol8-kiran-chavala-kvm1","type":"Routing","uuid":"f65d0b2f-e1f2-47f6-887d-31f26b257e1a"}
2026-04-10 10:09:36,157 INFO [c.c.h.HighAvailabilityManagerExtImpl] (HA-Worker-2:[ctx-e2886928, work-1]) (logid:fcb78e90) Completed work HAWork[1-HA-3-Running-Scheduled]. Took 1/5 attempts.
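When repeating this verification, the `state` and `result` fields of the CheckVirtualMachineAnswer can be pulled out of the management-server log mechanically. The sketch below is only an illustration: `AnswerLogSketch` and `parse` are hypothetical names, and the pattern assumes the payload is printed with `state` before `result`, exactly as in the "Processing" line above. The sample line in `main` is an abbreviated copy of that log entry.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class AnswerLogSketch {
    // Matches the JSON payload printed for a CheckVirtualMachineAnswer.
    // Assumption: "state" is printed before "result", as in the log shown above.
    private static final Pattern ANSWER = Pattern.compile(
            "\"com\\.cloud\\.agent\\.api\\.CheckVirtualMachineAnswer\":"
            + "\\{\"state\":\"(\\w+)\",\"result\":\"(\\w+)\"");

    /** Returns {state, result} if the line carries a CheckVirtualMachineAnswer payload. */
    static Optional<String[]> parse(String logLine) {
        Matcher m = ANSWER.matcher(logLine);
        if (!m.find()) {
            return Optional.empty();
        }
        return Optional.of(new String[] { m.group(1), m.group(2) });
    }

    public static void main(String[] args) {
        // Abbreviated sample of the "Processing" log line.
        String line = "Seq 1-8020347986393432246: Processing: { Ans: , via: 1, Ver: v1, "
                + "[{\"com.cloud.agent.api.CheckVirtualMachineAnswer\":"
                + "{\"state\":\"PowerOff\",\"result\":\"true\",\"wait\":\"0\"}}] }";
        parse(line).ifPresent(a -> System.out.println("state=" + a[0] + " result=" + a[1]));
    }
}
```

Lines that only name the answer type without its payload, such as the "Received" entry, do not match and yield an empty result.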
Description
This PR should fix #12920