Add support for virtio device reset#5891
Conversation
The reset() trait method returns Option<(Arc<dyn VirtioInterrupt>, Vec<EventFd>)> for no apparent reason and the values are never used by any caller. Change the return value to a bool to signify success or failure. Signed-off-by: Ilias Stamatis <ilstam@amazon.com>
Simplify the MMIO transport reset logic by always performing the MMIO transport reset unconditionally and calling device.reset() regardless of activation state. This requires updating the affected unit tests and implementing reset() for the DummyDevice. The unit test assumed that reset() is never supported (i.e. always returns false), but that is going to change soon so we want to make sure that resetting moves the device to the deactivated state. Signed-off-by: Ilias Stamatis <ilstam@amazon.com>
The reset success path locked self.device, then tried to lock it again via self.virtio_device().lock() to reset queues. This was a deadlock that was never triggered because no device previously implemented reset() (all returned false). Use the already-held guard instead. Signed-off-by: Ilias Stamatis <ilstam@amazon.com>
Don't clear virtio_interrupt on reset. The interrupt object holds Arc references to the PCI device's MSI-X configuration which remains valid across reset. Clearing it will cause a panic at the unwrap() in the activation path when the guest re-probes the device after reset. Signed-off-by: Ilias Stamatis <ilstam@amazon.com>
Reset device_feature_select and driver_feature_select to 0 on device reset, matching what the MMIO transport does with features_select and acked_features_select. Signed-off-by: Ilias Stamatis <ilstam@amazon.com>
Add test_bus_device_reset_failure to verify that when device.reset() returns false, the FAILED bit is set in device_status. Signed-off-by: Ilias Stamatis <ilstam@amazon.com>
Add a deactivate() method that sets the device state to Inactive. This will be used by the generic reset implementation in a subsequent patch. Signed-off-by: Ilias Stamatis <ilstam@amazon.com>
Move the queue reset logic from the MMIO and PCI transport code into the default reset() implementation in the VirtioDevice trait. This is generic virtio state that should be reset for all devices, regardless of transport. Signed-off-by: Ilias Stamatis <ilstam@amazon.com>
Complete the reset() implementation by deactivating the device and setting acked_features to 0. This must happen for all virtio backends. Introduce a _reset() method that is called at the end of reset() and can be overridden by virtio backends with backend specific reset code. The default _reset() returns false, indicating the backend does not support reset. Signed-off-by: Ilias Stamatis <ilstam@amazon.com>
Add a clear() method to RxBuffers that resets all fields to their initial state. This will be used by the virtio-net reset implementation. Signed-off-by: Ilias Stamatis <ilstam@amazon.com>
Implement the reset() method for the virtio-net device, resetting the net specific state in-place. Signed-off-by: Ilias Stamatis <ilstam@amazon.com>
Add a test that verifies virtio-net device reset works end-to-end by unbinding and rebinding the guest driver. Suggested-by: Adam Jensen <adam@acj.sh> Signed-off-by: Ilias Stamatis <ilstam@amazon.com>
Implement the reset() method for the virtio-block device, resetting the device-specific state in-place. This only adds support for the Virtio backend for now and not VhostUser. Signed-off-by: Ilias Stamatis <ilstam@amazon.com>
Add a test that verifies virtio-block device reset works end-to-end by unbinding and rebinding the guest driver for a scratch block device. Signed-off-by: Ilias Stamatis <ilstam@amazon.com>
Pmem does not have any backend specific state that needs resetting so implement the method by simply returning true. The rest of the state is handled by the generic reset(). Signed-off-by: Ilias Stamatis <ilstam@amazon.com>
Add a test that verifies virtio-pmem device reset works end-to-end by unbinding and rebinding the guest driver. Signed-off-by: Ilias Stamatis <ilstam@amazon.com>
Implement the reset() method for the virtio-balloon device, resetting the balloon specific state in-place. Do not deflate the balloon on reset. Signed-off-by: Ilias Stamatis <ilstam@amazon.com>
Add a test that verifies virtio-balloon device reset works end-to-end by unbinding and rebinding the guest driver. Signed-off-by: Ilias Stamatis <ilstam@amazon.com>
The entropy device does not have any backend specific state that needs resetting so implement the method by simply returning true. The test_failed_reset_blocks_reinitialization() test used an entropy device assuming it doesn't support reset. Since it now does, adjust the test to check that the device is first deactivated after a reset and then the driver can perform the initialization sequence again. Signed-off-by: Ilias Stamatis <ilstam@amazon.com>
Add a test that verifies virtio-rng device reset works end-to-end by unbinding and rebinding the guest driver. Signed-off-by: Ilias Stamatis <ilstam@amazon.com>
Implement the reset() method for the virtio-vsock device, resetting the vsock specific state in-place. Signed-off-by: Ilias Stamatis <ilstam@amazon.com>
Add a test that verifies virtio-vsock device reset works end-to-end by unbinding and rebinding the guest driver. Signed-off-by: Ilias Stamatis <ilstam@amazon.com>
The virtio-mem device does not have any backend specific state that needs resetting so implement the method by simply returning true. Plugged memory regions remain intact. Signed-off-by: Ilias Stamatis <ilstam@amazon.com>
Add a test that verifies virtio-mem device reset works end-to-end by unbinding and rebinding the guest driver. Signed-off-by: Ilias Stamatis <ilstam@amazon.com>
firecracker-microvm#5891 Signed-off-by: Ilias Stamatis <ilstam@amazon.com>
Codecov Report❌ Patch coverage is Additional details and impacted files@@ Coverage Diff @@
## main #5891 +/- ##
==========================================
- Coverage 82.90% 82.77% -0.13%
==========================================
Files 277 277
Lines 30053 30115 +62
==========================================
+ Hits 24914 24929 +15
- Misses 5139 5186 +47
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
| b.is_io_engine_throttled = false; | ||
| true | ||
| } | ||
| Self::VhostUser(_) => false, |
There was a problem hiding this comment.
nit: we could leave a comment about this being unsupported
| fn _reset(&mut self) -> bool { | ||
| self.rx_packet.clear(); | ||
| self.tx_packet.clear(); | ||
| true | ||
| } |
There was a problem hiding this comment.
is there some backend state to reset? what about the conn tracking etc? We should probably drop connections, etc.
Also, cc @JackThomson2 as this may be a subtle conflict with your PR (depending on who merges first haha)
| # Rebind and verify it's back. | ||
| vm.ssh.check_output( | ||
| f"echo {virtio_dev} > /sys/bus/virtio/drivers/vmw_vsock_virtio_transport/bind" | ||
| ) | ||
| vm.ssh.check_output("ls /sys/bus/virtio/drivers/vmw_vsock_virtio_transport/virtio*") |
There was a problem hiding this comment.
let's check the device still woks
| fn _reset(&mut self) -> bool { | ||
| true | ||
| } | ||
|
|
There was a problem hiding this comment.
this may be confusing. We could point to the spec:
The device MUST NOT change the state of memory blocks during device reset
The device MUST NOT modify memory or memory properties of plugged memory blocks during device reset.
| for queue in self.queues_mut() { | ||
| *queue = Queue::new(queue.max_size); | ||
| } | ||
| self._reset() |
There was a problem hiding this comment.
should this be called before the rest? This is a semantic change from the old code where a failed reset left the device activated.
| self.device_state = DeviceState::Inactive; | ||
| } | ||
|
|
||
| fn _reset(&mut self) -> bool { |
There was a problem hiding this comment.
should we disarm the stats timer?
| self.device_state.is_activated() | ||
| } | ||
|
|
||
| fn deactivate(&mut self) { |
There was a problem hiding this comment.
what about the rate limiters? are we assuming we want to preserve the state across reset?
| vm.ssh.check_output( | ||
| f"echo {virtio_dev} > /sys/bus/virtio/drivers/virtio_balloon/bind" | ||
| ) | ||
| vm.ssh.check_output("ls /sys/bus/virtio/drivers/virtio_balloon/virtio*") |
There was a problem hiding this comment.
we should check device is functional
Add support for virtio device reset
License Acceptance
By submitting this pull request, I confirm that my contribution is made under
the terms of the Apache 2.0 license. For more information on following Developer
Certificate of Origin and signing off your commits, please check
CONTRIBUTING.md.PR Checklist
tools/devtool checkbuild --allto verify that the PR passesbuild checks on all supported architectures.
tools/devtool checkstyleto verify that the PR passes theautomated style checks.
how they are solving the problem in a clear and encompassing way.
in the PR.
CHANGELOG.md.Runbook for Firecracker API changes.
integration tests.
TODO.rust-vmm.