qemu-devel

Re: [Bug 1925496] Re: nvme disk cannot be hotplugged after removal


From: Klaus Jensen
Subject: Re: [Bug 1925496] Re: nvme disk cannot be hotplugged after removal
Date: Mon, 3 May 2021 09:27:21 +0200

On Apr 28 15:00, Max Reitz wrote:
On 28.04.21 12:12, Klaus Jensen wrote:
On Apr 28 09:31, Oguz Bektas wrote:
My understanding is that this is the expected behavior. The reason is
that the drive cannot be deleted immediately when the device is
hot-unplugged, since it might not be safe (other parts of QEMU could
be using it, like background block jobs).

On the other hand, if the drive is removed explicitly through QMP (or
in the monitor with drive_del), the drive id remains "in use". This
might be a completely different bug that is unrelated to the nvme
device.

Using the same commands, I can hot-plug and hot-unplug a scsi disk like
this without issue. This behavior only appeared on nvme devices.


Kevin, Max, can you shed any light on this?

Specifically, what is the expected behavior wrt. the drive when unplugging a device that has one attached?

If the scsi disk is capable of "cleaning up" immediately, then I suppose that some steps are missing in the nvme unrealization.


Hi Max,

Thanks for your help!

I’m not very strong when it comes to questions about guest devices, but looking into the QAPI documentation, it says that when DEVICE_DELETED is emitted, it’s safe to reuse the device ID. Before then, it isn’t, because the guest may yet need to release the device.


This is specifically related to releasing the "drive", not the device. Problem is that when the device is deleted (using device_del), the pci device goes away rapidly, but the drive (as shown in `info block`) lingers for a couple of seconds before going into the "unlocked" state. If the drive is then deleted (`drive_del`) everything is good, but if the drive is deleted within those couple of seconds, the drive_del completes successfully, but the drive id never becomes available again.
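To make the sequence above concrete, here is a minimal sketch of the two monitor messages involved, built as QMP JSON. The ids `nvme0` and `drive0` are hypothetical placeholders, not taken from the bug report. Since `drive_del` is an HMP command, the sketch assumes it is sent from QMP through `human-monitor-command`.

```python
import json


def device_del_cmd(device_id):
    """QMP message asking QEMU to hot-unplug the device with this id."""
    return {"execute": "device_del", "arguments": {"id": device_id}}


def drive_del_cmd(drive_id):
    """drive_del is an HMP command, so wrap it in human-monitor-command."""
    return {
        "execute": "human-monitor-command",
        "arguments": {"command-line": "drive_del " + drive_id},
    }


def unplug_sequence(device_id, drive_id):
    """Return the two messages, in order, as JSON lines for the QMP socket."""
    return [
        json.dumps(device_del_cmd(device_id)),
        json.dumps(drive_del_cmd(drive_id)),
    ]


if __name__ == "__main__":
    # Hypothetical ids, for illustration only.
    for line in unplug_sequence("nvme0", "drive0"):
        print(line)
```

Sending the second message within a couple of seconds of the first is the case where the drive id reportedly never becomes available again. Sending it only after `info block` shows the drive as unlocked is the case that works.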

So the questions that come to my mind are:
- Do nvme guest devices have a release protocol with the guest, so that it just may take some time for the guest to release the device? I.e. that this would be perfectly normal and documented behavior? (Perhaps this just isn’t the case for scsi, or the guest just releases those devices much quicker)


The NVMe device is a PCIDevice, I wonder if that is what adds some delay on this?

- Did Oguz always wait for the DEVICE_DELETED event before reusing the ID? Sounds like it would be a bug if reusing the ID after getting that event failed.


From the bug report, it does not look like anything like that is done.
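For reference, waiting for DEVICE_DELETED on the management side could look roughly like the sketch below. It assumes the QMP stream delivers one JSON object per line and that the event carries the device id in `data.device`. The helper name `wait_for_device_deleted` and the id `nvme0` are made up for illustration.

```python
import json


def wait_for_device_deleted(lines, device_id):
    """Scan QMP messages until DEVICE_DELETED for device_id is seen.

    `lines` is any iterable of JSON text lines, for example the object
    returned by sock.makefile("r") on a connected QMP socket.
    Returns the event message, or None if the stream ends first.
    """
    for line in lines:
        line = line.strip()
        if not line:
            continue
        msg = json.loads(line)
        # Command replies and unrelated events are skipped.
        if msg.get("event") != "DEVICE_DELETED":
            continue
        # "device" is only present when the device was given an id.
        if msg.get("data", {}).get("device") == device_id:
            return msg
    return None


if __name__ == "__main__":
    sample = [
        '{"return": {}}',
        '{"event": "DEVICE_DELETED", "data": {"device": "nvme0", '
        '"path": "/machine/peripheral/nvme0"}}',
    ]
    print(wait_for_device_deleted(sample, "nvme0"))
```

Only after this returns an event would the caller go on to delete the drive or reuse the ids. Whether that avoids the lingering drive id described above is exactly the open question here.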

I basically don't understand the deletion protocol here and why the drive is not released immediately. Even if I add a call to blockdev_mark_auto_del() there is a delay, but the drive does get automatically deleted.


