qemu-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH] pci: Refuse to hotplug PCI Devices when the Guest OS is not


From: Marcel Apfelbaum
Subject: Re: [PATCH] pci: Refuse to hotplug PCI Devices when the Guest OS is not ready
Date: Fri, 23 Oct 2020 09:47:14 +0300

Hi David,

On Fri, Oct 23, 2020 at 6:49 AM David Gibson <dgibson@redhat.com> wrote:
On Thu, 22 Oct 2020 11:01:04 -0400
"Michael S. Tsirkin" <mst@redhat.com> wrote:

> On Thu, Oct 22, 2020 at 05:50:51PM +0300, Marcel Apfelbaum wrote:
>  [...] 
>
> Right. After detecting just failing unconditionally it a bit too
> simplistic IMHO.

There's also another factor here, which I thought I'd mentioned
already, but looks like I didn't: I think we're still missing some
details in what's going on.

The premise for this patch is that plugging while the indicator is in
transition state is allowed to fail in any way on the guest side.  I
don't think that's a reasonable interpretation, because it's unworkable
for physical hotplug.  If the indicator starts blinking while you're in
the middle of shoving a card in, you'd be in trouble.

So, what I'm assuming here is that while "don't plug while blinking" is
the instruction for the operator to obey as best they can, on the guest
side the rule has to be "start blinking, wait a while and by the time
you leave blinking state again, you can be confident any plugs or
unplugs have completed".  Obviously still racy in the strict computer
science sense, but about the best you can do with slow humans in the
mix.

So, qemu should of course endeavour to follow that rule as though it
was a human operator on a physical machine and not plug when the
indicator is blinking.  *But* the qemu plug will in practice be fast
enough that if we're hitting real problems here, it suggests the guest
is still doing something wrong.

I personally think there is a little bit of over-engineering here.
Let's start with the spec:

    Power Indicator Blinking
    A blinking Power Indicator indicates that the slot is powering up or powering down and that
    insertion or removal of the adapter is not permitted.

What exactly is an interpretation here?
As you stated, the races are theoretical, the whole point of the indicator
is to let the operator know he can't plug the device just yet.

I understand it would be more user friendly if the QEMU would wait internally for the
blinking to end, but the whole point of the indicator is to let the operator (human or machine)
know they can't plug the device at a specific time.
Should QEMU take the responsibility of the operator? Is it even correct?

Even if we would want such a feature, how is it related to this patch?
The patch simply refuses to start a hotplug operation when it knows it will not succeed. 
 
Another way that would make sense to me would be  is a new QEMU interface other than
"add_device", let's say "adding_device_allowed", that would return true if the hotplug is allowed
at this point of time. (I am aware of the theoretical races) 

The above will at least mimic the mechanics of the pyhs world.  The operator looks at the indicator,
the management software checks if adding the device is allowed.
Since it is a corner case I would prefer the device_add to fail rather than introducing a new interface,
but that's just me.

Thanks,
Marcel


--
David Gibson <dgibson@redhat.com>
Principal Software Engineer, Virtualization, Red Hat

reply via email to

[Prev in Thread] Current Thread [Next in Thread]