Re: Out-of-Process Device Emulation session at KVM Forum 2020


From: Stefan Hajnoczi
Subject: Re: Out-of-Process Device Emulation session at KVM Forum 2020
Date: Fri, 30 Oct 2020 11:13:59 +0000

On Fri, Oct 30, 2020 at 9:46 AM Jason Wang <jasowang@redhat.com> wrote:
> On 2020/10/30 2:21 PM, Stefan Hajnoczi wrote:
> > On Fri, Oct 30, 2020 at 3:04 AM Alex Williamson
> > <alex.williamson@redhat.com> wrote:
> >> It's great to revisit ideas, but proclaiming a uAPI is bad solely
> >> because the data transfer is opaque, without defining why that's bad,
> >> evaluating the feasibility and implementation of defining a well
> >> specified data format rather than protocol, including cross-vendor
> >> support, or proposing any sort of alternative is not so helpful imo.
> > The migration approaches in VFIO and vDPA/vhost were designed for
> > different requirements and I think this is why there are different
> > perspectives on this. Here is a comparison and how VFIO could be
> > extended in the future. I see 3 levels of device state compatibility:
> >
> > 1. The device cannot save/load state blobs, instead userspace fetches
> > and restores specific values of the device's runtime state (e.g. last
> > processed ring index). This is the vhost approach.
> >
> > 2. The device can save/load state in a standard format. This is
> > similar to #1 except that there is a single read/write blob interface
> > instead of fine-grained get_FOO()/set_FOO() interfaces. This approach
> > pushes the migration state parsing into the device so that userspace
> > doesn't need knowledge of every device type. With this approach it is
> > possible for a device from vendor A to migrate to a device from vendor
> > B, as long as they both implement the same standard migration format.
> > The limitation of this approach is that vendor-specific state cannot
> > be transferred.
> >
> > 3. The device can save/load opaque blobs. This is the initial VFIO
> > approach.
>
>
> I still don't get why it must be opaque.

If knowledge of the device state format needs to live in the VMM, then
each device needs explicit enablement in each VMM (QEMU,
cloud-hypervisor, etc.).

Let's invert the question: why does the VMM need to understand the
device state of a _passthrough_ device?
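
To make the contrast concrete, here is a rough sketch of what
userspace does in each case (untested, error handling omitted; the
vhost ioctls are real, the VFIO side is modelled on the migration
region in the proposed uAPI, and #2 would be the same blob interface
as #3 but with a standardized payload):

  #include <linux/vhost.h>   /* VHOST_GET_VRING_BASE, vhost_vring_state */
  #include <sys/ioctl.h>
  #include <unistd.h>        /* pread */

  /* #1 (vhost): userspace fetches specific named values, one ioctl per
   * piece of state, so it has to know what each value means. */
  static unsigned int save_last_avail_idx(int vhost_fd, unsigned int ring)
  {
      struct vhost_vring_state s = { .index = ring };
      ioctl(vhost_fd, VHOST_GET_VRING_BASE, &s);
      return s.num;          /* last processed ring index */
  }

  /* #3 (VFIO): userspace copies an opaque blob out of the device's
   * migration region and never interprets the bytes. 'data_offset' is
   * wherever the device exposes its state within the region. */
  static ssize_t save_state_blob(int device_fd, off_t data_offset,
                                 void *buf, size_t len)
  {
      return pread(device_fd, buf, len, data_offset);
  }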

> >   A device from vendor A cannot migrate to a device from
> > vendor B because the format is incompatible. This approach works well
> > when devices have unique guest-visible hardware interfaces so the
> > guest wouldn't be able to handle migrating a device from vendor A to a
> > device from vendor B anyway.
>
>
> For VFIO, I guess cross-vendor live migration can't succeed unless we
> do some cheating with the device/vendor IDs.

Yes. I haven't looked into the details of PCI (Sub-)Device/Vendor IDs
and how best to enable migration, but I hope that can be solved. The
simplest approach is to override the IDs and make them part of the
guest configuration.
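
(QEMU's vfio-pci device already has experimental properties that could
be used for this kind of override; something along these lines,
untested and with made-up host address and IDs:

  -device vfio-pci,host=0000:01:00.0,x-pci-vendor-id=0x1af4,x-pci-device-id=0x1041

plus x-pci-sub-vendor-id/x-pci-sub-device-id for the subsystem IDs.)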

> For at least virtio, they will still go with virtio/vDPA. The advantages
> are:
>
> 1) virtio/vDPA can serve kernel subsystems, which VFIO can't; this is
> very important for containers

I'm not sure I understand this. If the kernel wants to use the device
then it doesn't use VFIO; it runs the kernel driver instead.

One part I believe is missing from VFIO/mdev is attaching an mdev
device to the kernel. That seems to be an example of the limitation
you mentioned.

> 2) virtio/vDPA is bus-independent; we can present a virtio-mmio device
> backed by vDPA PCI hardware, e.g. for microvm

Yes. This is neat although microvm supports PCI now
(https://www.kraxel.org/blog/2020/10/qemu-microvm-acpi/).
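
(For reference, a rough idea of what that looks like with the
vhost-vdpa netdev backend that recently landed in QEMU; untested,
device path made up:

  qemu-system-x86_64 -M microvm ... \
      -netdev vhost-vdpa,id=vdpa0,vhostdev=/dev/vhost-vdpa-0 \
      -device virtio-net-device,netdev=vdpa0

The virtio-net-device frontend binds to virtio-mmio on microvm while
the datapath goes to the vDPA hardware.)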

> I'm not familiar with NVMe, but it should go the same way instead of
> depending on VFIO.

There are pros/cons with both approaches. I'm not even sure all VIRTIO
hardware vendors will use vDPA. Two examples:
1. A tiny VMM with strict security requirements. The VFIO approach is
less complex because the VMM is much less involved with the device.
2. A vendor shipping a hardware VIRTIO PCI device as a PF: no SR-IOV,
no software VFs, just a single instance. A passthrough PCI device is a
much simpler way to deliver this device than vDPA + vhost + VMM
support.

vDPA is very useful but there are situations when the VFIO approach is
attractive too.

Stefan


