From: Alex Williamson
Subject: Re: RFC: use VFIO over a UNIX domain socket to implement device offloading
Date: Thu, 14 May 2020 13:20:32 -0600

On Thu, 14 May 2020 09:32:15 -0700
John G Johnson <address@hidden> wrote:

>       Thanos and I have made some changes to the doc in response to the
> feedback we’ve received.  The biggest difference is that it is less reliant
> on the reader being familiar with the current VFIO implementation.  We’d
> appreciate any additional feedback you could give on the changes.  Thanks
> in advance.
> 
>                                                       Thanos and JJ
> 
> 
> The link remains the same:
> 
> https://docs.google.com/document/d/1FspkL0hVEnZqHbdoqGLUpyC38rSk_7HhY471TsVwyK8/edit?usp=sharing

Hi,

I'm confused by VFIO_USER_ADD_MEMORY_REGION vs VFIO_USER_IOMMU_MAP_DMA.
The former seems intended to provide the server with access to the
entire GPA space, while the latter indicates an IOVA to GPA mapping of
those regions.  Doesn't this break the basic isolation of a vIOMMU?
This essentially says to me "here's all the guest memory, but please
only access these regions for which we're providing DMA mappings".
That invites abuse.
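
For comparison, a map-only model in which each mapping carries its own
backing fd would avoid handing the server the whole GPA space up front.
A rough sketch, purely illustrative and not from the draft spec (all
field names are mine):

#include <stdint.h>

/*
 * Hypothetical sketch (not from the draft spec): a DMA map message that
 * describes only the range being mapped, with the backing memory passed
 * as a file descriptor in the message meta-data.  The server then never
 * holds guest memory for which it has no DMA mapping.
 */
struct dma_map_msg {
    uint64_t iova;      /* device-visible address of the mapping        */
    uint64_t size;      /* length of the mapping in bytes               */
    uint64_t fd_offset; /* offset into the fd carried with the message  */
    uint32_t prot;      /* read/write permissions granted to the device */
};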

Also regarding VFIO_USER_ADD_MEMORY_REGION, it's not clear to me how
"an array of file descriptors will be sent as part of the message
meta-data" works.  Also consider s/SUB/DEL/.  Why is the Device ID in
the table specified as 0?  How does a client learn their Device ID?
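
I assume "meta-data" means SCM_RIGHTS ancillary data on the UNIX socket;
if so, the spec should say so explicitly.  The usual pattern looks like
this:

#include <string.h>
#include <sys/socket.h>
#include <sys/uio.h>

/* Send a message with one file descriptor attached as SCM_RIGHTS
 * ancillary data on a UNIX domain socket. */
static ssize_t send_with_fd(int sock, const void *buf, size_t len, int fd)
{
    struct iovec iov = { .iov_base = (void *)buf, .iov_len = len };
    union {
        char buf[CMSG_SPACE(sizeof(int))];
        struct cmsghdr align;
    } u;
    struct msghdr msg = {
        .msg_iov = &iov,
        .msg_iovlen = 1,
        .msg_control = u.buf,
        .msg_controllen = sizeof(u.buf),
    };
    struct cmsghdr *cmsg = CMSG_FIRSTHDR(&msg);

    cmsg->cmsg_level = SOL_SOCKET;
    cmsg->cmsg_type = SCM_RIGHTS;
    cmsg->cmsg_len = CMSG_LEN(sizeof(int));
    memcpy(CMSG_DATA(cmsg), &fd, sizeof(int));

    return sendmsg(sock, &msg, 0);
}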

For VFIO_USER_DEVICE_GET_REGION_INFO (or anything else making use of a
capability chain), the cap_offset and next pointers within the chain
need to specify what their offsets are relative to (i.e. the start of the
packet, the start of the vfio-compatible data structure, etc.).  I
assume the latter for client compatibility.
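
For reference, the kernel ABI makes these offsets relative to the start
of the info structure, so walking the chain looks roughly like this
(assuming vfio-user keeps the same convention):

#include <stddef.h>
#include <stdint.h>
#include <linux/vfio.h>

/* Walk a region info capability chain; cap_offset and each header's
 * 'next' are taken as offsets from the start of vfio_region_info. */
static struct vfio_info_cap_header *
find_region_cap(struct vfio_region_info *info, uint16_t id)
{
    uint32_t off;

    if (!(info->flags & VFIO_REGION_INFO_FLAG_CAPS)) {
        return NULL;
    }
    for (off = info->cap_offset; off; off = 0) {
        struct vfio_info_cap_header *hdr =
            (struct vfio_info_cap_header *)((char *)info + off);
        if (hdr->id == id) {
            return hdr;
        }
        off = hdr->next;
        if (!off) {
            break;
        }
    }
    return NULL;
}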

Also on REGION_INFO, offset is specified as "the base offset to be
given to the mmap() call for regions with the MMAP attribute".  Base
offset from what?  Is the mmap performed on the socket fd?  Do we not
allow read/write access, such that we need to use VFIO_USER_MMIO_READ/WRITE
instead?
Why do we specify "MMIO" in those operations versus simply "REGION"?
Are we arbitrarily excluding support for I/O port regions or device
specific regions?  If these commands replace direct read and write to
an fd offset, how is PCI config space handled?
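
For comparison, with kernel VFIO the region offset is simply an offset
into the device fd, used both for read/write and as the mmap offset:

#include <stddef.h>
#include <stdint.h>
#include <sys/ioctl.h>
#include <sys/mman.h>
#include <linux/vfio.h>

/* Kernel VFIO usage for reference: the offset returned by
 * VFIO_DEVICE_GET_REGION_INFO is relative to the device fd and is
 * passed directly to mmap() for regions advertising the MMAP flag. */
static void *map_region(int device_fd, uint32_t index)
{
    struct vfio_region_info info = {
        .argsz = sizeof(info),
        .index = index,
    };
    void *addr;

    if (ioctl(device_fd, VFIO_DEVICE_GET_REGION_INFO, &info) < 0 ||
        !(info.flags & VFIO_REGION_INFO_FLAG_MMAP)) {
        return NULL;
    }
    addr = mmap(NULL, info.size, PROT_READ | PROT_WRITE, MAP_SHARED,
                device_fd, info.offset);
    return addr == MAP_FAILED ? NULL : addr;
}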

VFIO_USER_MMIO_READ specifies the count field is zero and the reply
will include the count specifying the amount of data read.  How does
the client specify how much data to read?  Via message size?
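
If it's via message size, the spec should say so; the more obvious
encoding would be an explicit count in the request, e.g. (hypothetical
layout, field names are mine):

#include <stdint.h>

/* Hypothetical read request (not from the draft spec): the client states
 * how many bytes it wants; the reply echoes count and carries the data. */
struct mmio_read_req {
    uint32_t region; /* region index             */
    uint64_t offset; /* offset within the region */
    uint32_t count;  /* number of bytes to read  */
};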

VFIO_USER_DMA_READ/WRITE, is the address a GPA or IOVA?  IMO the device
should only ever have access via IOVA, which implies a DMA mapping
exists for the device.  Can you provide an example of why we need these
commands?  There seems little point to this interface if a device cannot
directly interact with VM memory.
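
What I'd expect is that the server resolves device DMA strictly against
the mappings it has been given, something like the sketch below
(structures and names are mine, not from the spec):

#include <stddef.h>
#include <stdint.h>

/* Hypothetical record of one DMA mapping the client has established. */
struct dma_mapping {
    uint64_t iova;   /* device-visible address      */
    uint64_t size;   /* length of the mapping       */
    void    *vaddr;  /* local mmap of the shared fd */
};

/*
 * Sketch: the server translates a device DMA access using only the
 * mappings it was given; anything outside those mappings is rejected,
 * which is what preserves vIOMMU isolation.
 */
static void *iova_to_vaddr(struct dma_mapping *maps, size_t n,
                           uint64_t iova, uint64_t len)
{
    for (size_t i = 0; i < n; i++) {
        if (iova >= maps[i].iova &&
            iova + len <= maps[i].iova + maps[i].size) {
            return (char *)maps[i].vaddr + (iova - maps[i].iova);
        }
    }
    return NULL; /* not mapped for this device */
}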

The IOMMU commands should be unnecessary, a vIOMMU should be
transparent to the server by virtue that the device only knows about
IOVA mappings accessible to the device.  Requiring the client to expose
all memory to the server implies that the server must always be trusted.

Interrupt info format, s/type/index/, s/vector/subindex/
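
That matches the existing kernel naming, where interrupts are addressed
by an index and a subindex range, e.g.:

#include <linux/vfio.h>

/* Kernel VFIO naming for reference: an interrupt is selected by index
 * (INTx/MSI/MSI-X/...) plus a subindex range given by start and count. */
static const struct vfio_irq_set msix_trigger = {
    .argsz = sizeof(struct vfio_irq_set),
    .flags = VFIO_IRQ_SET_DATA_NONE | VFIO_IRQ_SET_ACTION_TRIGGER,
    .index = VFIO_PCI_MSIX_IRQ_INDEX, /* which interrupt type    */
    .start = 0,                       /* first subindex (vector) */
    .count = 1,                       /* number of subindices    */
};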

In addition to the unused ioctls, the entire concept of groups and
containers is not found in this specification.  To some degree that
makes sense, as even mdevs, and typically SR-IOV VFs, have a 1:1
device-to-group relationship.  However, the container is very much involved in
the development of migration support, where it's the container that
provides dirty bitmaps.  Since we're doing map and unmap without that
container concept here, perhaps we'd equally apply those APIs to this
same socket.  Thanks,

Alex



