Re: [RFC PATCH v2 0/5] vhost-user: Add SHMEM_MAP/UNMAP requests


From: Stefan Hajnoczi
Subject: Re: [RFC PATCH v2 0/5] vhost-user: Add SHMEM_MAP/UNMAP requests
Date: Fri, 6 Sep 2024 09:15:32 -0400

On Fri, 6 Sept 2024 at 03:06, Albert Esteve <aesteve@redhat.com> wrote:
> On Thu, Sep 5, 2024 at 6:39 PM Stefan Hajnoczi <stefanha@redhat.com> wrote:
>>
>> On Tue, Sep 03, 2024 at 10:42:34AM +0200, Albert Esteve wrote:
>> > Hello all,
>> >
>> > Sorry, I have been a bit disconnected from this thread as I was on
>> > vacation and then had to switch tasks for a while.
>> >
>> > I will try to go through all comments and address them for the first
>> > non-RFC drop of this patch series.
>> >
>> > But I was discussing this with some colleagues. It turns out rust-vmm's
>> > vhost-user-gpu will potentially use this soon, and a rust-vmm/vhost patch
>> > has already been posted:
>> > https://github.com/rust-vmm/vhost/pull/251.
>> > So I think it may make sense to:
>> > 1. Split off the vhost-user documentation patch once it is settled. Since
>> >    it is taken as the official spec, having it upstreamed independently of
>> >    the implementation will let other projects work on/integrate their own
>> >    code.
>> > 2. Split the READ_/WRITE_MEM messages from the SHMEM_MAP/_UNMAP patches.
>> >    If I remember correctly, this addresses a virtio-fs-specific issue
>> >    that will not impact virtio-gpu, virtio-media, or any other device.
>>
>> This is an architectural issue that arises from exposing VIRTIO Shared
>> Memory Regions in vhost-user. It was first seen with Linux virtiofs but
>> it could happen with other devices and/or guest operating systems.
>>
>> Any VIRTIO Shared Memory Region that can be mmapped into Linux userspace
>> may trigger this issue. Userspace may write(2) to an O_DIRECT file with
>> the mmap as the source. The vhost-user-blk device will not be able to
>> access the source device's VIRTIO Shared Memory Region and will fail.
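
To make that failure mode concrete, here is a minimal guest-side sketch of
the sequence that triggers it. Paths are hypothetical and error handling is
omitted; the point is only that the write(2) source buffer lives in another
device's VIRTIO Shared Memory Region:

  /* Guest userspace. 'src' is a file on a DAX-enabled virtiofs mount, so
   * the mmap below is backed by the virtiofs VIRTIO Shared Memory Region.
   * 'dst' is a file on a vhost-user-blk disk, opened O_DIRECT so the block
   * request carries the guest-physical addresses of 'buf' directly. */
  #define _GNU_SOURCE
  #include <fcntl.h>
  #include <sys/mman.h>
  #include <unistd.h>

  int main(void)
  {
      int src = open("/mnt/virtiofs/data.bin", O_RDONLY);
      void *buf = mmap(NULL, 4096, PROT_READ, MAP_SHARED, src, 0);

      int dst = open("/mnt/blk/out.bin", O_WRONLY | O_CREAT | O_DIRECT, 0644);

      /* The pages behind 'buf' fall inside the virtiofs Shared Memory
       * Region, which is not in vhost-user-blk's memory table, so the
       * back-end cannot translate them and the request fails. */
      write(dst, buf, 4096);
      return 0;
  }
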
>>
>> > So it may make sense to separate them so that one does not stall the
>> > other. I will try to have both integrated in the mid term.
>>
>> If READ_/WRITE_MEM is a pain to implement (I think it is in the
>> vhost-user back-end, even though I've been a proponent of it), then
>> another way to deal with this issue is to specify that upon receiving
>> MAP/UNMAP messages, the vhost-user front-end must update the vhost-user
>> memory tables of all other vhost-user devices. That way vhost-user
>> devices will be able to access VIRTIO Shared Memory Regions mapped by
>> other devices.
>>
>> Implementing this in QEMU should be much easier than implementing
>> READ_/WRITE_MEM support in device back-ends.
>>
>> This will be slow and scale poorly but performance is only a problem for
>> devices that frequently MAP/UNMAP like virtiofs. Will virtio-gpu and
>> virtio-media use MAP/UNMAP often at runtime? They might be able to get
>> away with this simple solution.
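
A self-contained sketch of that bookkeeping on the front-end side, with
invented types rather than QEMU code: the front-end tracks each back-end's
memory table and, on SHMEM_MAP from one back-end, appends the new region to
every other back-end's table before re-sending it (e.g. with a
VHOST_USER_SET_MEM_TABLE message):

  /* Invented types, not QEMU code: just the data flow of the
   * "front-end propagates mappings" approach. */
  #include <stdint.h>
  #include <stddef.h>

  #define MAX_REGIONS  64
  #define MAX_BACKENDS 16

  typedef struct {
      uint64_t guest_addr;   /* guest-physical address of the new mapping */
      uint64_t size;
      int      fd;           /* fd for the back-end to mmap               */
      uint64_t fd_offset;
  } MemRegion;

  typedef struct {
      MemRegion regions[MAX_REGIONS];
      size_t    nregions;
  } BackendMemTable;

  static BackendMemTable tables[MAX_BACKENDS];
  static size_t nbackends;

  /* Called by the front-end when back-end 'src' issues SHMEM_MAP. */
  static void propagate_shmem_map(size_t src, MemRegion newreg)
  {
      for (size_t i = 0; i < nbackends; i++) {
          if (i == src || tables[i].nregions == MAX_REGIONS) {
              continue;
          }
          tables[i].regions[tables[i].nregions++] = newreg;
          /* ...then re-send the updated memory table to back-end i.
           * SHMEM_UNMAP would do the reverse walk and drop the region. */
      }
  }
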
>>
>> I'd be happy with that. If someone wants to make virtiofs DAX faster,
>> they can implement READ/WRITE_MEM or another solution later, but let's
>> at least make things correct from the start.
>
>
> I agree. I want it to be correct first. If you agree on splitting the spec
> bits from this patch, I'm already happy. I suggested splitting the
> READ_/WRITE_MEM messages because I thought it was a virtiofs-specific issue.
>
> The alternative that you proposed is interesting. I'll take it into account.
> But I think I'd prefer to go for the better solution and, if I get too
> entangled, switch to the easier implementation.

Great. The difficult part of implementing READ_/WRITE_MEM messages is
modifying libvhost-user and rust-vmm's vhost crate to send the new
messages when address translation fails. This needs to cover all
memory accesses (including vring struct accesses). That code may be a
few levels down in the call stack and assume it can always load/store
directly from mmapped memory.
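
Roughly, every guest-memory access helper in the back-end needs a slow path
along these lines (invented names, not the actual libvhost-user or
vhost-crate API):

  #include <stdint.h>
  #include <string.h>
  #include <linux/virtio_ring.h>

  /* Returns a host pointer if 'gpa' is covered by an mmapped memory-table
   * region, NULL otherwise (e.g. it points into another device's VIRTIO
   * Shared Memory Region). */
  void *translate_gpa(uint64_t gpa, size_t len);

  /* Synchronous READ_MEM round-trip: the front-end copies the bytes on
   * the back-end's behalf over the vhost-user socket. */
  int vu_read_mem_msg(uint64_t gpa, void *buf, size_t len);

  static int guest_memcpy_from(void *dst, uint64_t gpa, size_t len)
  {
      void *src = translate_gpa(gpa, len);
      if (src) {
          memcpy(dst, src, len);   /* fast path: direct mmap access */
          return 0;
      }
      return vu_read_mem_msg(gpa, dst, len);   /* slow path */
  }

  /* The hard part is that call sites like this one, buried in the vring
   * code, currently assume the fast path can never fail: */
  static int read_desc(uint64_t desc_gpa, struct vring_desc *desc)
  {
      return guest_memcpy_from(desc, desc_gpa, sizeof(*desc));
  }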

>
> I think we could do this in 2 patches:
> 1. Split out the documentation bits for SHMEM_MAP/_UNMAP. The
>    implementation for these messages will go into the second patch.
> 2. The implementation patch: keep going for the time being with
>    READ_/WRITE_MEM support, and keep the documentation for that
>    within this patch. This way, if we switch to the front-end
>    updating the vhost-user memory tables, we won't have committed to
>    any specific solution even if patch 1 has already been merged.

I'm happy as long as the vhost-user spec patch that introduces
MAP/UNMAP also covers a solution for the memory access problem (either
READ_/WRITE_MEM or propagating mappings to all vhost-user back-ends).

Stefan

>
> BR,
> Albert.
>
>>
>>
>> Stefan
>>
>> >
>> > WDYT?
>> >
>> > BR,
>> > Albert.
>> >
>> > On Tue, Jul 16, 2024 at 3:21 AM David Stevens <stevensd@chromium.org> wrote:
>> >
>> > > On Fri, Jul 12, 2024 at 2:47 PM Michael S. Tsirkin <mst@redhat.com> wrote:
>> > > >
>> > > > On Fri, Jul 12, 2024 at 11:06:49AM +0900, David Stevens wrote:
>> > > > > On Thu, Jul 11, 2024 at 7:56 PM Alyssa Ross <hi@alyssa.is> wrote:
>> > > > > >
>> > > > > > Adding David Stevens, who implemented SHMEM_MAP and SHMEM_UNMAP in
>> > > > > > crosvm a couple of years ago.
>> > > > > >
>> > > > > > David, I'd be particularly interested in your thoughts on the MEM_READ
>> > > > > > and MEM_WRITE commands, since as far as I know crosvm doesn't implement
>> > > > > > anything like that.  The discussion leading to those being added starts
>> > > > > > here:
>> > > > > >
>> > > > > > https://lore.kernel.org/qemu-devel/20240604185416.GB90471@fedora.redhat.com/
>> > > > > >
>> > > > > > It would be great if this could be standardised between QEMU and crosvm
>> > > > > > (and therefore have a clearer path toward being implemented in other VMMs)!
>> > > > >
>> > > > > Setting aside vhost-user for a moment, the DAX example given by Stefan
>> > > > > won't work in crosvm today.
>> > > > >
>> > > > > Is universal access to virtio shared memory regions actually mandated
>> > > > > by the virtio spec? Copying from virtiofs DAX to virtiofs sharing
>> > > > > seems reasonable enough, but what about virtio-pmem to virtio-blk?
>> > > > > What about screenshotting a framebuffer in virtio-gpu shared memory to
>> > > > > virtio-scsi? I guess with some plumbing in the VMM, it's solvable in a
>> > > > > virtualized environment. But what about when you have real hardware
>> > > > > that speaks virtio involved? That's outside my wheelhouse, but it
>> > > > > doesn't seem like that would be easy to solve.
>> > > >
>> > > > Yes, it can work for physical devices if allowed by host configuration.
>> > > > E.g. VFIO supports that I think. Don't think VDPA does.
>> > >
>> > > I'm sure it can work, but that sounds more like a SHOULD (MAY?),
>> > > rather than a MUST.
>> > >
>> > > > > For what it's worth, my interpretation of the target scenario:
>> > > > >
>> > > > > > Other backends don't see these mappings. If the guest submits a vring
>> > > > > > descriptor referencing a mapping to another backend, then that backend
>> > > > > > won't be able to access this memory
>> > > > >
>> > > > > is that it's omitting how the implementation is reconciled with
>> > > > > section 2.10.1 of v1.3 of the virtio spec, which states that:
>> > > > >
>> > > > > > References into shared memory regions are represented as offsets from
>> > > > > > the beginning of the region instead of absolute memory addresses. Offsets
>> > > > > > are used both for references between structures stored within shared
>> > > > > > memory and for requests placed in virtqueues that refer to shared memory.
>> > > > >
>> > > > > My interpretation of that statement is that putting raw guest physical
>> > > > > addresses corresponding to virtio shared memory regions into a vring
>> > > > > is a driver spec violation.
>> > > > >
>> > > > > -David
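
As an illustration of that reading (invented structs, not from any device
spec): a request placed in a virtqueue that refers to shared memory would
carry a (shmid, offset) pair rather than a raw guest-physical address.

  #include <stdint.h>

  /* Offset-based reference, as section 2.10.1 describes: */
  struct shm_ref {
      uint8_t  shmid;    /* which VIRTIO Shared Memory Region of the device */
      uint64_t offset;   /* offset from the start of that region            */
      uint64_t len;
  };

  /* What the interpretation above calls a driver spec violation: */
  struct shm_ref_gpa {
      uint64_t guest_phys_addr;  /* absolute address inside the region */
      uint64_t len;
  };
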
>> > > >
>> > > > This really applies within a device, I think. It should be clarified ...
>> > >
>> > > You mean that a virtio device can use absolute memory addresses for
>> > > other devices' shared memory regions, but it can't use absolute memory
>> > > addresses for its own shared memory regions? That's a rather strange
>> > > requirement. Or is the statement simply giving an addressing strategy
>> > > that device type specifications are free to ignore?
>> > >
>> > > -David
>> > >
>> > >


