Re: [PATCH 2/2] 9pfs: fix init_in_iov_from_pdu truncating size


From: Christian Schoenebeck
Subject: Re: [PATCH 2/2] 9pfs: fix init_in_iov_from_pdu truncating size
Date: Thu, 14 May 2020 16:24:59 +0200

On Thursday, 14 May 2020 01:31:30 CEST Stefano Stabellini wrote:
> > If there is really no way with Xen to ensure there's always a buffer size
> > according to msize,
> 
> Thank you for the long writeup. Let's see if we can come up with a
> good plan.
> 
> The Xen transport
> (https://xenbits.xenproject.org/docs/4.13-testing/misc/9pfs.html) has a
> shared memory area with 64 pages, so 256KB. It is split in half: 128KB
> for requests and 128KB for responses.

That's very little. It also means you will never be able to achieve decent 
9pfs performance with Xen, because that requires a much larger msize of 
roughly msize >= 4MB, depending on the underlying I/O hardware, with a 
growing tendency (due to rising I/O hardware bandwidth).
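To put those numbers into perspective, here is a minimal C sketch of the 
transport geometry described above (constant names are assumed for 
illustration, not taken from the actual Xen headers):

#include <stdio.h>

#define XEN_PAGE_SIZE      4096
#define XEN_9PFS_NUM_PAGES 64    /* shared memory area, as documented */

int main(void)
{
    size_t total    = (size_t)XEN_9PFS_NUM_PAGES * XEN_PAGE_SIZE; /* 256 KiB */
    size_t out_ring = total / 2; /* requests:  128 KiB */
    size_t in_ring  = total / 2; /* responses: 128 KiB */

    printf("total=%zu out=%zu in=%zu\n", total, out_ring, in_ring);
    return 0;
}

With the 128 KiB response half, a single maximum-size response at 
msize=131072 exactly fills the ring, so any other queued data already makes 
it not fit, which matches the error below.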

> The original error message is the following:
> 
>   xen be: 9pfs-0: Xen 9pfs request type 116 needs 126987 bytes, buffer has
>   126965
> 
> and msize was set to 131072 in the Linux client.
> 
> So transport_size == msize. However, there can be multiple requests and
> responses inflight at any given time. It means that even with
> transport_size=128KB and msize=4KB we could still have so many outstanding
> requests that the available memory is less than 4KB.
> 
> I admit that with transport_size=128KB and msize=4KB it would be so
> unlikely to happen that it would be close to impossible to reproduce the
> error. But theoretically it is possible.

Even though msize=4kiB is the default value with Linux clients, I would never 
use such a small value, as it would cause extremely poor 9pfs performance. 
BTW, my plan is to soon log a performance warning if a client is using such a 
small msize.

> If we can't come up with any better ideas, we could simply limit the
> msize in the Linux client to something like transport_size/2 or
> transport_size/4, but it doesn't feel right.

That sounds like it would just reduce the likelihood of the low-buffer 
situation happening with Xen; it would not prevent it though. For instance, 
with msize = transport_size/2, two outstanding maximum-size responses would 
already fill the response ring entirely.
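To illustrate with the numbers from your log, a small sketch of the 
free-space accounting (names assumed; the real code derives this from the 
ring's producer/consumer indices):

#include <stdint.h>
#include <stdio.h>

#define RING_SIZE (128 * 1024)  /* response half of the shared area */

/* free space = ring size minus what the client has not consumed yet;
 * the indices grow monotonically, unsigned wraparound keeps this safe */
static size_t ring_free(uint32_t prod, uint32_t cons)
{
    return RING_SIZE - (size_t)(prod - cons);
}

int main(void)
{
    uint32_t cons = 0;
    uint32_t prod = 4107;  /* bytes queued but not yet read by the client */

    /* 131072 - 4107 = 126965 free, less than the 126987 the response needs */
    printf("free: %zu bytes\n", ring_free(prod, cons));
    return 0;
}

So any outstanding, not-yet-consumed response shrinks what the next one can 
use, no matter how small msize is.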

As far as I understand you now, the problem with the Xen transport seems to 
be that clients can submit requests more quickly than the 9pfs server can 
process them. That would change the overall situation completely, because my 
previously suggested hack of delaying delivery (yield) was based on the 
assumption that this low transport buffer scenario happens only once at boot, 
not potentially again and again at any later point.

> > As you can see it all boils down to one thing: better look for a solution
> > for Xen to deliver the required buffer size, or in worst case: yield the
> > coroutine, i.e. delay delivery of response until that can be assured by
> > Xen.
> That might be the best option. But it would have to be
> xen_9pfs_init_in_iov_from_pdu calling qemu_coroutine_yield() in a loop?
> Like:
> 
> 
> diff --git a/hw/9pfs/xen-9p-backend.c b/hw/9pfs/xen-9p-backend.c
> index 18fe5b7c92..ef1db47d0c 100644
> --- a/hw/9pfs/xen-9p-backend.c
> +++ b/hw/9pfs/xen-9p-backend.c
> @@ -197,18 +197,13 @@ static void xen_9pfs_init_in_iov_from_pdu(V9fsPDU *pdu,
>      g_free(ring->sg);
> 
>      ring->sg = g_new0(struct iovec, 2);
> -    xen_9pfs_in_sg(ring, ring->sg, &num, pdu->idx, *size);
> 
> +again:
> +    xen_9pfs_in_sg(ring, ring->sg, &num, pdu->idx, *size);
>      buf_size = iov_size(ring->sg, num);
> -    if (buf_size  < P9_IOHDRSZ) {
> -        xen_pv_printf(&xen_9pfs->xendev, 0, "Xen 9pfs request type %d"
> -                "needs %zu bytes, buffer has %zu, less than minimum\n",
> -                pdu->id, *size, buf_size);
> -        xen_be_set_state(&xen_9pfs->xendev, XenbusStateClosing);
> -        xen_9pfs_disconnect(&xen_9pfs->xendev);
> -    }
>      if (buf_size  < *size) {
> -        *size = buf_size;
> +        qemu_coroutine_yield();
> +        goto again;
>      }
> 
>      *piov = ring->sg;
> 
> 
> How is the coroutine going to get scheduled again? Is it automatic or do
> we need to call it again explicitly when we get the next notification
> from the client, which should arrive as soon as the client finishes
> reading the responses?

I "think" the coroutine would be resumed in the next main loop iteration. But 
I am still reading up on qemu's coroutine implementation details myself. So 
that might work, yes, even though in a hackish way.
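If it turns out the yield is not resumed automatically, an explicit re-entry 
from the event channel handler might look roughly like this (Xen9pfsRing and 
its fields are assumed for illustration; this is not the actual code):

#include "qemu/osdep.h"
#include "qemu/coroutine.h"

typedef struct Xen9pfsRing {
    Coroutine *co;  /* coroutine waiting for the client to free ring space */
    /* ... actual ring state ... */
} Xen9pfsRing;

/* in xen_9pfs_init_in_iov_from_pdu(), instead of yielding blindly: */
static void coroutine_fn wait_for_free_space(Xen9pfsRing *ring)
{
    ring->co = qemu_coroutine_self();
    qemu_coroutine_yield();  /* suspended until re-entered below */
    ring->co = NULL;
}

/* in the event channel handler, once the client signals consumed data: */
static void notify_free_space(Xen9pfsRing *ring)
{
    if (ring->co) {
        qemu_coroutine_enter(ring->co);
    }
}

That would resume the coroutine exactly when the client has made room, 
instead of relying on it being polled by the main loop.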

And as explained above, if this situation does not only happen at boot, then 
you might want to consider a different approach. BTW, there is also 
yield_until_fd_readable(int fd). However, the obvious first step would be to 
increase the Xen transport capacity tremendously.

It looks like this issue will still take quite some time to get fixed for 
Xen. If you don't mind, I'll send out a patch to revert the truncation on the 
virtio side, so that at least this bug is fixed for virtio ASAP.

Best regards,
Christian Schoenebeck
