Re: [Qemu-devel] [Qemu-block] [PATCH] block: posix: Always allocate the
From: John Snow
Subject: Re: [Qemu-devel] [Qemu-block] [PATCH] block: posix: Always allocate the first block
Date: Fri, 16 Aug 2019 19:00:55 -0400
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:60.0) Gecko/20100101 Thunderbird/60.8.0
On 8/16/19 6:45 PM, Nir Soffer wrote:
> On Sat, Aug 17, 2019 at 12:57 AM John Snow <address@hidden> wrote:
>
> On 8/16/19 5:21 PM, Nir Soffer wrote:
> > When creating an image with preallocation "off" or "falloc", the first
> > block of the image is typically not allocated. When using Gluster
> > storage backed by XFS filesystem, reading this block using direct I/O
> > succeeds regardless of request length, fooling alignment detection.
> >
> > In this case we fall back to a safe value (4096) instead of the optimal
> > value (512), which may lead to unneeded data copying when aligning
> > requests. Allocating the first block avoids the fallback.
> >
>
> Where does this detection/fallback happen? (Can it be improved?)
>
>
> In raw_probe_alignment().
>
> This patch explains the issue:
> https://lists.nongnu.org/archive/html/qemu-block/2019-08/msg00568.html
>
> Here Kevin and I discussed ways to improve it:
> https://lists.nongnu.org/archive/html/qemu-block/2019-08/msg00426.html
>
Thanks for the reading!
That does help explain this patch better.
> > When using preallocation=off, we always allocate at least one
> > filesystem block:
> >
> > $ ./qemu-img create -f raw test.raw 1g
> > Formatting 'test.raw', fmt=raw size=1073741824
> >
> > $ ls -lhs test.raw
> > 4.0K -rw-r--r--. 1 nsoffer nsoffer 1.0G Aug 16 23:48 test.raw
> >
> > I did quick performance tests for these flows:
> > - Provisioning a VM with a new raw image.
> > - Copying disks with qemu-img convert to new raw target image
> >
> > I installed Fedora 29 server on raw sparse image, measuring the time
> > from clicking "Begin installation" until the "Reboot" button appears:
> >
> > Before(s)   After(s)   Diff(%)
> > -------------------------------
> >       356        389      +8.4
> >
> > I ran this only once, so we cannot tell much from these results.
> >
>
> That seems like a pretty big difference for just having pre-allocated a
> single block. What was the actual command line / block graph for
> that test?
>
>
> Having the first block allocated changes the alignment.
>
> Before this patch, we detect request_alignment=1, so we fall back to 4096.
> Then we detect buf_align=1, so we fall back to the value of request_alignment.
>
> The guest sees a disk with:
> logical_block_size = 512
> physical_block_size = 512
>
> But qemu uses:
> request_alignment = 4096
> buf_align = 4096
>
> storage uses:
> logical_block_size = 512
> physical_block_size = 512
>
> If the guest does direct I/O using 512-byte alignment, qemu has to copy
> the buffers to align them to 4096 bytes.
>
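To illustrate the cost described above, here is a minimal sketch of the padding arithmetic only. It is not QEMU's code; the function name align_request is hypothetical, and it just models how a 512-byte guest request grows when it must be rounded to a 4096-byte request_alignment.

```python
def align_request(offset, length, request_alignment):
    """Return (aligned_offset, aligned_length) covering [offset, offset + length).

    Hypothetical helper: rounds the start down and the end up to a multiple
    of request_alignment. This models the padding only, not the data copy.
    """
    if request_alignment <= 0 or offset < 0 or length < 0:
        raise ValueError("invalid request")
    start = offset - offset % request_alignment
    end = offset + length
    # Ceiling division, then scale back up to a multiple of the alignment.
    aligned_end = -(-end // request_alignment) * request_alignment
    return start, aligned_end - start


# A 512-byte guest request at offset 512:
print(align_request(512, 512, 4096))  # (0, 4096): padded to a full 4096 bytes
print(align_request(512, 512, 512))   # (512, 512): passed through unchanged
```

With request_alignment = 512 the request already matches what the guest sent, so no padding (and no extra copy) is needed in this model.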
> After this patch, qemu detects the alignment correctly, so we have:
>
> guest
> logical_block_size = 512
> physical_block_size = 512
>
> qemu
> request_alignment = 512
> buf_align = 512
>
> storage:
> logical_block_size = 512
> physical_block_size = 512
>
> We expect this to be more efficient because qemu does not have to emulate
> anything.
>
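The two cases above (before and after the patch) can be modelled with a small sketch. This is an illustration under stated assumptions, not the code in raw_probe_alignment(): the probe sizes, the function names, and the error raised when nothing succeeds are all hypothetical. Only the two fallbacks (4096 for request_alignment, then request_alignment for buf_align) come from the description in this thread.

```python
PROBE_SIZES = (1, 512, 1024, 2048, 4096)  # assumed probe order, smallest first
SAFE_ALIGNMENT = 4096                     # safe fallback value named in the thread


def detect(io_succeeds, fallback):
    """Return the smallest probe size that works, or fallback if that size is 1.

    A result of 1 means the probe was fooled (every size succeeded), so the
    detected value cannot be trusted.
    """
    for size in PROBE_SIZES:
        if io_succeeds(size):
            return size if size != 1 else fallback
    raise RuntimeError("could not detect alignment")  # assumption for this sketch


def probe_alignment(request_io_ok, buffer_io_ok):
    """Model the two-step detection: request_alignment first, then buf_align."""
    request_alignment = detect(request_io_ok, SAFE_ALIGNMENT)
    buf_align = detect(buffer_io_ok, request_alignment)
    return request_alignment, buf_align


def always_ok(size):
    # Unallocated first block: every direct I/O probe succeeds.
    return True


def sector_512(size):
    # Allocated first block on 512-byte storage: only multiples of 512 succeed.
    return size % 512 == 0


print(probe_alignment(always_ok, always_ok))    # (4096, 4096): before the patch
print(probe_alignment(sector_512, sector_512))  # (512, 512): after the patch
```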
> Was this over a network that could explain the variance?
>
>
> Maybe. This is a complete install of Fedora 29 server; I'm not sure if the
> installation accesses the network.
>
> > The second test was cloning the installation image with qemu-img
> > convert, doing 10 runs:
> >
> > for i in $(seq 10); do
> >     rm -f dst.raw
> >     sleep 10
> >     time ./qemu-img convert -f raw -O raw -t none -T none src.raw dst.raw
> > done
> >
> > Here is a table comparing the total time spent:
> >
> > Type    Before(s)   After(s)    Diff(%)
> > ---------------------------------------
> > real      530.028    469.123      -11.4
> > user       17.204     10.768      -37.4
> > sys        17.881      7.011      -60.7
> >
> > Here we see a very clear improvement in CPU usage.
> >
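For anyone rechecking the Diff(%) column of the qemu-img convert table, a small sketch follows. It assumes the column is the change relative to the "Before" value, truncated to one decimal place; that reading is an inference from the three rows, not something stated in the thread, and diff_percent is a hypothetical helper name.

```python
import math


def diff_percent(before, after):
    """Change from before to after, in percent of before, truncated to 0.1."""
    change = (after - before) / before * 100.0
    # Truncate toward zero rather than round (assumed, to match the table).
    return math.trunc(change * 10) / 10


# Timings (seconds) copied from the qemu-img convert table.
timings = {
    "real": (530.028, 469.123),
    "user": (17.204, 10.768),
    "sys": (17.881, 7.011),
}

for name, (before, after) in timings.items():
    print(f"{name:4s} {diff_percent(before, after):+.1f}")
```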
>
> Hard to argue much with that. I feel a little strange trying to force
> the allocation of the first block, but I suppose in practice "almost no
> preallocation" is indistinguishable from "exactly no preallocation" if
> you squint.
>
>
> Right.
>
> The real issue is that filesystems and block devices do not expose the
> alignment requirement for direct I/O, so we need to use these hacks and
> assumptions.
>
> With local XFS we use xfsctl(XFS_IOC_DIOINFO) to get request_alignment,
> but this does not help for an XFS filesystem used by Gluster on the
> server side.
>
> I hope that Niels is working on adding a similar ioctl for Gluster, so it
> can expose the properties of the remote filesystem.
>
> Nir
That sounds quite a bit less hacky, but I agree we still have to do what
we can in the meantime.
(It looks like you've been hashing this out with Kevin for a while, so
I'm going to sheepishly defer to his judgment on this patch. While I
think it's probably a fine trade-off, I can't really say off-hand if
there's a better, more targeted way to accomplish it.)
--js