Re: [PATCH 0/1] qcow2: Skip copy-on-write when allocating a zero cluster


From: Kevin Wolf
Subject: Re: [PATCH 0/1] qcow2: Skip copy-on-write when allocating a zero cluster
Date: Wed, 19 Aug 2020 17:07:11 +0200

On 19.08.2020 at 16:25, Alberto Garcia wrote:
> On Mon 17 Aug 2020 05:53:07 PM CEST, Kevin Wolf wrote:
> >> > Or are you saying that ZERO_RANGE + pwrite on a sparse file (=
> >> > cluster allocation) is faster for you than just the pwrite alone (=
> >> > writing to already allocated cluster)?
> >> 
> >> Yes, 20% faster in my tests (4KB random writes), but in the latter
> >> case the cluster is already allocated only at the qcow2 level, not on
> >> the filesystem. preallocation=falloc is faster than
> >> preallocation=metadata (preallocation=off sits in the middle).
> >
> > Hm, this feels wrong. Doing more operations should never be faster
> > than doing less operations.
> >
> > Maybe the difference is in allocating 64k at once instead of doing a
> > separate allocation for every 4k block? But with the extent size hint
> > patches to file-posix, we should allocate 1 MB at once by default now
> > (if your test image was newly created). Can you check whether this is
> > in effect for your image file?
> 
> I checked with xfs on my computer. I'm not very familiar with that
> filesystem so I was using the default options and I didn't tune
> anything.
> 
> What I got with my tests (using fio):
> 
> - Using extent_size_hint didn't make any difference in my test case (I
>   do see a clear difference however with the test case described in
>   commit ffa244c84a).

Hm, interesting. What is your exact fio configuration? Specifically,
which iodepth are you using? I guess with a low iodepth (and O_DIRECT),
the effect of draining the queue might not be as visible.

> - preallocation=off is still faster than preallocation=metadata.

Brian, can you help us here with some input?

Essentially, what we have here is a sparse image file on XFS that is
opened with O_DIRECT (presumably - Berto, is this right?), and Berto is
seeing cases where a random write benchmark is faster if we're doing the
64k ZERO_RANGE + 4k pwrite when touching a 64k cluster for the first
time compared to always just doing the 4k pwrite. This is with a 1 MB
extent size hint.
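
To make the two patterns being compared concrete, here is a rough
model in Python. It is only a sketch, not QEMU code: plan_write(),
allocated and zero_first are hypothetical names made up for the
illustration, and the 64k cluster size is simply the one used in this
thread. It lists the host operations issued for one guest write,
either with a ZERO_RANGE over each newly touched cluster or with the
pwrite alone.

```python
CLUSTER_SIZE = 64 * 1024  # 64k clusters, as in the benchmark discussed here


def plan_write(offset, length, allocated, zero_first=True,
               cluster_size=CLUSTER_SIZE):
    """Return the host operations for one guest write (illustrative model).

    allocated is a set of cluster indexes that were already touched;
    it is updated in place. With zero_first=True, every cluster touched
    for the first time gets a whole-cluster zero_range before the data
    is written. With zero_first=False, only the pwrite is issued.
    """
    first = offset // cluster_size
    last = (offset + length - 1) // cluster_size
    ops = []
    for cluster in range(first, last + 1):
        if cluster not in allocated:
            if zero_first:
                # First touch: zero the full cluster, not just the 4k block
                ops.append(("zero_range", cluster * cluster_size, cluster_size))
            allocated.add(cluster)
    # The actual data write is the same in both patterns
    ops.append(("pwrite", offset, length))
    return ops
```

With an empty allocated set, a 4k write at offset 12288 yields a 64k
zero_range at offset 0 followed by the 4k pwrite. A later 4k write
into the same cluster yields only the pwrite. The surprising result is
that the first pattern benchmarks faster even though it issues more
operations.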

From the discussions we had the other day [1][2] I took away that your
suggestion is that we should not try to optimise things with
fallocate(), but just write the areas we really want to write and let
the filesystem deal with the sparse parts. Especially with the extent
size hint that we're now setting, I'm surprised to hear that doing a
ZERO_RANGE first still seems to improve the performance.

Do you have any idea why this is happening and what we should be doing
with this?

[1] https://bugzilla.redhat.com/show_bug.cgi?id=1850660
[2] https://bugzilla.redhat.com/show_bug.cgi?id=1666864

>   If I disable handle_alloc_space() (so there is no ZERO_RANGE used)
>   then it is much slower.

This makes some sense because then we're falling back to writing
explicit zero buffers (unless you disabled that, too).

> - With preallocation=falloc I get the same results as with
>   preallocation=metadata.

Interesting, this means that the fallocate() call costs you basically no
time. I would have expected preallocation=falloc to be a little faster.

> - preallocation=full is the fastest by far.

I guess this saves the conversion of unwritten extents to fully
allocated ones?

As the extent size hint doesn't seem to influence your test case anyway,
can I assume that ext4 behaves similarly to XFS in all four cases?

Kevin