[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Re: [PATCH for-4.2 0/4] qcow2: Fix data corruption on XFS
From: |
Vladimir Sementsov-Ogievskiy |
Subject: |
Re: [PATCH for-4.2 0/4] qcow2: Fix data corruption on XFS |
Date: |
Fri, 1 Nov 2019 10:28:57 +0000 |
01.11.2019 13:20, Max Reitz wrote:
> On 01.11.19 11:00, Max Reitz wrote:
>> Hi,
>>
>> This series builds on the previous RFC. The workaround is now applied
>> unconditionally of AIO mode and filesystem because we don’t know those
>> things for remote filesystems. Furthermore, bdrv_co_get_self_request()
>> has been moved to block/io.c.
>>
>> Applying the workaround unconditionally is fine from a performance
>> standpoint, because it should actually be dead code, thanks to patch 1
>> (the elephant in the room). As far as I know, there is no other block
>> driver but qcow2 in handle_alloc_space() that would submit zero writes
>> as part of normal I/O so it can occur concurrently to other write
>> requests. It still makes sense to take the workaround for file-posix
>> because we can’t really prevent that any other block driver will submit
>> zero writes as part of normal I/O in the future.
>>
>> Anyway, let’s get to the elephant.
>>
>> From input by XFS developers
>> (https://bugzilla.redhat.com/show_bug.cgi?id=1765547#c7) it seems clear
>> that c8bb23cbdbe causes fundamental performance problems on XFS with
>> aio=native that cannot be fixed. In other cases, c8bb23cbdbe improves
>> performance or we wouldn’t have it.
>>
>> In general, avoiding performance regressions is more important than
>> improving performance, unless the regressions are just a minor corner
>> case or insignificant when compared to the improvement. The XFS
>> regression is no minor corner case, and it isn’t insignificant. Laurent
>> Vivier has found performance to decrease by as much as 88 % (on ppc64le,
>> fio in a guest with 4k blocks, iodepth=8: 1662 kB/s from 13.9 MB/s).
>
> Ah, crap.
>
> I wanted to send this series as early today as possible to get as much
> feedback as possible, so I’ve only started doing benchmarks now.
>
> The obvious
>
> $ qemu-img bench -t none -n -w -S 65536 test.qcow2
>
> on XFS takes like 6 seconds on master, and like 50 to 80 seconds with
> c8bb23cbdbe reverted. So now on to guest tests...
Aha, that's very interesting) What about aio-native which should be slowed down?
Could it be tested like this?
>
> (Well, and the question is still where the performance regression on
> ppc64 comes from.)
>
> Max
>
>> Thus, I believe we should revert the commit for now (and most
>> importantly for 4.1.1). We can think about reintroducing it for 5.0,
>> but that would require more extensive benchmarks on a variety of
>> systems, and we must see how subclusters change the picture.
>>
>>
>> I would have liked to do benchmarks myself before making this decision,
>> but as far as I’m informed, patches for 4.1.1 are to be collected on
>> Monday, so we need to be quick.
>>
>>
>> git-backport-diff against the RFC:
>>
>> Key:
>> [----] : patches are identical
>> [####] : number of functional differences between upstream/downstream patch
>> [down] : patch is downstream-only
>> The flags [FC] indicate (F)unctional and (C)ontextual differences,
>> respectively
>>
>> 001/4:[down] 'Revert "qcow2: skip writing zero buffers to empty COW areas"'
>> 002/4:[----] [-C] 'block: Make wait/mark serialising requests public'
>> 003/4:[down] 'block: Add bdrv_co_get_self_request()'
>> 004/4:[0036] [FC] 'block/file-posix: Let post-EOF fallocate serialize'
>>
>>
>> Max Reitz (4):
>> Revert "qcow2: skip writing zero buffers to empty COW areas"
>> block: Make wait/mark serialising requests public
>> block: Add bdrv_co_get_self_request()
>> block/file-posix: Let post-EOF fallocate serialize
>>
>> qapi/block-core.json | 4 +-
>> block/qcow2.h | 6 ---
>> include/block/block_int.h | 4 ++
>> block/file-posix.c | 38 +++++++++++++++++
>> block/io.c | 42 +++++++++++++------
>> block/qcow2-cluster.c | 2 +-
>> block/qcow2.c | 86 --------------------------------------
>> block/trace-events | 1 -
>> tests/qemu-iotests/060 | 7 +---
>> tests/qemu-iotests/060.out | 5 +--
>> 10 files changed, 76 insertions(+), 119 deletions(-)
>>
>
>
--
Best regards,
Vladimir
- [PATCH for-4.2 1/4] Revert "qcow2: skip writing zero buffers to empty COW areas", (continued)
[PATCH for-4.2 2/4] block: Make wait/mark serialising requests public, Max Reitz, 2019/11/01
[PATCH for-4.2 3/4] block: Add bdrv_co_get_self_request(), Max Reitz, 2019/11/01
[PATCH for-4.2 4/4] block/file-posix: Let post-EOF fallocate serialize, Max Reitz, 2019/11/01
Re: [PATCH for-4.2 0/4] qcow2: Fix data corruption on XFS, Max Reitz, 2019/11/01
- Re: [PATCH for-4.2 0/4] qcow2: Fix data corruption on XFS,
Vladimir Sementsov-Ogievskiy <=
- Re: [PATCH for-4.2 0/4] qcow2: Fix data corruption on XFS, Max Reitz, 2019/11/01
- Re: [PATCH for-4.2 0/4] qcow2: Fix data corruption on XFS, Vladimir Sementsov-Ogievskiy, 2019/11/01
- Re: [PATCH for-4.2 0/4] qcow2: Fix data corruption on XFS, Max Reitz, 2019/11/01
- Re: [PATCH for-4.2 0/4] qcow2: Fix data corruption on XFS, Max Reitz, 2019/11/01
- Re: [PATCH for-4.2 0/4] qcow2: Fix data corruption on XFS, Vladimir Sementsov-Ogievskiy, 2019/11/01
- Re: [PATCH for-4.2 0/4] qcow2: Fix data corruption on XFS, Denis Lunev, 2019/11/01
- Re: [PATCH for-4.2 0/4] qcow2: Fix data corruption on XFS, Max Reitz, 2019/11/01
Re: [PATCH for-4.2 0/4] qcow2: Fix data corruption on XFS, Max Reitz, 2019/11/01
Re: [PATCH for-4.2 0/4] qcow2: Fix data corruption on XFS, Max Reitz, 2019/11/01