[Qemu-commits] [qemu/qemu] 890cbc: nbd: Fix large trim/zero requests

qemu-commits
[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
[Qemu-commits] [qemu/qemu] 890cbc: nbd: Fix large trim/zero requests

From:	Peter Maydell
Subject:	[Qemu-commits] [qemu/qemu] 890cbc: nbd: Fix large trim/zero requests
Date:	Tue, 28 Jul 2020 14:00:33 -0700
  Branch: refs/heads/master
  Home:   https://github.com/qemu/qemu
  Commit: 890cbccb089db9e646cc1baea3be9dc060e3917b
      
https://github.com/qemu/qemu/commit/890cbccb089db9e646cc1baea3be9dc060e3917b
  Author: Eric Blake <eblake@redhat.com>
  Date:   2020-07-28 (Tue, 28 Jul 2020)

  Changed paths:
    M nbd/server.c

  Log Message:
  -----------
  nbd: Fix large trim/zero requests

Although qemu as NBD client limits requests to <2G, the NBD protocol
allows clients to send requests almost all the way up to 4G.  But
because our block layer is not yet 64-bit clean, we accidentally wrap
such requests into a negative size, and fail with EIO instead of
performing the intended operation.

The bug is visible in modern systems with something as simple as:

$ qemu-img create -f qcow2 /tmp/image.img 5G
$ sudo qemu-nbd --connect=/dev/nbd0 /tmp/image.img
$ sudo blkdiscard /dev/nbd0

or with user-space only:

$ truncate --size=3G file
$ qemu-nbd -f raw file
$ nbdsh -u nbd://localhost:10809 -c 'h.trim(3*1024*1024*1024,0)'

Although both blk_co_pdiscard and blk_pwrite_zeroes currently return 0
on success, this is also a good time to fix our code to a more robust
paradigm that treats all non-negative values as success.

Alas, our iotests do not currently make it easy to add external
dependencies on blkdiscard or nbdsh, so we have to rely on manual
testing for now.

This patch can be reverted when we later improve the overall block
layer to be 64-bit clean, but for now, a minimal fix was deemed less
risky prior to release.

CC: qemu-stable@nongnu.org
Fixes: 1f4d6d18ed
Fixes: 1c6c4bb7f0
Fixes: https://github.com/systemd/systemd/issues/16242
Signed-off-by: Eric Blake <eblake@redhat.com>
Message-Id: <20200722212231.535072-1-eblake@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
[eblake: rework success tests to use >=0]


  Commit: a2b333c01880f56056d50c238834d62e32001e54
      
https://github.com/qemu/qemu/commit/a2b333c01880f56056d50c238834d62e32001e54
  Author: Nir Soffer <nirsof@gmail.com>
  Date:   2020-07-28 (Tue, 28 Jul 2020)

  Changed paths:
    M block/nbd.c
    M qemu-io-cmds.c

  Log Message:
  -----------
  block: nbd: Fix convert qcow2 compressed to nbd

When converting to qcow2 compressed format, the last step is a special
zero length compressed write, ending in a call to bdrv_co_truncate(). This
call always fails for the nbd driver since it does not implement
bdrv_co_truncate().

For block devices, which have the same limits, the call succeeds since
the file driver implements bdrv_co_truncate(). If the caller asked to
truncate to the same or smaller size with exact=false, the truncate
succeeds. Implement the same logic for nbd.

Example failing without this change:

In one shell start qemu-nbd:

$ truncate -s 1g test.tar
$ qemu-nbd --socket=/tmp/nbd.sock --persistent --format=raw --offset 1536 
test.tar

In another shell convert an image to qcow2 compressed via NBD:

$ echo "disk data" > disk.raw
$ truncate -s 1g disk.raw
$ qemu-img convert -f raw -O qcow2 -c disk1.raw 
nbd+unix:///?socket=/tmp/nbd.sock; echo $?
1

qemu-img failed, but the conversion was successful:

$ qemu-img info nbd+unix:///?socket=/tmp/nbd.sock
image: nbd+unix://?socket=/tmp/nbd.sock
file format: qcow2
virtual size: 1 GiB (1073741824 bytes)
...

$ qemu-img check nbd+unix:///?socket=/tmp/nbd.sock
No errors were found on the image.
1/16384 = 0.01% allocated, 100.00% fragmented, 100.00% compressed clusters
Image end offset: 393216

$ qemu-img compare disk.raw nbd+unix:///?socket=/tmp/nbd.sock
Images are identical.

Fixes: https://bugzilla.redhat.com/1860627
Signed-off-by: Nir Soffer <nsoffer@redhat.com>
Message-Id: <20200727215846.395443-2-nsoffer@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
[eblake: typo fixes]
Signed-off-by: Eric Blake <eblake@redhat.com>


  Commit: b7719bcad2e92bab5aae3166fa9011f127e6ee49
      
https://github.com/qemu/qemu/commit/b7719bcad2e92bab5aae3166fa9011f127e6ee49
  Author: Nir Soffer <nirsof@gmail.com>
  Date:   2020-07-28 (Tue, 28 Jul 2020)

  Changed paths:
    M tests/qemu-iotests/264
    M tests/qemu-iotests/264.out
    M tests/qemu-iotests/iotests.py

  Log Message:
  -----------
  iotests: Make qemu_nbd_popen() a contextmanager

Instead of duplicating the code to wait until the server is ready and
remember to terminate the server and wait for it, make it possible to
use like this:

    with qemu_nbd_popen('-k', sock, image):
        # Access image via qemu-nbd socket...

Only test 264 used this helper, but I had to modify the output since it
did not consistently when starting and stopping qemu-nbd.

Signed-off-by: Nir Soffer <nsoffer@redhat.com>
Message-Id: <20200727215846.395443-3-nsoffer@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Signed-off-by: Eric Blake <eblake@redhat.com>


  Commit: 4b914b01cdb95fe5722cb70826db7f0a262e8d2a
      
https://github.com/qemu/qemu/commit/4b914b01cdb95fe5722cb70826db7f0a262e8d2a
  Author: Nir Soffer <nirsof@gmail.com>
  Date:   2020-07-28 (Tue, 28 Jul 2020)

  Changed paths:
    M tests/qemu-iotests/iotests.py

  Log Message:
  -----------
  iotests: Add more qemu_img helpers

Add 2 helpers for measuring and checking images:
- qemu_img_measure()
- qemu_img_check()

Both use --output-json and parse the returned json to make easy to use
in other tests. I'm going to use them in a new test, and I hope they
will be useful in may other tests.

Signed-off-by: Nir Soffer <nsoffer@redhat.com>
Message-Id: <20200727215846.395443-4-nsoffer@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Signed-off-by: Eric Blake <eblake@redhat.com>


  Commit: 03a970bb6ffe17a3ae77d54d4127330cf9a73587
      
https://github.com/qemu/qemu/commit/03a970bb6ffe17a3ae77d54d4127330cf9a73587
  Author: Nir Soffer <nirsof@gmail.com>
  Date:   2020-07-28 (Tue, 28 Jul 2020)

  Changed paths:
    A tests/qemu-iotests/302
    A tests/qemu-iotests/302.out
    M tests/qemu-iotests/group

  Log Message:
  -----------
  iotests: Test convert to qcow2 compressed to NBD

Add test for "qemu-img convert -O qcow2 -c" to NBD target. The tests    
create a OVA file and write compressed qcow2 disk content directly into
the OVA file via qemu-nbd.

Signed-off-by: Nir Soffer <nsoffer@redhat.com>
Message-Id: <20200727215846.395443-5-nsoffer@redhat.com>
Tested-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Eric Blake <eblake@redhat.com>


  Commit: fa35591b9cb9a7fd0af2d8c2d8848abba30d3c69
      
https://github.com/qemu/qemu/commit/fa35591b9cb9a7fd0af2d8c2d8848abba30d3c69
  Author: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
  Date:   2020-07-28 (Tue, 28 Jul 2020)

  Changed paths:
    M block/nbd.c
    M block/trace-events

  Log Message:
  -----------
  block/nbd: split nbd_establish_connection out of nbd_client_connect

We are going to implement non-blocking version of
nbd_establish_connection, which for a while will be used only for
nbd_reconnect_attempt, not for nbd_open, so we need to call it
separately.

Refactor nbd_reconnect_attempt in a way which makes next commit
simpler.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Message-Id: <20200727184751.15704-2-vsementsov@virtuozzo.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Eric Blake <eblake@redhat.com>


  Commit: dd1ec1a4afe190e030edfa052d95c9e6e065438c
      
https://github.com/qemu/qemu/commit/dd1ec1a4afe190e030edfa052d95c9e6e065438c
  Author: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
  Date:   2020-07-28 (Tue, 28 Jul 2020)

  Changed paths:
    M block/nbd.c

  Log Message:
  -----------
  block/nbd: allow drain during reconnect attempt

It should be safe to reenter qio_channel_yield() on io/channel read/write
path, so it's safe to reduce in_flight and allow attaching new aio
context. And no problem to allow drain itself: connection attempt is
not a guest request. Moreover, if remote server is down, we can hang
in negotiation, blocking drain section and provoking a dead lock.

How to reproduce the dead lock:

1. Create nbd-fault-injector.conf with the following contents:

[inject-error "mega1"]
event=data
io=readwrite
when=before

2. In one terminal run nbd-fault-injector in a loop, like this:

n=1; while true; do
    echo $n; ((n++));
    ./nbd-fault-injector.py 127.0.0.1:10000 nbd-fault-injector.conf;
done

3. In another terminal run qemu-io in a loop, like this:

n=1; while true; do
    echo $n; ((n++));
    ./qemu-io -c 'read 0 512' nbd://127.0.0.1:10000;
done

After some time, qemu-io will hang trying to drain, for example, like
this:

 #3 aio_poll (ctx=0x55f006bdd890, blocking=true) at
    util/aio-posix.c:600
 #4 bdrv_do_drained_begin (bs=0x55f006bea710, recursive=false,
    parent=0x0, ignore_bds_parents=false, poll=true) at block/io.c:427
 #5 bdrv_drained_begin (bs=0x55f006bea710) at block/io.c:433
 #6 blk_drain (blk=0x55f006befc80) at block/block-backend.c:1710
 #7 blk_unref (blk=0x55f006befc80) at block/block-backend.c:498
 #8 bdrv_open_inherit (filename=0x7fffba1563bc
    "nbd+tcp://127.0.0.1:10000", reference=0x0, options=0x55f006be86d0,
    flags=24578, parent=0x0, child_class=0x0, child_role=0,
    errp=0x7fffba154620) at block.c:3491
 #9 bdrv_open (filename=0x7fffba1563bc "nbd+tcp://127.0.0.1:10000",
    reference=0x0, options=0x0, flags=16386, errp=0x7fffba154620) at
    block.c:3513
 #10 blk_new_open (filename=0x7fffba1563bc "nbd+tcp://127.0.0.1:10000",
    reference=0x0, options=0x0, flags=16386, errp=0x7fffba154620) at
    block/block-backend.c:421

And connection_co stack like this:

 #0 qemu_coroutine_switch (from_=0x55f006bf2650, to_=0x7fe96e07d918,
    action=COROUTINE_YIELD) at util/coroutine-ucontext.c:302
 #1 qemu_coroutine_yield () at util/qemu-coroutine.c:193
 #2 qio_channel_yield (ioc=0x55f006bb3c20, condition=G_IO_IN) at
    io/channel.c:472
 #3 qio_channel_readv_all_eof (ioc=0x55f006bb3c20, iov=0x7fe96d729bf0,
    niov=1, errp=0x7fe96d729eb0) at io/channel.c:110
 #4 qio_channel_readv_all (ioc=0x55f006bb3c20, iov=0x7fe96d729bf0,
    niov=1, errp=0x7fe96d729eb0) at io/channel.c:143
 #5 qio_channel_read_all (ioc=0x55f006bb3c20, buf=0x7fe96d729d28
    "\300.\366\004\360U", buflen=8, errp=0x7fe96d729eb0) at
    io/channel.c:247
 #6 nbd_read (ioc=0x55f006bb3c20, buffer=0x7fe96d729d28, size=8,
    desc=0x55f004f69644 "initial magic", errp=0x7fe96d729eb0) at
    /work/src/qemu/master/include/block/nbd.h:365
 #7 nbd_read64 (ioc=0x55f006bb3c20, val=0x7fe96d729d28,
    desc=0x55f004f69644 "initial magic", errp=0x7fe96d729eb0) at
    /work/src/qemu/master/include/block/nbd.h:391
 #8 nbd_start_negotiate (aio_context=0x55f006bdd890,
    ioc=0x55f006bb3c20, tlscreds=0x0, hostname=0x0,
    outioc=0x55f006bf19f8, structured_reply=true,
    zeroes=0x7fe96d729dca, errp=0x7fe96d729eb0) at nbd/client.c:904
 #9 nbd_receive_negotiate (aio_context=0x55f006bdd890,
    ioc=0x55f006bb3c20, tlscreds=0x0, hostname=0x0,
    outioc=0x55f006bf19f8, info=0x55f006bf1a00, errp=0x7fe96d729eb0) at
    nbd/client.c:1032
 #10 nbd_client_connect (bs=0x55f006bea710, errp=0x7fe96d729eb0) at
    block/nbd.c:1460
 #11 nbd_reconnect_attempt (s=0x55f006bf19f0) at block/nbd.c:287
 #12 nbd_co_reconnect_loop (s=0x55f006bf19f0) at block/nbd.c:309
 #13 nbd_connection_entry (opaque=0x55f006bf19f0) at block/nbd.c:360
 #14 coroutine_trampoline (i0=113190480, i1=22000) at
    util/coroutine-ucontext.c:173

Note, that the hang may be
triggered by another bug, so the whole case is fixed only together with
commit "block/nbd: on shutdown terminate connection attempt".

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Message-Id: <20200727184751.15704-3-vsementsov@virtuozzo.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Eric Blake <eblake@redhat.com>


  Commit: fbeb3e63b34a1af4a968031de1c82e5edf20bf6c
      
https://github.com/qemu/qemu/commit/fbeb3e63b34a1af4a968031de1c82e5edf20bf6c
  Author: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
  Date:   2020-07-28 (Tue, 28 Jul 2020)

  Changed paths:
    M block/nbd.c

  Log Message:
  -----------
  block/nbd: on shutdown terminate connection attempt

On shutdown nbd driver may be in a connecting state. We should shutdown
it as well, otherwise we may hang in
nbd_teardown_connection, waiting for conneciton_co to finish in
BDRV_POLL_WHILE(bs, s->connection_co) loop if remote server is down.

How to reproduce the dead lock:

1. Create nbd-fault-injector.conf with the following contents:

[inject-error "mega1"]
event=data
io=readwrite
when=before

2. In one terminal run nbd-fault-injector in a loop, like this:

n=1; while true; do
    echo $n; ((n++));
    ./nbd-fault-injector.py 127.0.0.1:10000 nbd-fault-injector.conf;
done

3. In another terminal run qemu-io in a loop, like this:

n=1; while true; do
    echo $n; ((n++));
    ./qemu-io -c 'read 0 512' nbd://127.0.0.1:10000;
done

After some time, qemu-io will hang. Note, that this hang may be
triggered by another bug, so the whole case is fixed only together with
commit "block/nbd: allow drain during reconnect attempt".

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-Id: <20200727184751.15704-4-vsementsov@virtuozzo.com>
Signed-off-by: Eric Blake <eblake@redhat.com>


  Commit: 12c75e20a269ac917f4a76936d7142264e522233
      
https://github.com/qemu/qemu/commit/12c75e20a269ac917f4a76936d7142264e522233
  Author: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
  Date:   2020-07-28 (Tue, 28 Jul 2020)

  Changed paths:
    M block/nbd.c

  Log Message:
  -----------
  block/nbd: nbd_co_reconnect_loop(): don't sleep if drained

We try to go to wakeable sleep, so that, if drain begins it will break
the sleep. But what if nbd_client_co_drain_begin() already called and
s->drained is already true? We'll go to sleep, and drain will have to
wait for the whole timeout. Let's improve it.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-Id: <20200727184751.15704-5-vsementsov@virtuozzo.com>
Signed-off-by: Eric Blake <eblake@redhat.com>


  Commit: 5045be872db97be2aecc694bf43762e75e7e5395
      
https://github.com/qemu/qemu/commit/5045be872db97be2aecc694bf43762e75e7e5395
  Author: Peter Maydell <peter.maydell@linaro.org>
  Date:   2020-07-28 (Tue, 28 Jul 2020)

  Changed paths:
    M block/nbd.c
    M block/trace-events
    M nbd/server.c
    M qemu-io-cmds.c
    M tests/qemu-iotests/264
    M tests/qemu-iotests/264.out
    A tests/qemu-iotests/302
    A tests/qemu-iotests/302.out
    M tests/qemu-iotests/group
    M tests/qemu-iotests/iotests.py

  Log Message:
  -----------
  Merge remote-tracking branch 'remotes/ericb/tags/pull-nbd-2020-07-28' into 
staging

nbd patches for 2020-07-28

- fix NBD handling of trim/zero requests larger than 2G
- allow no-op resizes on NBD (in turn fixing qemu-img convert -c into NBD)
- several deadlock fixes when using NBD reconnect

# gpg: Signature made Tue 28 Jul 2020 15:59:42 BST
# gpg:                using RSA key 71C2CC22B1C4602927D2F3AAA7A16B4A2527436A
# gpg: Good signature from "Eric Blake <eblake@redhat.com>" [full]
# gpg:                 aka "Eric Blake (Free Software Programmer) 
<ebb9@byu.net>" [full]
# gpg:                 aka "[jpeg image of size 6874]" [full]
# Primary key fingerprint: 71C2 CC22 B1C4 6029 27D2  F3AA A7A1 6B4A 2527 436A

* remotes/ericb/tags/pull-nbd-2020-07-28:
  block/nbd: nbd_co_reconnect_loop(): don't sleep if drained
  block/nbd: on shutdown terminate connection attempt
  block/nbd: allow drain during reconnect attempt
  block/nbd: split nbd_establish_connection out of nbd_client_connect
  iotests: Test convert to qcow2 compressed to NBD
  iotests: Add more qemu_img helpers
  iotests: Make qemu_nbd_popen() a contextmanager
  block: nbd: Fix convert qcow2 compressed to nbd
  nbd: Fix large trim/zero requests

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>


  Commit: 5772f2b1fc5d00e7e04e01fa28e9081d6550440a
      
https://github.com/qemu/qemu/commit/5772f2b1fc5d00e7e04e01fa28e9081d6550440a
  Author: Peter Maydell <peter.maydell@linaro.org>
  Date:   2020-07-28 (Tue, 28 Jul 2020)

  Changed paths:
    M VERSION

  Log Message:
  -----------
  Update version for v5.1.0-rc2 release

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>


Compare: https://github.com/qemu/qemu/compare/b1753831b0e5...5772f2b1fc5d
[Prev in Thread]
Current Thread
[Next in Thread]
[Qemu-commits] [qemu/qemu] 890cbc: nbd: Fix large trim/zero requests, Peter Maydell <=
Prev by Date: [Qemu-commits] [qemu/qemu]
Next by Date: [Qemu-commits] [qemu/qemu] 1b7157: trace/simple: Allow enabling simple traces from co...
Previous by thread: [Qemu-commits] [qemu/qemu] 297641: test-char: abort on serial test error
Next by thread: [Qemu-commits] [qemu/qemu] 1b7157: trace/simple: Allow enabling simple traces from co...
Index(es):
- Date
- Thread