qemu-commits
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

[Qemu-commits] [qemu/qemu] d12ade: mirror: fix dead-lock


From: GitHub
Subject: [Qemu-commits] [qemu/qemu] d12ade: mirror: fix dead-lock
Date: Mon, 03 Dec 2018 09:43:02 -0800

  Branch: refs/heads/master
  Home:   https://github.com/qemu/qemu
  Commit: d12ade5732c4d2d293735a39b4bd943da8d64db6
      
https://github.com/qemu/qemu/commit/d12ade5732c4d2d293735a39b4bd943da8d64db6
  Author: Vladimir Sementsov-Ogievskiy <address@hidden>
  Date:   2018-12-03 (Mon, 03 Dec 2018)

  Changed paths:
    M block/mirror.c

  Log Message:
  -----------
  mirror: fix dead-lock

Let start from the beginning:

Commit b9e413dd375 (in 2.9)
"block: explicitly acquire aiocontext in aio callbacks that need it"
added pairs of aio_context_acquire/release to mirror_write_complete and
mirror_read_complete, when they were aio callbacks for blk_aio_* calls.

Then, commit 2e1990b26e5 (in 3.0) "block/mirror: Convert to coroutines"
dropped these blk_aio_* calls, than mirror_write_complete and
mirror_read_complete are not callbacks more, and don't need additional
aiocontext acquiring. Furthermore, mirror_read_complete calls
blk_co_pwritev inside these pair of aio_context_acquire/release, which
leads to the following dead-lock with mirror:

 (gdb) info thr
   Id   Target Id         Frame
   3    Thread (LWP 145412) "qemu-system-x86" syscall ()
   2    Thread (LWP 145416) "qemu-system-x86" __lll_lock_wait ()
 * 1    Thread (LWP 145411) "qemu-system-x86" __lll_lock_wait ()

 (gdb) bt
 #0  __lll_lock_wait ()
 #1  _L_lock_812 ()
 #2  __GI___pthread_mutex_lock
 #3  qemu_mutex_lock_impl (mutex=0x561032dce420 <qemu_global_mutex>,
     file=0x5610327d8654 "util/main-loop.c", line=236) at
     util/qemu-thread-posix.c:66
 #4  qemu_mutex_lock_iothread_impl
 #5  os_host_main_loop_wait (timeout=480116000) at util/main-loop.c:236
 #6  main_loop_wait (nonblocking=0) at util/main-loop.c:497
 #7  main_loop () at vl.c:1892
 #8  main

Printing contents of qemu_global_mutex, I see that "__owner = 145416",
so, thr1 is main loop, and now it wants BQL, which is owned by thr2.

 (gdb) thr 2
 (gdb) bt
 #0  __lll_lock_wait ()
 #1  _L_lock_870 ()
 #2  __GI___pthread_mutex_lock
 #3  qemu_mutex_lock_impl (mutex=0x561034d25dc0, ...
 #4  aio_context_acquire (ctx=0x561034d25d60)
 #5  dma_blk_cb
 #6  dma_blk_io
 #7  dma_blk_read
 #8  ide_dma_cb
 #9  bmdma_cmd_writeb
 #10 bmdma_write
 #11 memory_region_write_accessor
 #12 access_with_adjusted_size
 #15 flatview_write
 #16 address_space_write
 #17 address_space_rw
 #18 kvm_handle_io
 #19 kvm_cpu_exec
 #20 qemu_kvm_cpu_thread_fn
 #21 qemu_thread_start
 #22 start_thread
 #23 clone ()

Printing mutex in fr 2, I see "__owner = 145411", so thr2 wants aio
context mutex, which is owned by thr1. Classic dead-lock.

Then, let's check that aio context is hold by mirror coroutine: just
print coroutine stack of first tracked request in mirror job target:

 (gdb) [...]
 (gdb) qemu coroutine 0x561035dd0860
 #0  qemu_coroutine_switch
 #1  qemu_coroutine_yield
 #2  qemu_co_mutex_lock_slowpath
 #3  qemu_co_mutex_lock
 #4  qcow2_co_pwritev
 #5  bdrv_driver_pwritev
 #6  bdrv_aligned_pwritev
 #7  bdrv_co_pwritev
 #8  blk_co_pwritev
 #9  mirror_read_complete () at block/mirror.c:232
 #10 mirror_co_read () at block/mirror.c:370
 #11 coroutine_trampoline
 #12 __start_context

Yes it is mirror_read_complete calling blk_co_pwritev after acquiring
aio context.

Signed-off-by: Vladimir Sementsov-Ogievskiy <address@hidden>
Reviewed-by: Max Reitz <address@hidden>
Signed-off-by: Kevin Wolf <address@hidden>


  Commit: db5e8210adbafe9c6383d8364345a8c545d38e62
      
https://github.com/qemu/qemu/commit/db5e8210adbafe9c6383d8364345a8c545d38e62
  Author: Vladimir Sementsov-Ogievskiy <address@hidden>
  Date:   2018-12-03 (Mon, 03 Dec 2018)

  Changed paths:
    A tests/qemu-iotests/235
    A tests/qemu-iotests/235.out
    M tests/qemu-iotests/group

  Log Message:
  -----------
  iotests: simple mirror test with kvm on 1G image

This test is broken without previous commit fixing dead-lock in mirror.

Signed-off-by: Vladimir Sementsov-Ogievskiy <address@hidden>
Signed-off-by: Max Reitz <address@hidden>
Acked-by: Vladimir Sementsov-Ogievskiy <address@hidden>
Signed-off-by: Kevin Wolf <address@hidden>


  Commit: 3af8c4be90e1e125e4269181c5de4f2f3fc16cb1
      
https://github.com/qemu/qemu/commit/3af8c4be90e1e125e4269181c5de4f2f3fc16cb1
  Author: Peter Maydell <address@hidden>
  Date:   2018-12-03 (Mon, 03 Dec 2018)

  Changed paths:
    M block/mirror.c
    A tests/qemu-iotests/235
    A tests/qemu-iotests/235.out
    M tests/qemu-iotests/group

  Log Message:
  -----------
  Merge remote-tracking branch 'remotes/kevin/tags/for-upstream' into staging

Block layer patches:

- mirror: Fix deadlock

# gpg: Signature made Mon 03 Dec 2018 16:57:33 GMT
# gpg:                using RSA key 7F09B272C88F2FD6
# gpg: Good signature from "Kevin Wolf <address@hidden>"
# Primary key fingerprint: DC3D EB15 9A9A F95D 3D74  56FE 7F09 B272 C88F 2FD6

* remotes/kevin/tags/for-upstream:
  iotests: simple mirror test with kvm on 1G image
  mirror: fix dead-lock

Signed-off-by: Peter Maydell <address@hidden>


Compare: https://github.com/qemu/qemu/compare/83ea23cd207a...3af8c4be90e1
      **NOTE:** This service has been marked for deprecation: 
https://developer.github.com/changes/2018-04-25-github-services-deprecation/

      Functionality will be removed from GitHub.com on January 31st, 2019.

reply via email to

[Prev in Thread] Current Thread [Next in Thread]