
[Qemu-commits] [qemu/qemu] 6c9536: iothread: fix iothread hang when stop


From: Peter Maydell
Subject: [Qemu-commits] [qemu/qemu] 6c9536: iothread: fix iothread hang when stop too soon
Date: Tue, 12 Feb 2019 12:26:36 +0000 (UTC)

  Branch: refs/heads/master
  Home:   https://github.com/qemu/qemu
  Commit: 6c95363d97087e665748cf4d42fafdeb6f714e53
      
https://github.com/qemu/qemu/commit/6c95363d97087e665748cf4d42fafdeb6f714e53
  Author: Peter Xu <address@hidden>
  Date:   2019-02-12 (Tue, 12 Feb 2019)

  Changed paths:
    M iothread.c

  Log Message:
  -----------
  iothread: fix iothread hang when stop too soon

Lukas reported a hard-to-reproduce hang on s390 in which QEMU might
block in pthread_join() on the QMP monitor iothread before quitting:

  Thread 1
  #0  0x000003ffad10932c in pthread_join
  #1  0x0000000109e95750 in qemu_thread_join
      at /home/thuth/devel/qemu/util/qemu-thread-posix.c:570
  #2  0x0000000109c95a1c in iothread_stop
  #3  0x0000000109bb0874 in monitor_cleanup
  #4  0x0000000109b55042 in main

While the iothread is still in the main loop:

  Thread 4
  #0  0x000003ffad0010e4 in ??
  #1  0x000003ffad553958 in g_main_context_iterate.isra.19
  #2  0x000003ffad553d90 in g_main_loop_run
  #3  0x0000000109c9585a in iothread_run
      at /home/thuth/devel/qemu/iothread.c:74
  #4  0x0000000109e94752 in qemu_thread_start
      at /home/thuth/devel/qemu/util/qemu-thread-posix.c:502
  #5  0x000003ffad10825a in start_thread
  #6  0x000003ffad00dcf2 in thread_start

IMHO it's because there's a race between the main thread and the
iothread when stopping the thread, in the following sequence:

    main thread                       iothread
    ===========                       ==============
                                      aio_poll()
    iothread_get_g_main_context
      set iothread->worker_context
    iothread_stop
      schedule iothread_stop_bh
                                        execute iothread_stop_bh [1]
                                          set iothread->running=false
                                          (main_loop==NULL, so quitting
                                           the main loop is skipped.
                                           Note: main_loop is NULL, but
                                           worker_context is not!)
                                      atomic_read(&iothread->worker_context) [2]
                                        create main_loop object
                                        g_main_loop_run() [3]
    pthread_join() [4]

We can see that when iothread_stop_bh() executes at [1], main_loop may
still be NULL, because it is only created after the first check of
worker_context at [2].  The iothread then hangs in the main loop [3],
which in turn blocks the main thread in pthread_join() [4].

The simple solution is to check the "running" variable again before
checking worker_context.
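
The sequence above can be replayed in a small single-threaded model.
This is only a sketch under stated assumptions, not the code in
iothread.c: FakeIOThread, fake_aio_poll() and fake_iothread_run() are
hypothetical stand-ins, and the "hang" is modelled as a return value
instead of a real g_main_loop_run() call.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for IOThread: only the fields the race touches. */
typedef struct {
    bool running;
    void *worker_context;   /* set by the main thread */
    void *main_loop;        /* created lazily by the iothread */
} FakeIOThread;

/* Models step [1]: the stop BH runs inside aio_poll().  main_loop is
 * still NULL, so there is no loop to quit; only "running" is cleared. */
void fake_aio_poll(FakeIOThread *t)
{
    assert(t != NULL);
    t->running = false;
}

/* Models the iothread body.  Returns true if the thread would enter a
 * main loop that nobody will ever ask to quit (steps [2] and [3]). */
bool fake_iothread_run(FakeIOThread *t, bool recheck_running)
{
    while (t->running) {
        fake_aio_poll(t);

        /* The proposed fix: look at "running" again before
         * looking at worker_context. */
        bool may_enter = recheck_running ? t->running : true;

        if (may_enter && t->worker_context != NULL) {
            t->main_loop = t;   /* "create main_loop object" */
            return true;        /* would block in g_main_loop_run() */
        }
    }
    return false;               /* thread exits; pthread_join() returns */
}
```

With running set and worker_context non-NULL, the model reports a hang
when recheck_running is false and a clean exit when it is true.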

CC: Thomas Huth <address@hidden>
CC: Dr. David Alan Gilbert <address@hidden>
CC: Stefan Hajnoczi <address@hidden>
CC: Lukáš Doktor <address@hidden>
CC: Markus Armbruster <address@hidden>
CC: Eric Blake <address@hidden>
CC: Paolo Bonzini <address@hidden>
Reported-by: Lukáš Doktor <address@hidden>
Signed-off-by: Peter Xu <address@hidden>
Tested-by: Thomas Huth <address@hidden>
Message-id: address@hidden
Signed-off-by: Stefan Hajnoczi <address@hidden>


  Commit: 6eaa20c8362cf8e12e023e2e05861f84cec6a438
      
https://github.com/qemu/qemu/commit/6eaa20c8362cf8e12e023e2e05861f84cec6a438
  Author: Vladimir Sementsov-Ogievskiy <address@hidden>
  Date:   2019-02-12 (Tue, 12 Feb 2019)

  Changed paths:
    M scripts/qemugdb/coroutine.py

  Log Message:
  -----------
  qemugdb/coroutine: fix arch_prctl has unknown return type

The qemu coroutine command results in the following error output:

Python Exception <class 'gdb.error'> 'arch_prctl' has unknown return
type; cast the call to its declared return type: Error occurred in
Python command: 'arch_prctl' has unknown return type; cast the call to
its declared return type

Fix it by giving gdb what it wants: a cast to arch_prctl's return type.

Information on the topic:
   https://sourceware.org/gdb/onlinedocs/gdb/Calling.html

Signed-off-by: Vladimir Sementsov-Ogievskiy <address@hidden>
Message-id: address@hidden
Signed-off-by: Stefan Hajnoczi <address@hidden>


  Commit: 9a6719d572e99a4e79f589d0b73f7475b86f982d
      
https://github.com/qemu/qemu/commit/9a6719d572e99a4e79f589d0b73f7475b86f982d
  Author: Stefano Garzarella <address@hidden>
  Date:   2019-02-12 (Tue, 12 Feb 2019)

  Changed paths:
    M hw/block/virtio-blk.c

  Log Message:
  -----------
  virtio-blk: cleanup using VirtIOBlock *s and VirtIODevice *vdev

In several places we still use req->dev or VIRTIO_DEVICE(req->dev)
even though the s and vdev pointers are already defined:
    VirtIOBlock *s = req->dev;
    VirtIODevice *vdev = VIRTIO_DEVICE(s);
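
A minimal before/after sketch of this kind of cleanup, using mock
types. The struct fields (status, conf), the simplified VIRTIO_DEVICE
macro and both function names are assumptions made for illustration;
the real QEMU definitions differ.

```c
#include <assert.h>
#include <stddef.h>

/* Mock types; the field names are hypothetical. */
typedef struct { int status; } VirtIODevice;
typedef struct { VirtIODevice parent_obj; int conf; } VirtIOBlock;
typedef struct { VirtIOBlock *dev; } VirtIOBlockReq;

/* Simplified mock of the cast macro (the real one is a checked cast). */
#define VIRTIO_DEVICE(obj) (&(obj)->parent_obj)

/* Before: s and vdev are defined, yet req->dev is spelled out again. */
int req_sum_before(VirtIOBlockReq *req)
{
    VirtIOBlock *s = req->dev;
    VirtIODevice *vdev = VIRTIO_DEVICE(s);
    (void)s;
    (void)vdev;
    return VIRTIO_DEVICE(req->dev)->status + req->dev->conf;
}

/* After: the existing locals are used; behaviour is unchanged. */
int req_sum_after(VirtIOBlockReq *req)
{
    VirtIOBlock *s = req->dev;
    VirtIODevice *vdev = VIRTIO_DEVICE(s);
    return vdev->status + s->conf;
}
```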

Signed-off-by: Stefano Garzarella <address@hidden>
Reviewed-by: Liam Merwick <address@hidden>
Message-id: address@hidden
Signed-off-by: Stefan Hajnoczi <address@hidden>


  Commit: 0b5e750bea635b167eb03d86c3d9a09bbd43bc06
      
https://github.com/qemu/qemu/commit/0b5e750bea635b167eb03d86c3d9a09bbd43bc06
  Author: Peter Maydell <address@hidden>
  Date:   2019-02-12 (Tue, 12 Feb 2019)

  Changed paths:
    M hw/block/virtio-blk.c
    M iothread.c
    M scripts/qemugdb/coroutine.py

  Log Message:
  -----------
  Merge remote-tracking branch 'remotes/stefanha/tags/block-pull-request' into staging

Pull request

# gpg: Signature made Tue 12 Feb 2019 03:58:58 GMT
# gpg:                using RSA key 9CA4ABB381AB73C8
# gpg: Good signature from "Stefan Hajnoczi <address@hidden>" [full]
# gpg:                 aka "Stefan Hajnoczi <address@hidden>" [full]
# Primary key fingerprint: 8695 A8BF D3F9 7CDA AC35  775A 9CA4 ABB3 81AB 73C8

* remotes/stefanha/tags/block-pull-request:
  virtio-blk: cleanup using VirtIOBlock *s and VirtIODevice *vdev
  qemugdb/coroutine: fix arch_prctl has unknown return type
  iothread: fix iothread hang when stop too soon

Signed-off-by: Peter Maydell <address@hidden>


Compare: https://github.com/qemu/qemu/compare/d85e60e99380...0b5e750bea63


