[Qemu-devel] qemu_futex_wait() lockups in ARM64: 2 possible issues
From: Rafael David Tinoco
Subject: [Qemu-devel] qemu_futex_wait() lockups in ARM64: 2 possible issues
Date: Tue, 10 Sep 2019 23:15:00 -0300
User-agent: Mozilla/5.0 (Windows NT 10.0; WOW64; rv:60.0) Gecko/20100101 Thunderbird/60.9.0
Paolo,
While debugging hangs on ARM64 during a simple:
qemu-img convert -f qcow2 -O qcow2 file.qcow2 output.qcow2
I might have found 2 issues which I'd like you to review, if possible.
ISSUE #1
========
I caught the following stack trace after a hang in qemu-img convert:
(gdb) bt
#0 syscall ()
#1 0x0000aaaaaabd41cc in qemu_futex_wait
#2 qemu_event_wait (ev=ev@entry=0xaaaaaac86ce8 <rcu_call_ready_event>)
#3 0x0000aaaaaabed05c in call_rcu_thread
#4 0x0000aaaaaabd34c8 in qemu_thread_start
#5 0x0000ffffbf25c880 in start_thread
#6 0x0000ffffbf1b6b9c in thread_start ()
(gdb) print rcu_call_ready_event
$4 = {value = 4294967295, initialized = true}
The value UINT_MAX (4294967295) seems WRONG for qemu_futex_wait():
- EV_BUSY is -1, and when it is passed as the second argument of
qemu_futex_wait(void *, unsigned) it is converted (two's complement) into
UINT_MAX, which is not what is expected (unless I missed something).
*** If that is the case, I'm unsure whether you, Paolo, would prefer
declaring (QemuEvent)->value as an integer or whether changing EV_BUSY
to "2" would be okay here ***
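The conversion described above can be reproduced in isolation. This is only a minimal sketch: DEMO_EV_BUSY, demo_as_unsigned() and demo_stored_value() are hypothetical stand-ins made up for illustration, not QEMU's definitions, and a 32-bit unsigned int is assumed. It shows only that -1 reads back as 4294967295 through an unsigned type; it does not settle whether that is actually a bug in the wait path.

```c
#include <limits.h>

/* Hypothetical stand-in for an EV_BUSY-like constant (assumption, not QEMU code). */
#define DEMO_EV_BUSY (-1)

/* Mimics a callee that takes its comparison value as `unsigned`,
 * like the qemu_futex_wait(void *, unsigned) signature quoted above. */
static unsigned demo_as_unsigned(unsigned val)
{
    return val;
}

/* Stores the constant in an unsigned field and reads it back,
 * which is the view gdb prints for such a field. */
static unsigned demo_stored_value(void)
{
    struct { unsigned value; } ev = { .value = DEMO_EV_BUSY };
    return ev.value;
}
```

Both helpers return UINT_MAX when given DEMO_EV_BUSY. Since the stored field and the converted argument go through the same conversion, they still compare equal to each other.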
Bug description:
https://bugs.launchpad.net/qemu/+bug/1805256/comments/15
========
ISSUE #2
========
I found this when debugging lockups while in futex() in a specific ARM64
server - https://bugs.launchpad.net/qemu/+bug/1805256 - which I'm still
investigating.
After fixing the issue above, I'm still getting stuck in:
qemu_event_wait() -> qemu_futex_wait()
***
As if qemu_event_set() had run before qemu_futex_wait() ever started running
***
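For reference, this is how the raw Linux futex wait behaves when the word has already changed before the waiter arrives. It is a minimal sketch that calls the futex syscall directly; demo_futex() and demo_wait_if_equal() are hypothetical names and are not QEMU's wrappers, so it says nothing about what QEMU's own code does with the result.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <linux/futex.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Thin wrapper around the raw futex syscall (hypothetical helper). */
static long demo_futex(uint32_t *uaddr, int op, uint32_t val)
{
    return syscall(SYS_futex, uaddr, op, val, NULL, NULL, 0);
}

/* Sleeps only if *uaddr still equals `expected` when the kernel checks it.
 * If the word was changed beforehand, the call returns -1 right away
 * with errno set to EAGAIN instead of blocking. */
static long demo_wait_if_equal(uint32_t *uaddr, uint32_t expected)
{
    return demo_futex(uaddr, FUTEX_WAIT, expected);
}
```

With a word that holds 0 and an expected value of 1, demo_wait_if_equal() returns immediately. A waiter can therefore block only if the word still held the expected value at the moment of the kernel's check.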
The other threads are blocked as well: thread #1 waits in ppoll() on a
pipe coming from the stuck thread, and thread #3 waits in sigwait():
(gdb) thread 1
...
(gdb) bt
#0 0x0000ffffbf1ad81c in __GI_ppoll
#1 0x0000aaaaaabcf73c in ppoll
#2 qemu_poll_ns
#3 0x0000aaaaaabd0764 in os_host_main_loop_wait
#4 main_loop_wait
...
(gdb) thread 2
...
(gdb) bt
#0 syscall ()
#1 0x0000aaaaaabd41cc in qemu_futex_wait
#2 qemu_event_wait (ev=ev@entry=0xaaaaaac86ce8 <rcu_call_ready_event>)
#3 0x0000aaaaaabed05c in call_rcu_thread
#4 0x0000aaaaaabd34c8 in qemu_thread_start
#5 0x0000ffffbf25c880 in start_thread
#6 0x0000ffffbf1b6b9c in thread_start ()
(gdb) thread 3
...
(gdb) bt
#0 0x0000ffffbf11aa20 in __GI___sigtimedwait
#1 0x0000ffffbf2671b4 in __sigwait
#2 0x0000aaaaaabd1ddc in sigwait_compat
#3 0x0000aaaaaabd34c8 in qemu_thread_start
#4 0x0000ffffbf25c880 in start_thread
#5 0x0000ffffbf1b6b9c in thread_start
QUESTION:
- Should qemu_event_set() check the return code from
qemu_futex_wake()->qemu_futex()->syscall() in order to know whether ANY
waiter was ever woken up? Maybe even loop until at least 1 is woken?
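As background for the question, the raw FUTEX_WAKE operation does report how many waiters it woke. This is a minimal sketch against the Linux syscall, with the hypothetical helper names demo_futex() and demo_wake_all(); it is not QEMU's qemu_futex_wake() and makes no claim about how that function treats the return value.

```c
#define _GNU_SOURCE
#include <limits.h>
#include <linux/futex.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Thin wrapper around the raw futex syscall (hypothetical helper). */
static long demo_futex(uint32_t *uaddr, int op, uint32_t val)
{
    return syscall(SYS_futex, uaddr, op, val, NULL, NULL, 0);
}

/* Wakes every thread currently sleeping on uaddr.
 * Returns the number of waiters woken, or -1 on error. */
static long demo_wake_all(uint32_t *uaddr)
{
    return demo_futex(uaddr, FUTEX_WAKE, INT_MAX);
}
```

Calling demo_wake_all() on a word nobody is sleeping on returns 0. A zero return therefore only means that no thread was asleep on the word at that instant, so a retry loop on it would spin in the ordinary case where the waiter has not gone to sleep yet.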
Thanks in advance,
Rafael D. Tinoco