Re: intermittent hang, s390x host, bios-tables-test test, TPM
From: Daniel P. Berrangé
Subject: Re: intermittent hang, s390x host, bios-tables-test test, TPM
Date: Tue, 10 Jan 2023 19:44:11 +0000
User-agent: Mutt/2.2.9 (2022-11-12)

On Fri, Jan 06, 2023 at 10:16:36AM -0500, Stefan Berger wrote:
>
>
> On 1/6/23 07:10, Peter Maydell wrote:
> > I'm seeing an intermittent hang on the s390 CI runner in the
> > bios-tables-test test. It looks like we've deadlocked because:
> >
> > * the TPM device is waiting for data on its socket that never arrives,
> > and it's holding the iothread lock
> > * QEMU is therefore not making forward progress;
> > in particular it is unable to handle qtest queries/responses
> > * the test binary thread 1 is waiting to get a response to its
> > qtest command, which is not going to arrive
> > * test binary thread 3 (tpm_emu_ctrl_thread) has hit an
> > assertion and is trying to kill QEMU via qtest_kill_qemu()
> > * qtest_kill_qemu() is only a "SIGTERM and wait", so will wait
> > forever, because QEMU won't respond to the SIGTERM while it's
> > blocked waiting for the TPM device to release the iothread lock
> > * because the ctrl-thread is waiting for QEMU to exit, it's never
> > going to send the data that would unblock the TPM device emulation
> >
> [...]
>
> >
> > Thread 3 (Thread 0x3ff8dafe900 (LWP 2661316)):
> > #0 0x000003ff8e9c6002 in __GI___wait4 (pid=<optimized out>,
> > stat_loc=stat_loc@entry=0x2aa0b42c9bc, options=<optimized out>,
> > usage=usage@entry=0x0) at ../sysdeps/unix/sysv/linux/wait4.c:27
> > #1 0x000003ff8e9c5f72 in __GI___waitpid (pid=<optimized out>,
> > stat_loc=stat_loc@entry=0x2aa0b42c9bc, options=options@entry=0) at
> > waitpid.c:38
> > #2 0x000002aa0952a516 in qtest_wait_qemu (s=0x2aa0b42c9b0) at
> > ../tests/qtest/libqtest.c:206
> > #3 0x000002aa0952a58a in qtest_kill_qemu (s=0x2aa0b42c9b0) at
> > ../tests/qtest/libqtest.c:229
> > #4 0x000003ff8f0c288e in g_hook_list_invoke () from
> > /lib/s390x-linux-gnu/libglib-2.0.so.0
> > #5 <signal handler called>
> > #6 __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:50
> > #7 0x000003ff8e9240a2 in __GI_abort () at abort.c:79
> > #8 0x000003ff8f0feda8 in g_assertion_message () from
> > /lib/s390x-linux-gnu/libglib-2.0.so.0
> > #9 0x000003ff8f0fedfe in g_assertion_message_expr () from
> > /lib/s390x-linux-gnu/libglib-2.0.so.0
> > #10 0x000002aa09522904 in tpm_emu_ctrl_thread (data=0x3fff5ffa160) at
> > ../tests/qtest/tpm-emu.c:189
>
> This here seems to be the root cause. An unknown control channel command was
> received from the TPM emulator backend by the control channel thread and we
> end up in g_assert_not_reached().
>
> https://github.com/qemu/qemu/blob/master/tests/qtest/tpm-emu.c#L189
>
>
>
>     ret = qio_channel_read(ioc, (char *)&cmd, sizeof(cmd), NULL);
>     if (ret <= 0) {
>         break;
>     }
>
>     cmd = be32_to_cpu(cmd);
>     switch (cmd) {
>     [...]
>     default:
>         g_debug("unimplemented %u", cmd);
>         g_assert_not_reached(); <------------------
>     }
>
> I will run this test case in an endless loop on an x86_64 host and see what
> we get there ...

The QEMU stack trace shows:

  #7 0x000002aa1224a2ca in tpm_emulator_cancel_cmd (tb=<optimized out>)
     at ../backends/tpm/tpm_emulator.c:500
  #8 0x000002aa121e68c4 in tpm_tis_mmio_write (opaque=0x2aa1529ec20,
     addr=24, val=64, size=<optimized out>)
     at ../hw/tpm/tpm_tis_common.c:663

IOW, we're getting CMD_CANCEL_TPM_CMD, which is indeed not handled
by any 'case:' in the switch in qtest/tpm-emu.c.

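
To illustrate the failure mode, here is a rough, standalone sketch of a
control-channel dispatcher that acknowledges a cancel request and reports
unknown commands as errors rather than asserting. This is not the code in
tests/qtest/tpm-emu.c and not a proposed patch: every demo_* name and both
command numbers are made up for illustration, and the real command values
come from QEMU's TPM ioctl definitions.

```c
#include <stdint.h>

/* Hypothetical command numbers, for illustration only. */
enum demo_cmd {
    DEMO_CMD_INIT = 1,
    DEMO_CMD_CANCEL = 2,
};

/* What the control thread would do after decoding a command. */
enum demo_action {
    DEMO_ACTION_REPLY_OK,     /* known command: send a success reply */
    DEMO_ACTION_REPLY_ERROR,  /* unknown command: send an error reply */
};

/* Decode a 4-byte big-endian command word as read off the socket. */
static uint32_t demo_load_be32(const unsigned char *buf)
{
    return ((uint32_t)buf[0] << 24) | ((uint32_t)buf[1] << 16) |
           ((uint32_t)buf[2] << 8) | (uint32_t)buf[3];
}

/*
 * Dispatch one command. The default branch returns an error action
 * instead of aborting, so the peer still gets an answer and cannot be
 * left blocked on the socket by a command the emulator does not know.
 */
static enum demo_action demo_dispatch(uint32_t cmd)
{
    switch (cmd) {
    case DEMO_CMD_INIT:
    case DEMO_CMD_CANCEL:
        return DEMO_ACTION_REPLY_OK;
    default:
        return DEMO_ACTION_REPLY_ERROR;
    }
}
```

Whether the test emulator should quietly acknowledge a cancel or fail the
test in some other way is a separate question; the sketch only shows that
the read loop can keep answering instead of asserting.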
With regards,
Daniel
--
|: https://berrange.com -o- https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o- https://fstop138.berrange.com :|
|: https://entangle-photo.org -o- https://www.instagram.com/dberrange :|