From: Ilya Leoshkevich
Subject: Re: [PATCH 3/3] accel/tcg: Implement cpu_exec_reset_hold() on user emulation
Date: Tue, 14 Jan 2025 21:52:03 +0100
User-agent: Evolution 3.52.4 (3.52.4-2.fc40)
On Fri, 2025-01-10 at 00:43 +0100, Ilya Leoshkevich wrote:
> On Thu, 2025-01-02 at 19:25 +0100, Philippe Mathieu-Daudé wrote:
> > Commit bb6cf6f0168 ("accel/tcg: Factor tcg_cpu_reset_hold()
> > out") wanted to restrict tlb_flush() to system emulation,
> > but inadvertently also restricted tcg_flush_jmp_cache(),
> > which was previously called on user emulation via:
> >
> > Realize -> Reset -> cpu_common_reset_hold()
> >
> > Since threads (vCPUs) use a common CPUJumpCache, when many
> > threads are created / joined, they eventually end up re-using
> > a CPUJumpCache entry, which was cleared when the first vCPU
> > was allocated (via Realize) but then stayed dirty, leading to:
>
> How are jump caches shared between qemu-user vCPUs?
> I found the following, but this looks private and zeroed out
> during initialization:
>
> bool tcg_exec_realizefn(CPUState *cpu, Error **errp)
> [...]
> cpu->tb_jmp_cache = g_new0(CPUJumpCache, 1);
>
> I was also wondering whether vCPUs themselves may be recycled, but
> it doesn't seem to be the case, since do_fork() -> cpu_copy() ->
> cpu_create() -> object_new() -> object_new_with_type() calls
> g_malloc().
>
> Btw, I tried to reproduce the original issue, but bumped into
> something seemingly unrelated. To make matters worse, debugging
> seems to be broken, so it may take some time before I can properly
> test this change.
>
> Thread 2 received signal SIGSEGV, Segmentation fault.
> [Switching to Thread 37607.37622]
> 0x000002aa00a6a64c in cs_option (ud=140251083477344, type=CS_OPT_SYNTAX, value=2) at capstone/cs.c:782
> 782 return arch_configs[handle->arch].arch_option(handle, type, value);
> (gdb) info threads
> Ignoring packet error, continuing...
With a small debugger fix [1] I finally managed to investigate and fix
the crash, which turned out not to be caused by QEMU [2]. With that fix
in place, the testsuite ran without further issues, so I am unable to
reproduce the original issue and therefore cannot verify this series.
[1] https://lore.kernel.org/qemu-devel/20250113134658.68376-1-iii@linux.ibm.com/
[2] https://github.com/capstone-rust/capstone-rs/pull/166
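
For anyone following the jump-cache question in the quoted text, here is a
minimal, self-contained sketch of the allocation pattern observed there: each
vCPU gets its own cache from a zeroing allocator, so a newly created vCPU
starts with a clean cache even if an earlier vCPU's cache was dirtied and
freed. All names (ToyCPU, ToyJumpCache, toy_cpu_create() and so on) are
hypothetical, calloc() stands in for g_new0(), and the cache size is made up.
This is not QEMU code, and it does not model the originally reported problem.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

#define TOY_JMP_CACHE_SIZE 16 /* hypothetical size, not QEMU's */

/* Hypothetical stand-in for CPUJumpCache: a table of TB pointers. */
typedef struct ToyJumpCache {
    const void *tb[TOY_JMP_CACHE_SIZE];
} ToyJumpCache;

/* Hypothetical stand-in for CPUState: owns a private jump cache. */
typedef struct ToyCPU {
    ToyJumpCache *jmp_cache;
} ToyCPU;

/*
 * Allocate a vCPU with its own zero-filled cache, in the spirit of
 * "cpu->tb_jmp_cache = g_new0(CPUJumpCache, 1)". Assumes that
 * all-bits-zero is a null pointer, as on mainstream platforms.
 */
ToyCPU *toy_cpu_create(void)
{
    ToyCPU *cpu = calloc(1, sizeof(*cpu));
    if (cpu == NULL) {
        return NULL;
    }
    cpu->jmp_cache = calloc(1, sizeof(*cpu->jmp_cache));
    if (cpu->jmp_cache == NULL) {
        free(cpu);
        return NULL;
    }
    return cpu;
}

void toy_cpu_destroy(ToyCPU *cpu)
{
    if (cpu != NULL) {
        free(cpu->jmp_cache);
        free(cpu);
    }
}

/* True if no entry of the vCPU's cache has been populated. */
bool toy_jmp_cache_is_clean(const ToyCPU *cpu)
{
    for (size_t i = 0; i < TOY_JMP_CACHE_SIZE; i++) {
        if (cpu->jmp_cache->tb[i] != NULL) {
            return false;
        }
    }
    return true;
}

/*
 * Dirty one vCPU's cache, destroy that vCPU, then create another one.
 * Returns true if the first cache was seen dirty and the second vCPU
 * nevertheless starts clean.
 */
bool toy_second_cpu_starts_clean(void)
{
    static const int fake_tb = 0; /* placeholder for a translation block */

    ToyCPU *first = toy_cpu_create();
    assert(first != NULL);
    first->jmp_cache->tb[3] = &fake_tb;
    bool first_was_dirty = !toy_jmp_cache_is_clean(first);
    toy_cpu_destroy(first);

    ToyCPU *second = toy_cpu_create();
    assert(second != NULL);
    bool second_is_clean = toy_jmp_cache_is_clean(second);
    toy_cpu_destroy(second);

    return first_was_dirty && second_is_clean;
}
```

Calling toy_second_cpu_starts_clean() from a small main() exercises the whole
sequence. Under this toy model, a stale entry could only be observed if a
cache were reused without going through the zeroing allocation again.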