On 1/12/25 22:26, Phil Dennis-Jordan wrote:
> By changing the way the main QEMU event loop is invoked, I inadvertently
> changed the BQL status of exit notifiers: some of them implicitly
> assumed they would be called with the BQL held; the BQL is however
> not held during the exit(status) call in qemu_default_main().
>
> Instead of attempting to ensure we always call exit() from the BQL -
> including any transitive calls - this change adds a BQL lock guard to
> qemu_run_exit_notifiers, ensuring the BQL will always be held in the
> exit notifiers.
>
> Additionally, the BQL promise is now documented at the
> qemu_{add,remove}_exit_notifier() declarations.
>
> Fixes: f5ab12caba4f ("ui & main loop: Redesign of system-specific main
> thread event handling")
> Resolves: https://gitlab.com/qemu-project/qemu/-/issues/2771
> Reported-by: David Woodhouse <dwmw2@infradead.org>
> Signed-off-by: Phil Dennis-Jordan <phil@philjordan.eu>
I'm worried that this breaks for exit() calls that happen within a
BQL-taken area (for example, anything that uses error_fatal) due to...
void bql_lock_impl(const char *file, int line)
{
    QemuMutexLockFunc bql_lock_fn = qatomic_read(&bql_mutex_lock_func);
    g_assert(!bql_locked()); // <--- this
    bql_lock_fn(&bql, file, line);
    set_bql_locked(true);
}
BQL_LOCK_GUARD expands to a call to bql_auto_lock(), which in turn defends against recursive locking by checking bql_locked().
I think that should make it safe?
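To illustrate the difference between the asserting lock and the guard, here is a minimal standalone sketch. It is only a model: every model_* name is hypothetical, a plain pthread mutex and a thread-local flag stand in for the BQL and bql_locked(), and the "taken" boolean stands in for the guard's scope-based cleanup. It is not QEMU's actual implementation.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the BQL and its per-thread "locked" flag. */
static pthread_mutex_t model_bql = PTHREAD_MUTEX_INITIALIZER;
static _Thread_local bool model_bql_is_locked;

/* Unconditional lock: asserts on recursive use, as bql_lock_impl() does. */
static void model_bql_lock(void)
{
    assert(!model_bql_is_locked);
    pthread_mutex_lock(&model_bql);
    model_bql_is_locked = true;
}

static void model_bql_unlock(void)
{
    assert(model_bql_is_locked);
    model_bql_is_locked = false;
    pthread_mutex_unlock(&model_bql);
}

/*
 * Guard-style lock: skips locking if this thread already holds the lock.
 * Returns true only if this call actually took the lock.
 */
static bool model_bql_auto_lock(void)
{
    if (model_bql_is_locked) {
        return false;
    }
    model_bql_lock();
    return true;
}

/* Release the lock only if the matching auto_lock call took it. */
static void model_bql_auto_unlock(bool taken)
{
    if (taken) {
        model_bql_unlock();
    }
}

/*
 * Model of running exit notifiers under the guard.
 * Returns whether the lock was held while the notifiers would have run.
 */
static bool model_run_exit_notifiers(void)
{
    bool taken = model_bql_auto_lock();
    bool held = model_bql_is_locked; /* notifiers would run here */

    model_bql_auto_unlock(taken);
    return held;
}
```

In this model, model_run_exit_notifiers() sees the lock held whether or not the caller already holds it, and it leaves the caller's lock state unchanged on return. The error_fatal case corresponds to calling it after model_bql_lock(): the guard sees the flag set and does nothing.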
The only safety issue I can imagine is exit() being called in a thread that does not hold the BQL while a BQL-holding thread is waiting for that thread. I'm not sure such a pattern exists in QEMU, though, and it would have triggered the assertion in the original code (before my patch that caused the regression was applied).