Re: [PATCH] system/runstate: Fix regression, clarify BQL status of exit notifiers

On Wed 15. Jan 2025 at 20:05, Paolo Bonzini <pbonzini@redhat.com> wrote:

On 1/12/25 22:26, Phil Dennis-Jordan wrote:
> By changing the way the main QEMU event loop is invoked, I inadvertently
> changed the BQL status of exit notifiers: some of them implicitly
> assumed they would be called with the BQL held; the BQL is however
> not held during the exit(status) call in qemu_default_main().
>
> Instead of attempting to ensure we always call exit() from the BQL -
> including any transitive calls - this change adds a BQL lock guard to
> qemu_run_exit_notifiers, ensuring the BQL will always be held in the
> exit notifiers.
>
> Additionally, the BQL promise is now documented at the
> qemu_{add,remove}_exit_notifier() declarations.
>
> Fixes: f5ab12caba4f ("ui & main loop: Redesign of system-specific main
> thread event handling")
> Resolves: https://gitlab.com/qemu-project/qemu/-/issues/2771
> Reported-by: David Woodhouse <dwmw2@infradead.org>
> Signed-off-by: Phil Dennis-Jordan <phil@philjordan.eu>

I'm worried that this breaks for exit() calls that happen within a
BQL-taken area (for example, anything that uses error_fatal) due to...

void bql_lock_impl(const char *file, int line)
{
    QemuMutexLockFunc bql_lock_fn = qatomic_read(&bql_mutex_lock_func);

    g_assert(!bql_locked()); // <--- this
    bql_lock_fn(&bql, file, line);
    set_bql_locked(true);
}

BQL_LOCK_GUARD expands to a call to bql_auto_lock(), which in turn defends against recursive locking by checking bql_locked().

https://gitlab.com/qemu-project/qemu/-/blob/master/include/qemu/main-loop.h#L377

I think that should make it safe?
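
To illustrate the argument, here is a minimal standalone sketch of the guard pattern, not QEMU's actual code. All toy_* names are hypothetical stand-ins, and it assumes GCC/Clang extensions (__thread, the cleanup attribute) plus pthreads. The guard only takes the lock if the current thread does not already hold it, and only releases it if it took it, so the assertion in the plain lock function is never reached from the guard:

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the BQL and its per-thread "held" flag. */
static pthread_mutex_t toy_bql = PTHREAD_MUTEX_INITIALIZER;
static __thread bool toy_bql_held;

static bool toy_bql_locked(void)
{
    return toy_bql_held;
}

/* Plain lock: like bql_lock_impl(), recursive locking is treated as a bug. */
static void toy_bql_lock(void)
{
    assert(!toy_bql_locked());
    pthread_mutex_lock(&toy_bql);
    toy_bql_held = true;
}

static void toy_bql_unlock(void)
{
    assert(toy_bql_locked());
    toy_bql_held = false;
    pthread_mutex_unlock(&toy_bql);
}

/* Guard entry: returns true only if this call actually took the lock. */
static bool toy_bql_auto_lock(void)
{
    if (toy_bql_locked()) {
        return false;   /* already held by this thread: do nothing */
    }
    toy_bql_lock();
    return true;
}

/* Guard exit: runs at scope end and unlocks only if the guard locked. */
static void toy_bql_auto_unlock(bool *took_lock)
{
    if (*took_lock) {
        toy_bql_unlock();
    }
}

#define TOY_BQL_LOCK_GUARD() \
    bool toy_guard_ __attribute__((cleanup(toy_bql_auto_unlock), unused)) = \
        toy_bql_auto_lock()

/* Stand-in for qemu_run_exit_notifiers(): reports whether the lock is
 * held while the (imaginary) notifiers would be running. */
static bool toy_run_exit_notifiers(void)
{
    TOY_BQL_LOCK_GUARD();
    return toy_bql_locked();
}
```

Calling toy_run_exit_notifiers() with or without the toy lock already held returns true in both cases, and the caller's lock state is unchanged afterwards. That is the behaviour the patch relies on for exit() calls made from inside a BQL-held region.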

The only safety issue I can imagine is exit() being called in a thread that does not hold the BQL while a BQL-holding thread is waiting for that thread. I'm not sure such a pattern exists in QEMU, though, and it would have triggered the assertion in the original code (before my patch that caused the regression was applied).

From:	Phil Dennis-Jordan
Subject:	Re: [PATCH] system/runstate: Fix regression, clarify BQL status of exit notifiers
Date:	Wed, 15 Jan 2025 20:17:58 +0100