qemu-ppc
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH v6 01/33] main-loop.h: introduce qemu_in_main_thread()


From: Paolo Bonzini
Subject: Re: [PATCH v6 01/33] main-loop.h: introduce qemu_in_main_thread()
Date: Mon, 31 Jan 2022 16:49:33 +0100
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:91.0) Gecko/20100101 Thunderbird/91.5.0

On 1/31/22 15:33, Kevin Wolf wrote:
I feel "use this function if your code is going to be used by unit
tests, but use qemu_mutex_iothread_locked() otherwise" is a very strange
interface. I'm relatively sure that qemu_mutex_iothread_locked() isn't
primarily used to make unit tests crash.

qemu_mutex_iothread_locked() should never be used in the block layer, because programs that use the block layer might not call an iothread lock, and indeed only the emulator have an iothread lock.

Making it always true would be okay when those programs were single-threaded, but really they all had worker threads so it was changed to false by commit 5f50be9b58 ("async: the main AioContext is only "current" if under the BQL", 2021-06-18).

Documentation and the definition of the interface of a function
shouldn't be about the caller, but about the semantics of the function
itself. So, which question does qemu_mutex_iothread_locked() answer, and
which question does qemu_in_main_thread() answer?

qemu_mutex_iothread_locked() -> Does the program have exclusive access to the emulator's globals.

qemu_in_main_thread() -> Does the program have access to the block layer's globals. In emulators this is governed by the iothread lock, elsewhere they are accessible only from the home thread of the initial AioContext.

So, in emulators it is the same question, but in the block layer one of them is actually meaningless.

Really the "bug" is that qemu_mutex_iothread_locked() should really not be used outside emulators. There are just two uses, but it's hard to remove them.

So why are two functions needed?  Two reasons:

- stubs/iothread-lock.c cannot define qemu_mutex_iothread_locked() as "return qemu_get_current_aio_context() == qemu_get_aio_context();" because it would cause infinite recursion with qemu_get_current_aio_context()

- even if it could, frankly qemu_mutex_iothread_locked() is a horrible name, and in the context of the block layer it conflicts especially badly with the "iothread" concept.

Maybe we should embrace the BQL name and rename the functions that refer to the "iothread lock". But then I don't want to have code in the block layer that refers to a "big lock".

What are the use cases for the existing qemu_mutex_iothread_locked()
stub where we rely on false being returned?

I don't think we ever rely on stub being false. There are only 2
places where I see !qemu_mutex_iothread_locked(), and are in
helper_regs.c and cpus.c

We rely on it indirectly, here:

    AioContext *qemu_get_current_aio_context(void)
    {
        if (my_aiocontext) {
            return my_aiocontext;
        }
        if (qemu_mutex_iothread_locked()) {
            /* Possibly in a vCPU thread.  */
            return qemu_get_aio_context();
        }
        return NULL;
    }

which affects the behavior of aio_co_enter. Before that patch, worker threads computed qemu_get_aio_context(), while afterwards they compute NULL.

This is not just for unit tests but also for qemu-storage-daemon and other block layer programs.

However, commit 5f50be9b5810293141bb53cfd0cb46e765367d56 changed the
stub to return false for a specific reason.

I must admit I don't really understand the reasoning there:

     With this change, the stub qemu_mutex_iothread_locked() must be changed
     from true to false.  The previous value of true was needed because the
     main thread did not have an AioContext in the thread-local variable,
     but now it does have one.

This explains why it _can_ be changed to false for this caller, but not
why it _must_ be changed.

See above: because it returns the wrong value for all threads except one, and there are better ways to do a meaningful check in that one thread: using qemu_get_current_aio_context(), which is what aio_co_enter did at the time and qemu_in_main_thread() does with Emanuele's change.

So is the problem with the unit tests really just that they never call
qemu_init_main_loop() and therefore never set the AioContext for the
main thread?

No, most of them do (and if some are missing it we can add it).

Paolo



reply via email to

[Prev in Thread] Current Thread [Next in Thread]