From: Juan Quintela
Subject: Re: [RFC PATCH v2 1/6] migration/multifd: Remove channels_ready semaphore
Date: Thu, 19 Oct 2023 11:06:06 +0200
User-agent: Gnus/5.13 (Gnus v5.13) Emacs/28.3 (gnu/linux)

Fabiano Rosas <farosas@suse.de> wrote:
> The channels_ready semaphore is a global variable not linked to any
> single multifd channel. Waiting on it only means that "some" channel
> has become ready to send data. Since we need to address the channels
> by index (multifd_send_state->params[i]), that information adds
> nothing of value.

NAK.

I disagree here O:-)

The reason why that semaphore exists is multifd_send_pages().

Simplifying, what the function does is:

sem_wait(channels_ready);

for_each_channel()
   look if it is free()

But with the semaphore, we guarantee that when we reach the loop there
is a channel ready, so we know we don't busy-wait searching for a
channel that is free.
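
To make that concrete, here is a minimal sketch of the pattern (not
the actual multifd_send_pages(); pick_free_channel() is a made-up
helper name for illustration):

/*
 * Illustrative sketch only.  channels_ready counts idle channels:
 * each send thread posts it when it becomes free, and we consume one
 * unit here before searching, so the scan below is guaranteed to
 * find a free channel rather than spinning indefinitely.
 */
static int pick_free_channel(void)
{
    int i;

    /* Blocks until at least one channel has announced it is idle. */
    qemu_sem_wait(&multifd_send_state->channels_ready);

    for (i = 0; ; i = (i + 1) % migrate_multifd_channels()) {
        MultiFDSendParams *p = &multifd_send_state->params[i];

        qemu_mutex_lock(&p->mutex);
        if (!p->pending_job) {   /* free: the real check is under the mutex */
            p->pending_job++;
            qemu_mutex_unlock(&p->mutex);
            return i;
        }
        qemu_mutex_unlock(&p->mutex);
    }
}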

Notice that I fully agree that the sem is not needed for locking.
Locking is done with the mutex.  It is just used to make sure that we
don't busy-wait in that loop.

And we use a sem because it is the easiest way to keep count of how
many channels are ready (even though we only care whether there is at
least one when we arrive at that code).
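
For reference, the other half of the accounting is the send thread's
main loop (an abridged sketch based on the loop the patch modifies
below): each iteration announces "this channel is idle" before going
to sleep on its own sem:

while (true) {
    /* +1 idle channel; this is what multifd_send_pages() consumes */
    qemu_sem_post(&multifd_send_state->channels_ready);
    /* sleep until multifd_send_pages() assigns us work */
    qemu_sem_wait(&p->sem);
    /* ... send the pages assigned to this channel ... */
}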

We lost track of that counter at one point, and we fixed that here:

commit d2026ee117147893f8d80f060cede6d872ecbd7f
Author: Juan Quintela <quintela@redhat.com>
Date:   Wed Apr 26 12:20:36 2023 +0200

    multifd: Fix the number of channels ready

    We don't wait in the sem when we are doing a sync_main.  Make it

And we were addressing the problem that some users were finding: we
were busy-waiting in that loop.

> The channel being addressed is not necessarily the
> one that just released the semaphore.

We only care that there is at least one channel free.  We are going to
search for the next free one anyway.

Does this explanation make sense?

Later, Juan.

> The only usage of this semaphore that makes sense is to wait for it in
> a loop that iterates for the number of channels. That could mean: all
> channels have been setup and are operational OR all channels have
> finished their work and are idle.
>
> Currently all code that waits on channels_ready is redundant. There is
> always a subsequent lock or semaphore that does the actual data
> protection/synchronization.
>
> - at multifd_send_pages: Waiting on channels_ready doesn't mean the
>   'next_channel' is ready, it could be any other channel. So there are
>   already cases where this code runs as if no semaphore was there.
>
>   Waiting outside of the loop is also incorrect because if the current
>   channel already has a pending_job, then it will loop into the next
>   one without waiting on the semaphore, and the count will be greater
>   than zero at the end of the execution.
>
>   Checking that "any" channel is ready as a proxy for all channels
>   being ready would work, but it's not what the code is doing and not
>   really needed because the channel lock and 'sem' would be enough.
>
> - at multifd_send_sync: This usage is correct, but it is made
>   redundant by the wait on sem_sync. What this piece of code is doing
>   is making sure all channels have sent the SYNC packet and became
>   idle afterwards.
>
> Signed-off-by: Fabiano Rosas <farosas@suse.de>
> ---
>  migration/multifd.c | 10 ----------
>  1 file changed, 10 deletions(-)
>
> diff --git a/migration/multifd.c b/migration/multifd.c
> index 0f6b203877..e26f5f246d 100644
> --- a/migration/multifd.c
> +++ b/migration/multifd.c
> @@ -362,8 +362,6 @@ struct {
>      MultiFDPages_t *pages;
>      /* global number of generated multifd packets */
>      uint64_t packet_num;
> -    /* send channels ready */
> -    QemuSemaphore channels_ready;
>      /*
>       * Have we already run terminate threads.  There is a race when it
>       * happens that we got one error while we are exiting.
> @@ -403,7 +401,6 @@ static int multifd_send_pages(QEMUFile *f)
>          return -1;
>      }
>  
> -    qemu_sem_wait(&multifd_send_state->channels_ready);
>      /*
>       * next_channel can remain from a previous migration that was
>       * using more channels, so ensure it doesn't overflow if the
> @@ -554,7 +551,6 @@ void multifd_save_cleanup(void)
>              error_free(local_err);
>          }
>      }
> -    qemu_sem_destroy(&multifd_send_state->channels_ready);
>      g_free(multifd_send_state->params);
>      multifd_send_state->params = NULL;
>      multifd_pages_clear(multifd_send_state->pages);
> @@ -630,7 +626,6 @@ int multifd_send_sync_main(QEMUFile *f)
>      for (i = 0; i < migrate_multifd_channels(); i++) {
>          MultiFDSendParams *p = &multifd_send_state->params[i];
>  
> -        qemu_sem_wait(&multifd_send_state->channels_ready);
>          trace_multifd_send_sync_main_wait(p->id);
>          qemu_sem_wait(&p->sem_sync);
>  
> @@ -664,7 +659,6 @@ static void *multifd_send_thread(void *opaque)
>      p->num_packets = 1;
>  
>      while (true) {
> -        qemu_sem_post(&multifd_send_state->channels_ready);
>          qemu_sem_wait(&p->sem);
>  
>          if (qatomic_read(&multifd_send_state->exiting)) {
> @@ -759,7 +753,6 @@ out:
>       */
>      if (ret != 0) {
>          qemu_sem_post(&p->sem_sync);
> -        qemu_sem_post(&multifd_send_state->channels_ready);
>      }
>  
>      qemu_mutex_lock(&p->mutex);
> @@ -796,7 +789,6 @@ static void multifd_tls_outgoing_handshake(QIOTask *task,
>           * is not created, and then tell who pay attention to me.
>           */
>          p->quit = true;
> -        qemu_sem_post(&multifd_send_state->channels_ready);
>          qemu_sem_post(&p->sem_sync);
>      }
>  }
> @@ -874,7 +866,6 @@ static void multifd_new_send_channel_cleanup(MultiFDSendParams *p,
>  {
>       migrate_set_error(migrate_get_current(), err);
>       /* Error happen, we need to tell who pay attention to me */
> -     qemu_sem_post(&multifd_send_state->channels_ready);
>       qemu_sem_post(&p->sem_sync);
>       /*
>        * Although multifd_send_thread is not created, but main migration
> @@ -919,7 +910,6 @@ int multifd_save_setup(Error **errp)
>      multifd_send_state = g_malloc0(sizeof(*multifd_send_state));
>      multifd_send_state->params = g_new0(MultiFDSendParams, thread_count);
>      multifd_send_state->pages = multifd_pages_init(page_count);
> -    qemu_sem_init(&multifd_send_state->channels_ready, 0);
>      qatomic_set(&multifd_send_state->exiting, 0);
>      multifd_send_state->ops = multifd_ops[migrate_multifd_compression()];
