qemu-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [RFC PATCH 14/14] migration: Fix return-path thread exit


From: Cédric Le Goater
Subject: Re: [RFC PATCH 14/14] migration: Fix return-path thread exit
Date: Mon, 12 Feb 2024 17:04:28 +0100
User-agent: Mozilla Thunderbird

Hello Peter

On 2/8/24 06:57, Peter Xu wrote:
On Wed, Feb 07, 2024 at 02:33:47PM +0100, Cédric Le Goater wrote:
In case of error, close_return_path_on_source() can perform a shutdown
to exit the return-path thread.  However, in migrate_fd_cleanup(),
'to_dst_file' is closed before calling close_return_path_on_source()
and the shutdown fails, leaving the source and destination waiting for
an event to occur.

Close the file after calling close_return_path_on_source() so that the
shutdown succeeds and the return-path thread exits.

Signed-off-by: Cédric Le Goater <clg@redhat.com>
---

  This is an RFC because the correct fix implies reworking the QEMUFile
  construct, built on top of the QEMU I/O channel.

  migration/migration.c | 13 ++++++-------
  1 file changed, 6 insertions(+), 7 deletions(-)

diff --git a/migration/migration.c b/migration/migration.c
index 
5f55af3d7624750ca416c4177781241b3e291e5d..de329f2c553288935d824748286e79e535929b8b
 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -1313,6 +1313,8 @@ void migrate_set_state(int *state, int old_state, int 
new_state)
static void migrate_fd_cleanup(MigrationState *s)
  {
+    QEMUFile *tmp = NULL;
+
      g_free(s->hostname);
      s->hostname = NULL;
      json_writer_free(s->vmdesc);
@@ -1321,8 +1323,6 @@ static void migrate_fd_cleanup(MigrationState *s)
      qemu_savevm_state_cleanup();
if (s->to_dst_file) {
-        QEMUFile *tmp;
-
          trace_migrate_fd_cleanup();
          bql_unlock();
          if (s->migration_thread_running) {
@@ -1341,15 +1341,14 @@ static void migrate_fd_cleanup(MigrationState *s)
           * critical section won't block for long.
           */
          migration_ioc_unregister_yank_from_file(tmp);
-        qemu_fclose(tmp);
      }
- /*
-     * We already cleaned up to_dst_file, so errors from the return
-     * path might be due to that, ignore them.
-     */
      close_return_path_on_source(s);
+ if (tmp) {
+        qemu_fclose(tmp);
+    }
+
      assert(!migration_is_active(s));
if (s->state == MIGRATION_STATUS_CANCELLING) {

I think this is okay to me for a short term plan.  I'll see how others
think, also add Dan into the loop.

If so, would you please add a rich comment explaining why tmp needs to be
closed later?  Especially, explicit comment on the ordering requirement
would be helpful: IMHO here it's an order that qemu_fclose() must happen
after close_return_path_on_source().  So when others work on this code we
don't easily break it without noticing.

Sure. I will when we have clarified with Fabiano what is the best
approach.

Also please feel free to post separately on migration patches if you'd like
us to merge the patches when repost.

This series is a collection of multiple (related) changes :

* extra Error** parameter to save_setup() migration handlers.
  This change has consequences on the various callers which are not
  fully analyzed.
* similar changes for memory logging handlers. These looks more self
  contained and I will see if I can send then separately.
* return-path thread termination

and then, in background we have open questions regarding :

* the QEMUfile implementation and its QIOChannel usage for migration
  streams
* qemu_file_set_error* vs. migrate_set_error. It is confusing, at least
  for me. Do we have some documentation on best practices ?

Thanks,

C.





reply via email to

[Prev in Thread] Current Thread [Next in Thread]