From: Peter Maydell
Subject: [Qemu-commits] [qemu/qemu] 713f76: migration: fix migrate_cancel leads live_migration...
Date: Thu, 25 Jul 2019 06:45:21 -0700

  Branch: refs/heads/master
  Home:   https://github.com/qemu/qemu
  Commit: 713f762a316348b00f5a3713b5314c88ab0a5852
      https://github.com/qemu/qemu/commit/713f762a316348b00f5a3713b5314c88ab0a5852
  Author: Ivan Ren <address@hidden>
  Date:   2019-07-24 (Wed, 24 Jul 2019)

  Changed paths:
    M migration/ram.c

  Log Message:
  -----------
  migration: fix migrate_cancel leads live_migration thread endless loop

When we 'migrate_cancel' a multifd migration, the live_migration thread may
go into an endless loop in the multifd_send_pages function.

Steps to reproduce:

(qemu) migrate_set_capability multifd on
(qemu) migrate -d url
(qemu) [wait a while]
(qemu) migrate_cancel

The live_migration thread may then show 100% CPU usage with the following stack:

pthread_mutex_lock
qemu_mutex_lock_impl
multifd_send_pages
multifd_queue_page
ram_save_multifd_page
ram_save_target_page
ram_save_host_page
ram_find_and_save_block
ram_find_and_save_block
ram_save_iterate
qemu_savevm_state_iterate
migration_iteration_run
migration_thread
qemu_thread_start
start_thread
clone
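
The trace is consistent with a channel-selection loop that keeps locking
each channel's mutex and never finds an idle one. The following is a
minimal sketch of how such a loop can be given an exit; it is not the code
from migration/ram.c. The Channel struct, NUM_CHANNELS, the quit and
pending_job fields, and pick_idle_channel are all hypothetical names chosen
for illustration.

```c
#include <pthread.h>
#include <stdbool.h>

#define NUM_CHANNELS 2  /* hypothetical channel count */

/* Hypothetical, simplified stand-in for per-channel state. */
typedef struct {
    pthread_mutex_t mutex;
    bool quit;         /* set once the channel's sender thread has exited */
    bool pending_job;  /* true while the channel is still busy */
} Channel;

/*
 * Scan the channels round-robin for an idle one, starting at 'start'.
 * Returns the index of the channel that was claimed, or -1 if a channel
 * that has quit is encountered.
 *
 * Without the quit check, channels whose threads have exited with
 * pending_job still set would never become idle, and this loop would
 * spin forever taking and releasing mutexes.
 */
int pick_idle_channel(Channel *ch, int start)
{
    for (int i = start;; i = (i + 1) % NUM_CHANNELS) {
        pthread_mutex_lock(&ch[i].mutex);
        if (ch[i].quit) {
            pthread_mutex_unlock(&ch[i].mutex);
            return -1;  /* give up instead of looping */
        }
        if (!ch[i].pending_job) {
            ch[i].pending_job = true;  /* claim the channel */
            pthread_mutex_unlock(&ch[i].mutex);
            return i;
        }
        pthread_mutex_unlock(&ch[i].mutex);
    }
}
```

The caller is expected to treat -1 as an error and stop queuing pages.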

Signed-off-by: Ivan Ren <address@hidden>
Message-Id: <address@hidden>
Reviewed-by: Dr. David Alan Gilbert <address@hidden>
Reviewed-by: Juan Quintela <address@hidden>
Signed-off-by: Juan Quintela <address@hidden>


  Commit: a3ec6b7d236593a95a197c229a1b673995105175
      https://github.com/qemu/qemu/commit/a3ec6b7d236593a95a197c229a1b673995105175
  Author: Ivan Ren <address@hidden>
  Date:   2019-07-24 (Wed, 24 Jul 2019)

  Changed paths:
    M migration/ram.c

  Log Message:
  -----------
  migration: fix migrate_cancel leads live_migration thread hung forever

When we 'migrate_cancel' a multifd migration, the live_migration thread may
hang forever at some points, because multifd_send_thread has already
exited on a socket error:
1. multifd_send_pages may hang at
   qemu_sem_wait(&multifd_send_state->channels_ready)
2. multifd_send_sync_main may hang at
   qemu_sem_wait(&multifd_send_state->sem_sync)
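
Both hangs share one shape: a waiter blocks on a semaphore whose only
poster has already exited. A minimal sketch of one way to avoid that,
using POSIX semaphores rather than QEMU's qemu_sem_* wrappers, is for the
exiting thread to set a flag and post once on its error path. SendState,
the exiting flag, and the function names below are assumptions made for
this example, not taken from the commit.

```c
#include <pthread.h>
#include <semaphore.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical, simplified state shared by the waiter and the sender. */
typedef struct {
    sem_t channels_ready;  /* posted by the sender when it can take work */
    atomic_bool exiting;   /* set by the sender when it quits on error */
} SendState;

/* Sender thread that hits a (simulated) socket error immediately. */
static void *sender_thread(void *opaque)
{
    SendState *s = opaque;

    atomic_store(&s->exiting, true);
    /* Post once on the error path so that a blocked waiter wakes up. */
    sem_post(&s->channels_ready);
    return NULL;
}

/* Wait for a ready channel; returns 0 on success, -1 if the sender quit. */
static int wait_channel_ready(SendState *s)
{
    sem_wait(&s->channels_ready);
    return atomic_load(&s->exiting) ? -1 : 0;
}

/* Run the scenario end to end and return what the waiter observed. */
int run_cancel_scenario(void)
{
    SendState s;
    pthread_t tid;
    int ret;

    sem_init(&s.channels_ready, 0, 0);
    atomic_init(&s.exiting, false);
    pthread_create(&tid, NULL, sender_thread, &s);
    ret = wait_channel_ready(&s);  /* would block forever without the post */
    pthread_join(tid, NULL);
    sem_destroy(&s.channels_ready);
    return ret;
}
```

In this sketch run_cancel_scenario() returns -1, because the waiter is
woken by the error-path post and then sees the exiting flag.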

Signed-off-by: Ivan Ren <address@hidden>
Message-Id: <address@hidden>
Reviewed-by: Dr. David Alan Gilbert <address@hidden>
Reviewed-by: Juan Quintela <address@hidden>
Signed-off-by: Juan Quintela <address@hidden>

---

Remove spurious not needed bits


  Commit: 3c3ca25d1f067f93876730cb55c59d43194fe815
      https://github.com/qemu/qemu/commit/3c3ca25d1f067f93876730cb55c59d43194fe815
  Author: Juan Quintela <address@hidden>
  Date:   2019-07-24 (Wed, 24 Jul 2019)

  Changed paths:
    M migration/ram.c

  Log Message:
  -----------
  migration: Make explicit that we are quitting multifd

We add a bool to indicate that.

Reviewed-by: Dr. David Alan Gilbert <address@hidden>
Signed-off-by: Juan Quintela <address@hidden>


  Commit: f193bc0c5342496ce07355c0c30394560a7f4738
      https://github.com/qemu/qemu/commit/f193bc0c5342496ce07355c0c30394560a7f4738
  Author: Ivan Ren <address@hidden>
  Date:   2019-07-24 (Wed, 24 Jul 2019)

  Changed paths:
    M migration/ram.c

  Log Message:
  -----------
  migration: fix migrate_cancel multifd migration leads destination hung forever

When we 'migrate_cancel' a multifd migration, a run sequence like this:

        [source]                              [destination]

multifd_send_sync_main[finish]
                                    multifd_recv_thread wait &p->sem_sync
shutdown to_dst_file
                                    detect error from_src_file
send  RAM_SAVE_FLAG_EOS[fail]       [no chance to run multifd_recv_sync_main]
                                    multifd_load_cleanup
                                    join multifd receive thread forever

will leave the destination QEMU hung with the following stack:

pthread_join
qemu_thread_join
multifd_load_cleanup
process_incoming_migration_co
coroutine_trampoline
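
The destination hang is a join on a thread that is itself blocked on a
semaphore nobody will post. As a rough sketch of the general remedy, the
cleanup path can mark the channel as quitting and post the semaphore before
joining. This uses plain pthreads and POSIX semaphores; RecvChannel, its
fields, and the function names are hypothetical and do not reproduce the
code in migration/ram.c.

```c
#include <pthread.h>
#include <semaphore.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical, simplified stand-in for one receive channel. */
typedef struct {
    pthread_t thread;
    sem_t sem_sync;    /* the receive thread waits here for a sync ack */
    atomic_bool quit;  /* set by cleanup to ask the thread to stop */
} RecvChannel;

/* Receive thread: blocks until the main thread acknowledges the sync. */
static void *recv_thread(void *opaque)
{
    RecvChannel *p = opaque;

    sem_wait(&p->sem_sync);
    /* A real receive loop would check p->quit here and stop if it is set. */
    return NULL;
}

/*
 * Cleanup: wake the thread before joining it.
 * If the sync ack never arrives (the source went away), joining without
 * the post below would block forever.
 */
static int cleanup_channel(RecvChannel *p)
{
    atomic_store(&p->quit, true);
    sem_post(&p->sem_sync);
    return pthread_join(p->thread, NULL);
}

/* Start one channel, then clean it up without ever sending a sync ack. */
int run_recv_cleanup(void)
{
    RecvChannel p;
    int ret;

    sem_init(&p.sem_sync, 0, 0);
    atomic_init(&p.quit, false);
    pthread_create(&p.thread, NULL, recv_thread, &p);
    ret = cleanup_channel(&p);
    sem_destroy(&p.sem_sync);
    return ret;
}
```

Here run_recv_cleanup() returns the result of pthread_join, which is 0 once
the thread has been woken and has exited.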

Signed-off-by: Ivan Ren <address@hidden>
Reviewed-by: Dr. David Alan Gilbert <address@hidden>
Reviewed-by: Juan Quintela <address@hidden>
Message-Id: <address@hidden>
Signed-off-by: Juan Quintela <address@hidden>


  Commit: b43bea01b853dfdb6c0418615b57d0e1b98e9e98
      https://github.com/qemu/qemu/commit/b43bea01b853dfdb6c0418615b57d0e1b98e9e98
  Author: Peter Maydell <address@hidden>
  Date:   2019-07-25 (Thu, 25 Jul 2019)

  Changed paths:
    M migration/ram.c

  Log Message:
  -----------
  Merge remote-tracking branch 'remotes/juanquintela/tags/migration-pull-request' into staging

Migration pull request

This series fixes problems with migrate_cancel while using multifd.
In some cases it can hang waiting on a semaphore.

Please apply.

# gpg: Signature made Thu 25 Jul 2019 11:56:57 BST
# gpg:                using RSA key 1899FF8EDEBF58CCEE034B82F487EF185872D723
# gpg: Good signature from "Juan Quintela <address@hidden>" [full]
# gpg:                 aka "Juan Quintela <address@hidden>" [full]
# Primary key fingerprint: 1899 FF8E DEBF 58CC EE03  4B82 F487 EF18 5872 D723

* remotes/juanquintela/tags/migration-pull-request:
  migration: fix migrate_cancel multifd migration leads destination hung forever
  migration: Make explicit that we are quitting multifd
  migration: fix migrate_cancel leads live_migration thread hung forever
  migration: fix migrate_cancel leads live_migration thread endless loop

Signed-off-by: Peter Maydell <address@hidden>


Compare: https://github.com/qemu/qemu/compare/7ea53245335b...b43bea01b853


