[Qemu-commits] [qemu/qemu] 301d7f: migration: Fix migration crash when t


From: Peter Maydell
Subject: [Qemu-commits] [qemu/qemu] 301d7f: migration: Fix migration crash when target psize l...
Date: Tue, 07 Feb 2023 07:22:22 -0800

  Branch: refs/heads/staging
  Home:   https://github.com/qemu/qemu
  Commit: 301d7ffe5f630dc5d0e2a3638b9eae7a00b1088a
      
https://github.com/qemu/qemu/commit/301d7ffe5f630dc5d0e2a3638b9eae7a00b1088a
  Author: Peter Xu <peterx@redhat.com>
  Date:   2023-02-06 (Mon, 06 Feb 2023)

  Changed paths:
    M migration/ram.c

  Log Message:
  -----------
  migration: Fix migration crash when target psize larger than host

Commit d9e474ea56 overlooked the case where the target psize is even larger
than the host psize.  One example is Alpha, which has an 8K page size:
migration crashes the source QEMU when migrating an Alpha guest on x86.

Fix it by detecting that case and setting host start/end just to cover the
single page to be migrated.

This will slightly optimize the common case where the host psize equals the
guest psize so we don't even need to do the roundups, but that's trivial.

Cc: qemu-stable@nongnu.org
Reported-by: Thomas Huth <thuth@redhat.com>
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1456
Fixes: d9e474ea56 ("migration: Teach PSS about host page")
Signed-off-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
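
The range computation described in 301d7f can be sketched as follows. This
is a minimal Python model, not the code in migration/ram.c: the function
name host_page_range() and the choice to express the range in units of
target pages are assumptions made for illustration.

```python
def host_page_range(page, target_psize, host_psize):
    """Return the half-open range (start, end) of target pages to send.

    Hypothetical model: `page` is a target-page index and the result is
    the set of target pages that share a host page with it.
    """
    if host_psize <= target_psize:
        # Target page is at least as large as the host page (for example an
        # 8K guest page on a 4K host): cover just the single page, no roundup.
        return page, page + 1
    # Host page is larger: round down and up to the host page boundary.
    pages_per_host = host_psize // target_psize
    start = (page // pages_per_host) * pages_per_host
    return start, start + pages_per_host
```

With an 8K target page on a 4K host the range collapses to one page, while
a 4K target page on a 2M host page expands to the 512 pages sharing it.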


  Commit: 255dc7af7e65588d36319129718ddfdfeabac898
      
https://github.com/qemu/qemu/commit/255dc7af7e65588d36319129718ddfdfeabac898
  Author: Juan Quintela <quintela@redhat.com>
  Date:   2023-02-06 (Mon, 06 Feb 2023)

  Changed paths:
    M hw/s390x/s390-stattrib.c
    M hw/vfio/migration.c
    M include/migration/register.h
    M migration/block-dirty-bitmap.c
    M migration/block.c
    M migration/migration.c
    M migration/ram.c
    M migration/savevm.c
    M migration/savevm.h

  Log Message:
  -----------
  migration: No save_live_pending() method uses the QEMUFile parameter

So remove it everywhere.

Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>


  Commit: c8df4a7aeffcb46020f610526eea621fa5b0cd47
      
https://github.com/qemu/qemu/commit/c8df4a7aeffcb46020f610526eea621fa5b0cd47
  Author: Juan Quintela <quintela@redhat.com>
  Date:   2023-02-06 (Mon, 06 Feb 2023)

  Changed paths:
    M docs/devel/migration.rst
    M docs/devel/vfio-migration.rst
    M hw/s390x/s390-stattrib.c
    M hw/vfio/migration.c
    M hw/vfio/trace-events
    M include/migration/register.h
    M migration/block-dirty-bitmap.c
    M migration/block.c
    M migration/migration.c
    M migration/ram.c
    M migration/savevm.c
    M migration/savevm.h
    M migration/trace-events

  Log Message:
  -----------
  migration: Split save_live_pending() into state_pending_*

We split the function into two:

- state_pending_estimate: We estimate the remaining state size without
  stopping the machine.

- state_pending_exact: We calculate the exact amount of remaining
  state.

The only "device" that implements different functions for _estimate()
and _exact() is ram.

Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
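
One way a caller might combine the two callbacks from c8df4a is sketched
below. The commit message does not say how the migration loop uses them, so
the policy here (take the cheap estimate, and ask for the exact figure only
once the estimate falls to a threshold) is an assumption, and
pending_bytes() is a hypothetical helper.

```python
def pending_bytes(estimate, exact, threshold):
    """Use the cheap estimate; request the exact value only when it matters.

    `estimate` and `exact` are callables returning a byte count. The
    threshold policy is an assumption, not taken from the commit.
    """
    pending = estimate()
    if pending <= threshold:
        # Close to the switchover decision: pay for the exact figure.
        pending = exact()
    return pending
```

The point of the split is that the exact path can be expensive, so a caller
would normally avoid it while plenty of state remains.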


  Commit: fd70385d38bb75128c1bdfc027af81cc41ec0e48
      
https://github.com/qemu/qemu/commit/fd70385d38bb75128c1bdfc027af81cc41ec0e48
  Author: Juan Quintela <quintela@redhat.com>
  Date:   2023-02-06 (Mon, 06 Feb 2023)

  Changed paths:
    M hw/s390x/s390-stattrib.c
    M hw/vfio/migration.c
    M include/migration/register.h
    M migration/block-dirty-bitmap.c
    M migration/block.c
    M migration/migration.c
    M migration/ram.c
    M migration/savevm.c
    M migration/savevm.h
    M migration/trace-events

  Log Message:
  -----------
  migration: Remove unused threshold_size parameter

Until the previous commit, the threshold_size parameter of
save_live_pending() was used for ram.  Now, with the split into
state_pending_estimate() and state_pending_exact(), it is not needed
anymore, so remove it.

Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>


  Commit: d9df92925ef2b7ca8774ef44b0e1f859a91d4cd6
      
https://github.com/qemu/qemu/commit/d9df92925ef2b7ca8774ef44b0e1f859a91d4cd6
  Author: Juan Quintela <quintela@redhat.com>
  Date:   2023-02-06 (Mon, 06 Feb 2023)

  Changed paths:
    M migration/migration.c

  Log Message:
  -----------
  migration: simplify migration_iteration_run()

Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>


  Commit: d5890ea0722831eea76a0efd23a496b3e8815fe8
      
https://github.com/qemu/qemu/commit/d5890ea0722831eea76a0efd23a496b3e8815fe8
  Author: Peter Xu <peterx@redhat.com>
  Date:   2023-02-06 (Mon, 06 Feb 2023)

  Changed paths:
    M include/qemu/userfaultfd.h
    M migration/postcopy-ram.c
    M tests/qtest/migration-test.c
    M util/userfaultfd.c

  Log Message:
  -----------
  util/userfaultfd: Add uffd_open()

Add a helper to create the uffd handle.

Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>


  Commit: 5f19a4491941fdc5c5b50ce4ade6ffffe0f591b4
      
https://github.com/qemu/qemu/commit/5f19a4491941fdc5c5b50ce4ade6ffffe0f591b4
  Author: David Hildenbrand <david@redhat.com>
  Date:   2023-02-06 (Mon, 06 Feb 2023)

  Changed paths:
    M migration/ram.c

  Log Message:
  -----------
  migration/ram: Fix populate_read_range()

Unfortunately, commit f7b9dcfbcf44 broke populate_read_range(): the loop
end condition is very wrong, resulting in that function not populating the
full range. Let's fix that.

Fixes: f7b9dcfbcf44 ("migration/ram: Factor out populating pages readable in ram_block_populate_pages()")
Cc: qemu-stable@nongnu.org
Reviewed-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
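
The message for 5f19a4 does not quote the faulty loop condition. The sketch
below only illustrates the class of bug, assuming the loop was bounded by
the size instead of by the end of the range; both function names are
hypothetical.

```python
def pages_to_populate(offset, size, page_size):
    """Offsets touched when populating the range [offset, offset + size)."""
    end = offset + size  # bound the loop by the end of the range
    return list(range(offset, end, page_size))


def pages_with_wrong_bound(offset, size, page_size):
    """Assumed faulty variant: the loop stops at `size`, not `offset + size`."""
    return list(range(offset, size, page_size))
```

For a range starting at a non-zero offset, the second variant covers too
little of the range, or nothing at all.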


  Commit: 72ef3a370836aa07261ad7aaeea27ed5cbcee342
      
https://github.com/qemu/qemu/commit/72ef3a370836aa07261ad7aaeea27ed5cbcee342
  Author: David Hildenbrand <david@redhat.com>
  Date:   2023-02-06 (Mon, 06 Feb 2023)

  Changed paths:
    M migration/ram.c

  Log Message:
  -----------
  migration/ram: Fix error handling in ram_write_tracking_start()

If something goes wrong during uffd_change_protection(), we would fail
to unregister uffd-wp and would not release our reference. Fix it by
performing the uffd_change_protection(true) last.

Note that a uffd_change_protection(false) on the recovery path without a
prior uffd_change_protection(true) is fine.

Fixes: 278e2f551a09 ("migration: support UFFD write fault processing in ram_save_iterate()")
Cc: qemu-stable@nongnu.org
Reviewed-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
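
The ordering argument in 72ef3a can be modelled as below: record that a
block is registered before attempting the fallible protect step, so the
recovery path always sees it. This is a sketch only. The callables stand in
for the uffd operations and are hypothetical, as is the function name.

```python
def start_write_tracking(blocks, register, protect, unprotect, unregister):
    """Register each block, protect it last, and roll back on any failure.

    `register` and `protect` return True on success. All callables are
    hypothetical stand-ins for the real uffd helpers.
    """
    registered = []
    try:
        for b in blocks:
            if not register(b):
                raise RuntimeError(f"register failed for {b}")
            registered.append(b)  # remembered before the protect step
            if not protect(b):
                raise RuntimeError(f"protect failed for {b}")
        return True
    except RuntimeError:
        for b in registered:
            unprotect(b)   # fine even if protect never succeeded for b
            unregister(b)
        return False
```

Because the block is recorded before protect() runs, a failure in protect()
still leads to that block being unregistered.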


  Commit: 7cc8e9e0fadc734065d4d5c9cb0bd8997e743146
      
https://github.com/qemu/qemu/commit/7cc8e9e0fadc734065d4d5c9cb0bd8997e743146
  Author: David Hildenbrand <david@redhat.com>
  Date:   2023-02-06 (Mon, 06 Feb 2023)

  Changed paths:
    M migration/ram.c

  Log Message:
  -----------
  migration/ram: Don't explicitly unprotect when unregistering uffd-wp

When unregistering uffd-wp, older kernels before commit f369b07c86143
("mm/uffd:reset write protection when unregister with wp-mode") won't
clear the uffd-wp PTE bit. When re-registering uffd-wp, the previous
uffd-wp PTE bits would trigger again. With above commit, the kernel will
clear the uffd-wp PTE bits when unregistering itself.

Consequently, we'll clear the uffd-wp PTE bits now twice -- whereby we
don't care about clearing them at all: a new background snapshot will
re-register uffd-wp and re-protect all memory either way.

So let's skip the manual clearing of uffd-wp. If ever relevant, we
could clear conditionally in uffd_unregister_memory() -- we just need a
way to figure out more recent kernels.

Reviewed-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>


  Commit: 59bcc049c17a50d8ac0353f164f597e7d904589d
      
https://github.com/qemu/qemu/commit/59bcc049c17a50d8ac0353f164f597e7d904589d
  Author: David Hildenbrand <david@redhat.com>
  Date:   2023-02-06 (Mon, 06 Feb 2023)

  Changed paths:
    M migration/ram.c

  Log Message:
  -----------
  migration/ram: Rely on used_length for uffd_change_protection()

ram_mig_ram_block_resized() will abort migration (including background
snapshots) when resizing a RAMBlock. ram_block_populate_read() will only
populate RAM up to used_length, so, at least for anonymous memory,
everything between used_length and max_length won't actually be
protected: trying to protect it is just a NOP.

So let's only protect everything up to used_length.

Note: it still makes sense to register uffd-wp for max_length, such
that RAM_UF_WRITEPROTECT is independent of a changing used_length.

Reviewed-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>


  Commit: e41c57702e940fcb9a8046edc3b43edda5134305
      
https://github.com/qemu/qemu/commit/e41c57702e940fcb9a8046edc3b43edda5134305
  Author: David Hildenbrand <david@redhat.com>
  Date:   2023-02-06 (Mon, 06 Feb 2023)

  Changed paths:
    M migration/ram.c

  Log Message:
  -----------
  migration/ram: Optimize ram_write_tracking_start() for RamDiscardManager

ram_block_populate_read() already optimizes for RamDiscardManager.
However, ram_write_tracking_start() will still try protecting discarded
memory ranges.

Let's optimize, because discarded ranges don't map any pages and

(1) For anonymous memory, trying to protect using uffd-wp without a mapped
    page is ignored by the kernel and consequently a NOP.

(2) For shared/file-backed memory, we will fill present page tables in the
    range with PTE markers. However, we will even allocate page tables
    just to fill them with unnecessary PTE markers and effectively
    waste memory.

So let's exclude these ranges, just like ram_block_populate_read()
already does.

Reviewed-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>


  Commit: 5e104f24e7ddfa33d5e99b6363c7baf02849f9b7
      
https://github.com/qemu/qemu/commit/5e104f24e7ddfa33d5e99b6363c7baf02849f9b7
  Author: David Hildenbrand <david@redhat.com>
  Date:   2023-02-06 (Mon, 06 Feb 2023)

  Changed paths:
    M migration/savevm.c

  Log Message:
  -----------
  migration/savevm: Move more savevm handling into vmstate_save()

Let's move more code into vmstate_save(), reducing code duplication and
preparing for reuse of vmstate_save() in qemu_savevm_state_setup(). We
have to move vmstate_save() to make the compiler happy.

We'll now also trace from qemu_save_device_state(), triggering the same
tracepoints as previously called from
qemu_savevm_state_complete_precopy_non_iterable() only. Note that
qemu_save_device_state() ignores iterable device state, such as RAM,
and consequently doesn't trigger some other trace points (e.g.,
trace_savevm_state_setup()).

Reviewed-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>


  Commit: e3bf5e68e2a97898f37834c47449101172ced123
      
https://github.com/qemu/qemu/commit/e3bf5e68e2a97898f37834c47449101172ced123
  Author: David Hildenbrand <david@redhat.com>
  Date:   2023-02-06 (Mon, 06 Feb 2023)

  Changed paths:
    M migration/migration.c
    M migration/migration.h
    M migration/savevm.c

  Log Message:
  -----------
  migration/savevm: Prepare vmdesc json writer in qemu_savevm_state_setup()

... and store it in the migration state. This is a preparation for
storing selected VMSDs already in qemu_savevm_state_setup().

Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>


  Commit: 62f42625d4e27a1993ab1999d0e86aedabf9a961
      
https://github.com/qemu/qemu/commit/62f42625d4e27a1993ab1999d0e86aedabf9a961
  Author: David Hildenbrand <david@redhat.com>
  Date:   2023-02-06 (Mon, 06 Feb 2023)

  Changed paths:
    M include/migration/vmstate.h
    M migration/savevm.c

  Log Message:
  -----------
  migration/savevm: Allow immutable device state to be migrated early (i.e., before RAM)

For virtio-mem, we want to have the plugged/unplugged state of memory
blocks available before migrating any actual RAM content, and perform
sanity checks before touching anything on the destination. This
information is immutable on the migration source while migration is active.

We want to use this information for proper preallocation support with
migration: currently, we don't preallocate memory on the migration target,
and especially with hugetlb, we can easily run out of hugetlb pages during
RAM migration and will crash (SIGBUS) instead of catching this gracefully
via preallocation.

Migrating device state via a VMSD before we start iterating is currently
impossible: the only approach that would be possible is avoiding a VMSD
and migrating state manually during save_setup(), to be restored during
load_state().

Let's allow for migrating device state via a VMSD early, during the
setup phase in qemu_savevm_state_setup(). To keep it simple, we
indicate applicable VMSD's using an "early_setup" flag.

Note that only very selected devices (i.e., ones seriously messing with
RAM setup) are supposed to make use of such early state migration.

While at it, also use a bool for the "unmigratable" member.

Reviewed-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
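
The effect of the "early_setup" flag from 62f426 on stream ordering can be
pictured with the toy model below. Only the flag name comes from the commit
message. The Vmsd class, save_order() and the single "ram" placeholder for
iterable state are assumptions for illustration.

```python
from dataclasses import dataclass


@dataclass
class Vmsd:
    name: str
    early_setup: bool = False


def save_order(vmsds):
    """Return section names in the order a hypothetical saver emits them."""
    order = [v.name for v in vmsds if v.early_setup]        # setup phase
    order.append("ram")                                     # iterable state
    order += [v.name for v in vmsds if not v.early_setup]   # completion phase
    return order
```

A device flagged early_setup lands before any RAM content, so the
destination can act on its state (for example, to preallocate) first.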


  Commit: 508f7988fd221f1f66c3f8a025c8a2dadac0af01
      
https://github.com/qemu/qemu/commit/508f7988fd221f1f66c3f8a025c8a2dadac0af01
  Author: David Hildenbrand <david@redhat.com>
  Date:   2023-02-06 (Mon, 06 Feb 2023)

  Changed paths:
    M include/migration/vmstate.h

  Log Message:
  -----------
  migration/vmstate: Introduce VMSTATE_WITH_TMP_TEST() and VMSTATE_BITMAP_TEST()

We'll make use of both next in the context of virtio-mem.

Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>


  Commit: 80fe315c384153af957ee94d43d08b90ad1d5ef7
      
https://github.com/qemu/qemu/commit/80fe315c384153af957ee94d43d08b90ad1d5ef7
  Author: David Hildenbrand <david@redhat.com>
  Date:   2023-02-06 (Mon, 06 Feb 2023)

  Changed paths:
    M include/migration/misc.h
    M migration/migration.c
    M migration/ram.c

  Log Message:
  -----------
  migration/ram: Factor out check for advised postcopy

Let's factor out this check, to be used in virtio-mem context next.

While at it, fix a spelling error in a related comment.

Reviewed-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>


  Commit: ce1761f0f9f0dde30a56cdcff68c034874fb91a0
      
https://github.com/qemu/qemu/commit/ce1761f0f9f0dde30a56cdcff68c034874fb91a0
  Author: David Hildenbrand <david@redhat.com>
  Date:   2023-02-06 (Mon, 06 Feb 2023)

  Changed paths:
    M hw/virtio/virtio-mem.c

  Log Message:
  -----------
  virtio-mem: Fail if a memory backend with "prealloc=on" is specified

"prealloc=on" for the memory backend does not work as expected, as
virtio-mem will simply discard all preallocated memory immediately again.
In the best case, it's an expensive NOP. In the worst case, it's an
unexpected allocation error.

Instead, "prealloc=on" should be specified for the virtio-mem device only,
such that virtio-mem will try preallocating memory before plugging
memory dynamically to the guest. Fail if such a memory backend is
provided.

Tested-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>


  Commit: 3b95a71b22827d261786b84f38b1e9109f6bf57b
      
https://github.com/qemu/qemu/commit/3b95a71b22827d261786b84f38b1e9109f6bf57b
  Author: David Hildenbrand <david@redhat.com>
  Date:   2023-02-06 (Mon, 06 Feb 2023)

  Changed paths:
    M hw/core/machine.c
    M hw/virtio/virtio-mem.c
    M include/hw/virtio/virtio-mem.h

  Log Message:
  -----------
  virtio-mem: Migrate immutable properties early

The bitmap and the size are immutable while migration is active: see
virtio_mem_is_busy(). We can migrate this information early, before
migrating any actual RAM content. Further, all information we need for
sanity checks is immutable as well.

Having this information in place early will, for example, allow for
properly preallocating memory before touching these memory locations
during RAM migration: this way, we can make sure that all memory was
actually preallocated and that any user errors (e.g., insufficient
hugetlb pages) can be handled gracefully.

In contrast, usable_region_size and requested_size can theoretically
still be modified on the source while the VM is running. Keep migrating
these properties the usual, late, way.

Use a new device property to keep behavior of compat machines
unmodified.

Reviewed-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>


  Commit: d71920d42548d2bad17544cb488b09cece81a821
      
https://github.com/qemu/qemu/commit/d71920d42548d2bad17544cb488b09cece81a821
  Author: David Hildenbrand <david@redhat.com>
  Date:   2023-02-06 (Mon, 06 Feb 2023)

  Changed paths:
    M hw/virtio/virtio-mem.c

  Log Message:
  -----------
  virtio-mem: Proper support for preallocation with migration

Ordinary memory preallocation runs when QEMU starts up and creates the
memory backends, before processing the incoming migration stream. With
virtio-mem, we don't know which memory blocks to preallocate before
migration started. Now that we migrate the virtio-mem bitmap early, before
migrating any RAM content, we can safely preallocate memory for all plugged
memory blocks before migrating any RAM content.

This is especially relevant for the following cases:

(1) User errors

With hugetlb/files, if we don't have sufficient backend memory available on
the migration destination, we'll crash QEMU (SIGBUS) during RAM migration
when running out of backend memory. Preallocating memory before actual
RAM migration allows for failing gracefully and informing the user about
the setup problem.

(2) Excluded memory ranges during migration

For example, virtio-balloon free page hinting will exclude some pages
from getting migrated. In that case, we won't crash during RAM
migration, but later, when running the VM on the destination, which is
bad.

To fix this for new QEMU machines that migrate the bitmap early,
preallocate the memory early, before any RAM migration. Warn with old
QEMU machines.

Getting postcopy right is a bit tricky, but we essentially now implement
the same (problematic) preallocation logic as ordinary preallocation:
preallocate memory early and discard it again before precopy starts. During
ordinary preallocation, discarding of RAM happens when postcopy is advised.
As the state (bitmap) is loaded after postcopy was advised but before
postcopy starts listening, we have to discard memory we preallocated
immediately again ourselves.

Note that nothing (not even hugetlb reservations) guarantees for postcopy
that backend memory (especially hugetlb pages) is still free after it
was freed once while discarding RAM. Still, allocating that memory at
least once helps catch some basic setup problems.

Before this change, trying to restore a VM when insufficient hugetlb
pages are around results in the process crashing due to a "Bus error"
(SIGBUS). With this change, QEMU fails gracefully:

  qemu-system-x86_64: qemu_prealloc_mem: preallocating memory failed: Bad address
  qemu-system-x86_64: error while loading state for instance 0x0 of device '0000:00:03.0/virtio-mem-device-early'
  qemu-system-x86_64: load of migration failed: Cannot allocate memory

And we can even introspect the early migration data, including the
bitmap:
  $ ./scripts/analyze-migration.py -f STATEFILE
  {
  "ram (2)": {
      "section sizes": {
          "0000:00:03.0/mem0": "0x0000000780000000",
          "0000:00:04.0/mem1": "0x0000000780000000",
          "pc.ram": "0x0000000100000000",
          "/rom@etc/acpi/tables": "0x0000000000020000",
          "pc.bios": "0x0000000000040000",
          "0000:00:02.0/e1000.rom": "0x0000000000040000",
          "pc.rom": "0x0000000000020000",
          "/rom@etc/table-loader": "0x0000000000001000",
          "/rom@etc/acpi/rsdp": "0x0000000000001000"
      }
  },
  "0000:00:03.0/virtio-mem-device-early (51)": {
      "tmp": "00 00 00 01 40 00 00 00 00 00 00 07 80 00 00 00 00 00 00 00 00 20 00 00 00 00 00 00",
      "size": "0x0000000040000000",
      "bitmap": "ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff [...]
  },
  "0000:00:04.0/virtio-mem-device-early (53)": {
      "tmp": "00 00 00 08 c0 00 00 00 00 00 00 07 80 00 00 00 00 00 00 00 00 20 00 00 00 00 00 00",
      "size": "0x00000001fa400000",
      "bitmap": "ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff [...]
  },
  [...]

Reported-by: Jing Qi <jinqi@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>


  Commit: db18dee7d7b069653ae748d68d9d99313dde74c4
      
https://github.com/qemu/qemu/commit/db18dee7d7b069653ae748d68d9d99313dde74c4
  Author: Peter Xu <peterx@redhat.com>
  Date:   2023-02-06 (Mon, 06 Feb 2023)

  Changed paths:
    M migration/migration.c

  Log Message:
  -----------
  migration: Show downtime during postcopy phase

The downtime should be displayed during the postcopy phase because the
switchover phase is done.  OTOH it's weird to show "expected downtime",
whose meaning is confusing if the switchover has already happened
anyway.

This is a slight ABI change on QMP, but I assume it shouldn't affect
anyone.

Reviewed-by: Leonardo Bras <leobras@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>


  Commit: 74ecf6ac2b7e53cf480f1f2dc7a3af41525fb588
      
https://github.com/qemu/qemu/commit/74ecf6ac2b7e53cf480f1f2dc7a3af41525fb588
  Author: Fiona Ebner <f.ebner@proxmox.com>
  Date:   2023-02-06 (Mon, 06 Feb 2023)

  Changed paths:
    M migration/rdma.c

  Log Message:
  -----------
  migration/rdma: fix return value for qio_channel_rdma_{readv,writev}

upon errors. As the documentation in include/io/channel.h states, only
-1 and QIO_CHANNEL_ERR_BLOCK should be returned upon error. Other
values have the potential to confuse the call sites.

error_setg is used rather than error_setg_errno, because there are
certain code paths where -1 (as a non-errno) is propagated up (e.g.
starting from qemu_rdma_block_for_wrid or qemu_rdma_post_recv_control)
all the way to qio_channel_rdma_{readv,writev}.

Similar to a216ec85b7 ("migration/channel-block: fix return value for
qio_channel_block_{readv,writev}").

Suggested-by: Zhang Chen <chen.zhang@intel.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>


  Commit: 89c568489122de996920b760c34e81b925cc8181
      
https://github.com/qemu/qemu/commit/89c568489122de996920b760c34e81b925cc8181
  Author: Dr. David Alan Gilbert <dgilbert@redhat.com>
  Date:   2023-02-06 (Mon, 06 Feb 2023)

  Changed paths:
    M include/migration/vmstate.h
    M migration/savevm.c
    M migration/vmstate.c

  Log Message:
  -----------
  migration: Add canary to VMSTATE_END_OF_LIST

We fairly regularly leave VMSTATE_END_OF_LIST markers off descriptions;
given that the current check is only for ->name being NULL, sometimes
we get unlucky and the code apparently works and no one spots the error.

Explicitly add a flag, VMS_END that should be set, and assert it is
set during the traversal.

Note: This can't go in until we update the copy of vmstate.h in slirp.

Suggested-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
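
The canary idea in 89c568 can be sketched as a list walk that insists the
terminator carries an explicit flag. The flag name VMS_END comes from the
commit message. Its numeric value here, the dict-based field layout, and
count_fields() are assumptions for illustration.

```python
VMS_END = 0x1000  # hypothetical bit value; only the name is from the commit


def count_fields(fields):
    """Count fields up to the terminator, checking the VMS_END canary."""
    n = 0
    for f in fields:
        if f["name"] is None:
            # A NULL name alone could be stray memory; require the flag too.
            if not f["flags"] & VMS_END:
                raise AssertionError("terminator without VMS_END")
            return n
        n += 1
    raise ValueError("field list has no terminator")
```

An entry that merely happens to have an empty name no longer passes for a
valid end-of-list marker.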


  Commit: bb25a7289561d67133a7e7d69b15d81ead507a9e
      
https://github.com/qemu/qemu/commit/bb25a7289561d67133a7e7d69b15d81ead507a9e
  Author: Dr. David Alan Gilbert <dgilbert@redhat.com>
  Date:   2023-02-06 (Mon, 06 Feb 2023)

  Changed paths:
    M migration/savevm.c

  Log Message:
  -----------
  migration: Perform vmsd structure check during tests

Perform a check on vmsd structures during test runs in the hope
of catching any missing terminators and other simple screwups.

Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>


  Commit: bd9510d38546a19aa2e58e1a94597acfb0fd82d4
      
https://github.com/qemu/qemu/commit/bd9510d38546a19aa2e58e1a94597acfb0fd82d4
  Author: Zhenzhong Duan <zhenzhong.duan@intel.com>
  Date:   2023-02-06 (Mon, 06 Feb 2023)

  Changed paths:
    M migration/dirtyrate.c

  Log Message:
  -----------
  migration/dirtyrate: Show sample pages only in page-sampling mode

The value of "Sample Pages" is confusing in modes other than page-sampling.
See below:

(qemu) calc_dirty_rate -b 10 520
(qemu) info dirty_rate
Status: measuring
Start Time: 11646834 (ms)
Sample Pages: 520 (per GB)
Period: 10 (sec)
Mode: dirty-bitmap
Dirty rate: (not ready)

(qemu) info dirty_rate
Status: measured
Start Time: 11646834 (ms)
Sample Pages: 0 (per GB)
Period: 10 (sec)
Mode: dirty-bitmap
Dirty rate: 2 (MB/s)

Since it's totally useless in dirty-ring and dirty-bitmap modes, show it
only in page-sampling mode.

Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
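
The behaviour change in bd9510 amounts to a conditional in the printer. A
minimal sketch, assuming a dict-shaped info record and a hypothetical
format_dirty_rate() helper (the field labels mirror the transcript above):

```python
def format_dirty_rate(info):
    """Render dirty-rate info, showing Sample Pages only when it applies."""
    lines = [
        f"Status: {info['status']}",
        f"Start Time: {info['start_time']} (ms)",
    ]
    if info["mode"] == "page-sampling":
        # Only meaningful when pages are actually sampled.
        lines.append(f"Sample Pages: {info['sample_pages']} (per GB)")
    lines.append(f"Period: {info['period']} (sec)")
    lines.append(f"Mode: {info['mode']}")
    return "\n".join(lines)
```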


  Commit: 84615a19ddf2bfb38d7b3a0d487d2397ee55e4f3
      
https://github.com/qemu/qemu/commit/84615a19ddf2bfb38d7b3a0d487d2397ee55e4f3
  Author: manish.mishra <manish.mishra@nutanix.com>
  Date:   2023-02-06 (Mon, 06 Feb 2023)

  Changed paths:
    M chardev/char-socket.c
    M include/io/channel.h
    M io/channel-buffer.c
    M io/channel-command.c
    M io/channel-file.c
    M io/channel-null.c
    M io/channel-socket.c
    M io/channel-tls.c
    M io/channel-websock.c
    M io/channel.c
    M migration/channel-block.c
    M migration/rdma.c
    M scsi/qemu-pr-helper.c
    M tests/qtest/tpm-emu.c
    M tests/unit/test-io-channel-socket.c
    M util/vhost-user-server.c

  Log Message:
  -----------
  io: Add support for MSG_PEEK for socket channel

MSG_PEEK peeks at the channel. The data is treated as unread and
the next read shall still return this data. This support is
currently added only for the socket class. An extra parameter 'flags'
is added to io_readv calls to pass extra read flags like MSG_PEEK.

Reviewed-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Daniel P. Berrange <berrange@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Suggested-by: Daniel P. Berrange <berrange@redhat.com>
Signed-off-by: manish.mishra <manish.mishra@nutanix.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
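
The MSG_PEEK semantics that 84615a relies on can be seen with plain sockets.
The sketch below uses Python's standard socket module rather than QEMU's
QIOChannel API; peek_then_read() is a hypothetical helper.

```python
import socket


def peek_then_read(payload: bytes):
    """Send `payload` over a local socket pair, peek at it, then read it."""
    a, b = socket.socketpair()
    try:
        a.sendall(payload)
        peeked = b.recv(len(payload), socket.MSG_PEEK)  # data stays queued
        read = b.recv(len(payload))                     # same bytes again
        return peeked, read
    finally:
        a.close()
        b.close()
```

For a short payload both calls return the same bytes, which is what lets a
reader inspect the start of a stream without disturbing later reads.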


  Commit: 6720c2b32725e6ac404f22851a0ecd0a71d0cbe2
      
https://github.com/qemu/qemu/commit/6720c2b32725e6ac404f22851a0ecd0a71d0cbe2
  Author: manish.mishra <manish.mishra@nutanix.com>
  Date:   2023-02-06 (Mon, 06 Feb 2023)

  Changed paths:
    M migration/channel.c
    M migration/channel.h
    M migration/migration.c
    M migration/multifd.c
    M migration/multifd.h
    M migration/postcopy-ram.c
    M migration/postcopy-ram.h

  Log Message:
  -----------
  migration: check magic value for deciding the mapping of channels

Current logic assumes that channel connections on the destination side are
always established in the same order as on the source, and that the first one
will always be the main channel, followed by the multifd or post-copy
preemption channels. This is not always true: even if a channel has a
connection established on the source side, it can still be in the pending
state on the destination side, and a newer connection can be established
first. This causes out-of-order mapping of channels on the destination side.

Currently, all channels except post-copy preempt send a magic number; this
patch uses that magic number to decide the type of channel. This logic is
applicable only for precopy (multifd) live migration because, as mentioned,
the post-copy preempt channel does not send any magic number. Also, TLS live
migrations already do the TLS handshake before creating other channels, so
this issue is not possible with TLS; hence this logic is avoided for TLS
live migrations. This patch uses a read peek to check the magic number of
channels so that current data/control stream management remains
unaffected.

Reviewed-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Daniel P. Berrange <berrange@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Suggested-by: Daniel P. Berrange <berrange@redhat.com>
Signed-off-by: manish.mishra <manish.mishra@nutanix.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
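
The channel classification described in 6720c2 can be sketched as a peek at
the first four bytes of each incoming connection. This uses Python sockets,
not QEMU's channel code. The two magic values and the big-endian encoding
are assumptions (they are not stated in the commit message), and
classify_channel() is a hypothetical helper.

```python
import socket
import struct

# Assumed constants: believed to correspond to QEMU's main-stream and
# multifd magic numbers, but not given in the commit message.
MAIN_MAGIC = 0x5145564D
MULTIFD_MAGIC = 0x11223344


def classify_channel(sock):
    """Peek at the first 4 bytes and name the channel type without consuming them."""
    head = sock.recv(4, socket.MSG_PEEK)
    if len(head) < 4:
        return "unknown"
    (magic,) = struct.unpack(">I", head)  # big-endian is an assumption
    if magic == MAIN_MAGIC:
        return "main"
    if magic == MULTIFD_MAGIC:
        return "multifd"
    return "unknown"
```

Because the bytes are only peeked, whichever code later owns the channel
still reads the magic number as the first data in the stream.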


  Commit: ddbe628c97c3a2d211c6d96383cb4063ac3ad0f9
      
https://github.com/qemu/qemu/commit/ddbe628c97c3a2d211c6d96383cb4063ac3ad0f9
  Author: Zhenzhong Duan <zhenzhong.duan@intel.com>
  Date:   2023-02-06 (Mon, 06 Feb 2023)

  Changed paths:
    M migration/multifd.c

  Log Message:
  -----------
  multifd: Fix a race on reading MultiFDPages_t.block

In multifd_queue_page(), MultiFDPages_t.block is checked twice.
Between the two checks, MultiFDPages_t.block may be reset to NULL
by the multifd thread. This leads to the 2nd check always being true,
and then a redundant page is submitted to the multifd thread again.

Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>


  Commit: ebfc57871506b3fe36cc41f69ee3ad31a34afd63
      
https://github.com/qemu/qemu/commit/ebfc57871506b3fe36cc41f69ee3ad31a34afd63
  Author: Zhenzhong Duan <zhenzhong.duan@intel.com>
  Date:   2023-02-06 (Mon, 06 Feb 2023)

  Changed paths:
    A configs/devices/x86_64-softmmu/x86_64-quintela-devices.mak
    A configs/devices/x86_64-softmmu/x86_64-quintela2-devices.mak
    M migration/multifd.c
    A migration/multifd.c.orig

  Log Message:
  -----------
  multifd: Fix flush of zero copy page send request

Make the IO channel flush call after the inflight request has been drained
in the multifd thread, or else we may fail to flush the inflight request.

Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>


  Commit: 671326201dac8fe91222ba0045709f04a8ec3af4
      
https://github.com/qemu/qemu/commit/671326201dac8fe91222ba0045709f04a8ec3af4
  Author: Jiang Jiacheng <jiangjiacheng@huawei.com>
  Date:   2023-02-06 (Mon, 06 Feb 2023)

  Changed paths:
    M migration/meson.build
    A migration/threadinfo.c
    A migration/threadinfo.h
    M qapi/migration.json

  Log Message:
  -----------
  migration: Introduce interface query-migrationthreads

Introduce interface query-migrationthreads. The interface is used
to query information about migration threads and returns each
migration thread's name and ID.
Introduce threadinfo.c to manage migration threads.

Signed-off-by: Jiang Jiacheng <jiangjiacheng@huawei.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>


  Commit: 1b1f4ab69c41279a45ccd0d3178e83471e6e4ec1
      
https://github.com/qemu/qemu/commit/1b1f4ab69c41279a45ccd0d3178e83471e6e4ec1
  Author: Jiang Jiacheng <jiangjiacheng@huawei.com>
  Date:   2023-02-06 (Mon, 06 Feb 2023)

  Changed paths:
    M migration/migration.c
    M migration/multifd.c

  Log Message:
  -----------
  migration: save/delete migration thread info

To support querying migration thread information, save and delete
thread (live_migration and multifdsend) information at thread
creation and finish.
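
The save-at-creation, delete-at-finish pattern can be sketched as a
small registry of name/ID pairs. This is an illustration only, with
hypothetical names throughout; it does not reproduce threadinfo.c, and
it has no locking, which a registry shared between threads would need.

```c
#include <stdlib.h>
#include <string.h>

struct sketch_thread_info {
    char *name;                         /* e.g. "live_migration" */
    int thread_id;
    struct sketch_thread_info *next;
};

static struct sketch_thread_info *sketch_threads;   /* head of the list */

/* Record a thread when it is created; returns NULL on allocation failure. */
struct sketch_thread_info *sketch_thread_add(const char *name, int thread_id)
{
    struct sketch_thread_info *t = malloc(sizeof(*t));

    if (!t) {
        return NULL;
    }
    t->name = malloc(strlen(name) + 1);
    if (!t->name) {
        free(t);
        return NULL;
    }
    strcpy(t->name, name);
    t->thread_id = thread_id;
    t->next = sketch_threads;
    sketch_threads = t;
    return t;
}

/* Drop a thread's record when it finishes. */
void sketch_thread_del(struct sketch_thread_info *t)
{
    struct sketch_thread_info **pp = &sketch_threads;

    while (*pp && *pp != t) {
        pp = &(*pp)->next;
    }
    if (*pp) {
        *pp = t->next;
        free(t->name);
        free(t);
    }
}

/* What a query would walk: look up one record by thread ID. */
const struct sketch_thread_info *sketch_thread_find(int thread_id)
{
    const struct sketch_thread_info *t;

    for (t = sketch_threads; t; t = t->next) {
        if (t->thread_id == thread_id) {
            return t;
        }
    }
    return NULL;
}

/* Number of threads currently recorded. */
size_t sketch_thread_count(void)
{
    size_t n = 0;
    const struct sketch_thread_info *t;

    for (t = sketch_threads; t; t = t->next) {
        n++;
    }
    return n;
}
```

A thread would call sketch_thread_add() as it starts and
sketch_thread_del() just before it exits, so a query only ever reports
threads that are still running.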

Signed-off-by: Jiang Jiacheng <jiangjiacheng@huawei.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>


  Commit: b86307ecef9222c335ebd0ed4da2b243e86f779e
      
https://github.com/qemu/qemu/commit/b86307ecef9222c335ebd0ed4da2b243e86f779e
  Author: Peter Maydell <peter.maydell@linaro.org>
  Date:   2023-02-07 (Tue, 07 Feb 2023)

  Changed paths:
    M chardev/char-socket.c
    A configs/devices/x86_64-softmmu/x86_64-quintela-devices.mak
    A configs/devices/x86_64-softmmu/x86_64-quintela2-devices.mak
    M docs/devel/migration.rst
    M docs/devel/vfio-migration.rst
    M hw/core/machine.c
    M hw/s390x/s390-stattrib.c
    M hw/vfio/migration.c
    M hw/vfio/trace-events
    M hw/virtio/virtio-mem.c
    M include/hw/virtio/virtio-mem.h
    M include/io/channel.h
    M include/migration/misc.h
    M include/migration/register.h
    M include/migration/vmstate.h
    M include/qemu/userfaultfd.h
    M io/channel-buffer.c
    M io/channel-command.c
    M io/channel-file.c
    M io/channel-null.c
    M io/channel-socket.c
    M io/channel-tls.c
    M io/channel-websock.c
    M io/channel.c
    M migration/block-dirty-bitmap.c
    M migration/block.c
    M migration/channel-block.c
    M migration/channel.c
    M migration/channel.h
    M migration/dirtyrate.c
    M migration/meson.build
    M migration/migration.c
    M migration/migration.h
    M migration/multifd.c
    A migration/multifd.c.orig
    M migration/multifd.h
    M migration/postcopy-ram.c
    M migration/postcopy-ram.h
    M migration/ram.c
    M migration/rdma.c
    M migration/savevm.c
    M migration/savevm.h
    A migration/threadinfo.c
    A migration/threadinfo.h
    M migration/trace-events
    M migration/vmstate.c
    M qapi/migration.json
    M scsi/qemu-pr-helper.c
    M tests/qtest/migration-test.c
    M tests/qtest/tpm-emu.c
    M tests/unit/test-io-channel-socket.c
    M util/userfaultfd.c
    M util/vhost-user-server.c

  Log Message:
  -----------
  Merge tag 'migration-20230206-pull-request' of https://gitlab.com/juan.quintela/qemu into staging

Migration Pull request

In this try
- rebase to latest upstream
- same as previous patch
- fix compilation on non linux (userfaultfd.h) (me)
- query-migrationthreads (jiang)
- fix race on reading MultiFDPages_t.block (zhenzhong)
- fix flush of zero copy page send request (zhenzhong)

Please apply.

Previous try:
It includes:
- David Hildenbrand fixes for virtio-mem
- David Gilbert canary to detect problems
- Fix for rdma return values (Fiona)
- Peter Xu uffd_open fixes
- Peter Xu show right downtime for postcopy
- manish.mishra msg fix fixes
- my vfio changes.

Please apply.

# -----BEGIN PGP SIGNATURE-----
#
# iQIzBAABCAAdFiEEGJn/jt6/WMzuA0uC9IfvGFhy1yMFAmPhobYACgkQ9IfvGFhy
# 1yMNaA/9EHDPqrI1HL/VkJG4nNOOsQR7RbburXEberZOzvLjnqpjUD3Ls9qV6rx+
# ieHa5T4imYJFk72Wa5vx4r1/dCjtJD2W6jg5+/0nTvYAHrs1U1VRqpuTr0HiXdbJ
# ZLLCnW5eDyO3eMaOX0MUkgHgL0FNkc/Lq5ViCTFsMu9O9xMuDLLdAC3cdvslKuOu
# X1gKByr9jT817Y9e36amYmRaJKC6Cr/PIekNVFu12HBW79pPusLX8KWEf4RBw4HR
# sPwTvMCR/BwZ0+2Lppan60G5rt/ZxDu40oU7y+RHlfWqevl4hDM84/nhjMvEgzc5
# a4Ahe2ERGLwwnC8z3l7v9+pEzSGzDoPcnRGvZcpUpk68wTDtxd5Bdq8CwmNUfL07
# VzWcYpH0yvmwjBba9jfn9fAVgnG5rVp558XcYLIII3wEToty3UDtm43wSdj2CGr6
# cu+IPAp+n/I5G9SRYBTU9ozJz45ttnEe0hxUtZ4I3MuhzHi1VEDAqTWM/X0LyS41
# TB3Y5B2KKpJYbPyZEH4nyTeetR2k7alTFzahCgKqVfOgL0nJx54petjS1K+B1P72
# g6lhP9WnQ33W+M8S7J/aGEaDJd1lFyFB2Rdjn2ZZnASH/fR9j0mFmXWvulXtjFNp
# Sfim3887+Iv4Uzw4VWEe3mM5Ypi/Ba2CmuTjy/pM08Ey8X1Qs5o=
# =ZQbR
# -----END PGP SIGNATURE-----
# gpg: Signature made Tue 07 Feb 2023 00:56:22 GMT
# gpg:                using RSA key 1899FF8EDEBF58CCEE034B82F487EF185872D723
# gpg: Good signature from "Juan Quintela <quintela@redhat.com>" [full]
# gpg:                 aka "Juan Quintela <quintela@trasno.org>" [full]
# Primary key fingerprint: 1899 FF8E DEBF 58CC EE03  4B82 F487 EF18 5872 D723

* tag 'migration-20230206-pull-request' of https://gitlab.com/juan.quintela/qemu: (30 commits)
  migration: save/delete migration thread info
  migration: Introduce interface query-migrationthreads
  multifd: Fix flush of zero copy page send request
  multifd: Fix a race on reading MultiFDPages_t.block
  migration: check magic value for deciding the mapping of channels
  io: Add support for MSG_PEEK for socket channel
  migration/dirtyrate: Show sample pages only in page-sampling mode
  migration: Perform vmsd structure check during tests
  migration: Add canary to VMSTATE_END_OF_LIST
  migration/rdma: fix return value for qio_channel_rdma_{readv,writev}
  migration: Show downtime during postcopy phase
  virtio-mem: Proper support for preallocation with migration
  virtio-mem: Migrate immutable properties early
  virtio-mem: Fail if a memory backend with "prealloc=on" is specified
  migration/ram: Factor out check for advised postcopy
  migration/vmstate: Introduce VMSTATE_WITH_TMP_TEST() and VMSTATE_BITMAP_TEST()
  migration/savevm: Allow immutable device state to be migrated early (i.e., before RAM)
  migration/savevm: Prepare vmdesc json writer in qemu_savevm_state_setup()
  migration/savevm: Move more savevm handling into vmstate_save()
  migration/ram: Optimize ram_write_tracking_start() for RamDiscardManager
  ...

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>


Compare: https://github.com/qemu/qemu/compare/285ee77f5b58...b86307ecef92


