[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Re: [PATCH] migration: ram block cpr blockers
From: |
Peter Xu |
Subject: |
Re: [PATCH] migration: ram block cpr blockers |
Date: |
Thu, 30 Jan 2025 12:01:04 -0500 |
On Wed, Jan 29, 2025 at 01:20:13PM -0500, Steven Sistare wrote:
> On 1/17/2025 6:57 PM, Peter Xu wrote:
> > On Fri, Jan 17, 2025 at 02:10:14PM -0500, Steven Sistare wrote:
> > > On 1/17/2025 1:16 PM, Peter Xu wrote:
> > > > On Fri, Jan 17, 2025 at 09:46:11AM -0800, Steve Sistare wrote:
> > > > > +/*
> > > > > + * Return true if ram contents would be lost during CPR.
> > > > > + * Return false for ram_device because it is remapped in new QEMU.
> > > > > Do not
> > > > > + * exclude rom, even though it is readonly, because the rom file
> > > > > could change
> > > > > + * in new QEMU. Return false for non-migratable blocks. They are
> > > > > either
> > > > > + * re-created in new QEMU, or are handled specially, or are covered
> > > > > by a
> > > > > + * device-level CPR blocker. Return false for an fd, because it is
> > > > > visible and
> > > > > + * can be remapped in new QEMU.
> > > > > + */
> > > > > +static bool ram_is_volatile(RAMBlock *rb)
> > > > > +{
> > > > > + MemoryRegion *mr = rb->mr;
> > > > > +
> > > > > + return mr &&
> > > > > + memory_region_is_ram(mr) &&
> > > > > + !memory_region_is_ram_device(mr) &&
> > > > > + (!qemu_ram_is_shared(rb) || !qemu_ram_is_named_file(rb)) &&
> > > > > + qemu_ram_is_migratable(rb) &&
> > > > > + rb->fd < 0;
> > > > > +}
> > > >
> > > > Blocking guest_memfd looks ok, but comparing to add one more block
> > > > notifier, can we check all ramblocks once in migrate_prepare(), and fail
> > > > that command directly if it fails the check?
> > >
> > > In an upcoming patch, I will be adding an option analogous to
> > > only-migratable which
> > > prevents QEMU from starting if anything would block cpr-transfer. That
> > > option
> > > will be checked when blockers are added, like for only-migratable.
> > > migrate_prepare
> > > is too late.
> > >
> > > > OTOH, is there any simpler way to simplify the check conditions? It'll
> > > > be
> > > > at least nice to break these checks into smaller if conditions for
> > > > readability..
> > >
> > > I thought the function header comments made it clear, but I could move
> > > each
> > > comment next to each condition:
> > >
> > > ...
> > > /*
> > > * Return false for an fd, because it is visible and can be remapped
> > > in
> > > * new QEMU.
> > > */
> > > if (rb->fd >= 0) {
> > > return false;
> > > }
> > > ...
> > >
> > > > I wonder if we could stick with looping over all ramblocks, then make
> > > > sure
> > > > each of them is on the cpr saved fd list. It may need to make
> > > > cpr_save_fd() always register with the name of ramblock to do such
> > > > lookup,
> > > > or maybe we could also cache the ramblock pointer in CprFd, then the
> > > > lookup
> > > > will be a pointer match check.
> > >
> > > Some ramblocks are not on the list, such as named files. Plus looping in
> > > migrate_prepare is too late as noted above.
> > >
> > > IMO what I have already implemented using blockers is clean and elegant.
> >
> > OK if we need to fail it early at boot, then yes blockers are probably
> > better.
> >
> > We'll need one more cmdline parameter. I've no objection, but I don't know
> > how to judge when it's ok to add, when it's better not.. I'll leave others
> > to comment on this.
> >
> > But still, could we check it when ramblocks are created? So in that way
> > whatever is forbidden is clear in its own path, I feel like that could be
> > clearer (like what you did with gmemfd).
>
> When the ramblock is created, we don't yet know if it is migratable. A
> ramblock that is not migratable does not block cpr. Migratable is not known
> until vmstate_register_ram calls qemu_ram_set_migratable. Hence that is
> where I evaluate conditions and install a blocker.
>
> Because that is the only place where ram_block_add_cpr_blocker is called,
> the test qemu_ram_is_migratable() inside ram_block_add_cpr_blocker is
> redundant, and I should delete it.
Hmm.. sounds reasonable.
>
> > For example, if I start to convert some of your requirements above, then
> > memory_region_is_ram_device() implies RAM_PREALLOC. Actually, ram_device
> > is not the only RAM_PREALLOC user.. Say, would it also not work with all
> > memory_region_init_ram_ptr() users (even if they're not ram_device)? An
> > example is, looks like virtio-gpu can create random ramblocks on the fly
> > with prealloced buffers. I am not sure whether they can be pinned by VFIO
> > too. You may know better.
>
> That memory is not visible to the guest. It is not part of system_memory,
> and is not marked migratable.
I _think_ that can still be visible at least for the virtio-gpu use case,
which hangs under VirtIOGPUBase.hostmem. Relevant code for reference:
virtio_gpu_virgl_map_resource_blob:
memory_region_init_ram_ptr(mr, OBJECT(mr), "blob", size, data);
memory_region_add_subregion(&b->hostmem, offset, mr);
memory_region_set_enabled(mr, true);
virtio_gpu_pci_base_realize:
memory_region_init(&g->hostmem, OBJECT(g), "virtio-gpu-hostmem",
g->conf.hostmem);
pci_register_bar(&vpci_dev->pci_dev, 4,
PCI_BASE_ADDRESS_SPACE_MEMORY |
PCI_BASE_ADDRESS_MEM_PREFETCH |
PCI_BASE_ADDRESS_MEM_TYPE_64,
&g->hostmem);
pci_update_mappings:
memory_region_add_subregion_overlap(r->address_space,
r->addr, r->memory, 1);
but indeed I'm not sure how it work with migration so far, because it
doesn't have RAM_MIGRATABLE set. So at least cpr didn't make it more
special. I assume this isn't something we must figure out as of now
then.. but if you do, please kindly share.
Thanks,
--
Peter Xu