qemu-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH] migration: ram block cpr blockers


From: Peter Xu
Subject: Re: [PATCH] migration: ram block cpr blockers
Date: Thu, 30 Jan 2025 12:01:04 -0500

On Wed, Jan 29, 2025 at 01:20:13PM -0500, Steven Sistare wrote:
> On 1/17/2025 6:57 PM, Peter Xu wrote:
> > On Fri, Jan 17, 2025 at 02:10:14PM -0500, Steven Sistare wrote:
> > > On 1/17/2025 1:16 PM, Peter Xu wrote:
> > > > On Fri, Jan 17, 2025 at 09:46:11AM -0800, Steve Sistare wrote:
> > > > > +/*
> > > > > + * Return true if ram contents would be lost during CPR.
> > > > > + * Return false for ram_device because it is remapped in new QEMU.  
> > > > > Do not
> > > > > + * exclude rom, even though it is readonly, because the rom file 
> > > > > could change
> > > > > + * in new QEMU.  Return false for non-migratable blocks.  They are 
> > > > > either
> > > > > + * re-created in new QEMU, or are handled specially, or are covered 
> > > > > by a
> > > > > + * device-level CPR blocker.  Return false for an fd, because it is 
> > > > > visible and
> > > > > + * can be remapped in new QEMU.
> > > > > + */
> > > > > +static bool ram_is_volatile(RAMBlock *rb)
> > > > > +{
> > > > > +    MemoryRegion *mr = rb->mr;
> > > > > +
> > > > > +    return mr &&
> > > > > +        memory_region_is_ram(mr) &&
> > > > > +        !memory_region_is_ram_device(mr) &&
> > > > > +        (!qemu_ram_is_shared(rb) || !qemu_ram_is_named_file(rb)) &&
> > > > > +        qemu_ram_is_migratable(rb) &&
> > > > > +        rb->fd < 0;
> > > > > +}
> > > > 
> > > > Blocking guest_memfd looks ok, but comparing to add one more block
> > > > notifier, can we check all ramblocks once in migrate_prepare(), and fail
> > > > that command directly if it fails the check?
> > > 
> > > In an upcoming patch, I will be adding an option analogous to 
> > > only-migratable which
> > > prevents QEMU from starting if anything would block cpr-transfer.  That 
> > > option
> > > will be checked when blockers are added, like for only-migratable. 
> > > migrate_prepare
> > > is too late.
> > > 
> > > > OTOH, is there any simpler way to simplify the check conditions?  It'll 
> > > > be
> > > > at least nice to break these checks into smaller if conditions for
> > > > readability..
> > > 
> > > I thought the function header comments made it clear, but I could move 
> > > each
> > > comment next to each condition:
> > > 
> > >      ...
> > >      /*
> > >       * Return false for an fd, because it is visible and can be remapped 
> > > in
> > >       * new QEMU.
> > >       */
> > >      if (rb->fd >= 0) {
> > >          return false;
> > >      }
> > >      ...
> > > 
> > > > I wonder if we could stick with looping over all ramblocks, then make 
> > > > sure
> > > > each of them is on the cpr saved fd list.  It may need to make
> > > > cpr_save_fd() always register with the name of ramblock to do such 
> > > > lookup,
> > > > or maybe we could also cache the ramblock pointer in CprFd, then the 
> > > > lookup
> > > > will be a pointer match check.
> > > 
> > > Some ramblocks are not on the list, such as named files.  Plus looping in
> > > migrate_prepare is too late as noted above.
> > > 
> > > IMO what I have already implemented using blockers is clean and elegant.
> > 
> > OK if we need to fail it early at boot, then yes blockers are probably
> > better.
> > 
> > We'll need one more cmdline parameter. I've no objection, but I don't know
> > how to judge when it's ok to add, when it's better not.. I'll leave others
> > to comment on this.
> > 
> > But still, could we check it when ramblocks are created?  So in that way
> > whatever is forbidden is clear in its own path, I feel like that could be
> > clearer (like what you did with gmemfd).
> 
> When the ramblock is created, we don't yet know if it is migratable. A
> ramblock that is not migratable does not block cpr. Migratable is not known
> until vmstate_register_ram calls qemu_ram_set_migratable.  Hence that is
> where I evaluate conditions and install a blocker.
> 
> Because that is the only place where ram_block_add_cpr_blocker is called,
> the test qemu_ram_is_migratable() inside ram_block_add_cpr_blocker is
> redundant, and I should delete it.

Hmm.. sounds reasonable.

> 
> > For example, if I start to convert some of your requirements above, then
> > memory_region_is_ram_device() implies RAM_PREALLOC.  Actually, ram_device
> > is not the only RAM_PREALLOC user..  Say, would it also not work with all
> > memory_region_init_ram_ptr() users (even if they're not ram_device)?  An
> > example is, looks like virtio-gpu can create random ramblocks on the fly
> > with prealloced buffers.  I am not sure whether they can be pinned by VFIO
> > too.  You may know better.
> 
> That memory is not visible to the guest.  It is not part of system_memory,
> and is not marked migratable.

I _think_ that can still be visible at least for the virtio-gpu use case,
which hangs under VirtIOGPUBase.hostmem.  Relevant code for reference:

virtio_gpu_virgl_map_resource_blob:
    memory_region_init_ram_ptr(mr, OBJECT(mr), "blob", size, data);
    memory_region_add_subregion(&b->hostmem, offset, mr);
    memory_region_set_enabled(mr, true);

virtio_gpu_pci_base_realize:
        memory_region_init(&g->hostmem, OBJECT(g), "virtio-gpu-hostmem",
                           g->conf.hostmem);
        pci_register_bar(&vpci_dev->pci_dev, 4,
                         PCI_BASE_ADDRESS_SPACE_MEMORY |
                         PCI_BASE_ADDRESS_MEM_PREFETCH |
                         PCI_BASE_ADDRESS_MEM_TYPE_64,
                         &g->hostmem);

pci_update_mappings:
            memory_region_add_subregion_overlap(r->address_space,
                                                r->addr, r->memory, 1);

but indeed I'm not sure how it work with migration so far, because it
doesn't have RAM_MIGRATABLE set. So at least cpr didn't make it more
special.  I assume this isn't something we must figure out as of now
then.. but if you do, please kindly share.

Thanks,

-- 
Peter Xu




reply via email to

[Prev in Thread] Current Thread [Next in Thread]