|
From: | Alexey Kardashevskiy |
Subject: | Re: [PATCH 5/7] memory: Register the RamDiscardManager instance upon guest_memfd creation |
Date: | Fri, 24 Jan 2025 14:27:40 +1100 |
User-agent: | Mozilla Thunderbird Beta |
On 21/1/25 00:06, David Hildenbrand wrote:
On 10.01.25 06:13, Chenyi Qiang wrote:On 1/9/2025 5:32 PM, Alexey Kardashevskiy wrote:On 9/1/25 16:34, Chenyi Qiang wrote:On 1/8/2025 12:47 PM, Alexey Kardashevskiy wrote:On 13/12/24 18:08, Chenyi Qiang wrote:Introduce the realize()/unrealize() callbacks to initialize/ uninitialize the new guest_memfd_manager object and register/unregister it in the target MemoryRegion. Guest_memfd was initially set to shared until the commit bd3bcf6962("kvm/memory: Make memory type private by default if it has guest memfdbackend"). To align with this change, the default state in guest_memfd_manager is set to private. (The bitmap is cleared to 0). Additionally, setting the default to private can also reduce the overhead of mapping shared pages into IOMMU by VFIO during the bootup stage. Signed-off-by: Chenyi Qiang <chenyi.qiang@intel.com> --- include/sysemu/guest-memfd-manager.h | 27 +++++++++++++++++++++++ ++++ system/guest-memfd-manager.c | 28 +++++++++++++++++++++++ ++++- system/physmem.c | 7 +++++++ 3 files changed, 61 insertions(+), 1 deletion(-) diff --git a/include/sysemu/guest-memfd-manager.h b/include/sysemu/ guest-memfd-manager.h index 9dc4e0346d..d1e7f698e8 100644 --- a/include/sysemu/guest-memfd-manager.h +++ b/include/sysemu/guest-memfd-manager.h @@ -42,6 +42,8 @@ struct GuestMemfdManager { struct GuestMemfdManagerClass { ObjectClass parent_class; + void (*realize)(GuestMemfdManager *gmm, MemoryRegion *mr, uint64_t region_size); + void (*unrealize)(GuestMemfdManager *gmm); int (*state_change)(GuestMemfdManager *gmm, uint64_t offset, uint64_t size, bool shared_to_private); }; @@ -61,4 +63,29 @@ static inline int guest_memfd_manager_state_change(GuestMemfdManager *gmm, uint6 return 0; } +static inline void guest_memfd_manager_realize(GuestMemfdManager *gmm, + MemoryRegion *mr, uint64_t region_size) +{ + GuestMemfdManagerClass *klass; + + g_assert(gmm); + klass = GUEST_MEMFD_MANAGER_GET_CLASS(gmm); + + if (klass->realize) { + klass->realize(gmm, mr, region_size);Ditch realize() hook and call guest_memfd_manager_realizefn() directly?Not clear why these new hooks are needed.+ } +} + +static inline void guest_memfd_manager_unrealize(GuestMemfdManager *gmm) +{ + GuestMemfdManagerClass *klass; + + g_assert(gmm); + klass = GUEST_MEMFD_MANAGER_GET_CLASS(gmm); + + if (klass->unrealize) { + klass->unrealize(gmm); + } +}guest_memfd_manager_unrealizefn()?Agree. Adding these wrappers seem unnecessary.+ #endif diff --git a/system/guest-memfd-manager.c b/system/guest-memfd- manager.c index 6601df5f3f..b6a32f0bfb 100644 --- a/system/guest-memfd-manager.c +++ b/system/guest-memfd-manager.c @@ -366,6 +366,31 @@ static int guest_memfd_state_change(GuestMemfdManager *gmm, uint64_t offset, return ret; }+static void guest_memfd_manager_realizefn(GuestMemfdManager *gmm,MemoryRegion *mr, + uint64_t region_size) +{ + uint64_t bitmap_size; + + gmm->block_size = qemu_real_host_page_size(); + bitmap_size = ROUND_UP(region_size, gmm->block_size) / gmm-block_size;imho unaligned region_size should be an assert.There's no guarantee the region_size of the MemoryRegion is PAGE_SIZE aligned. So the ROUND_UP() is more appropriate.It is all about DMA so the smallest you can map is PAGE_SIZE so even if you round up here, it is likely going to fail to DMA-map later anyway (or not?).Checked the handling of VFIO, if the size is less than PAGE_SIZE, it will just return and won't do DMA-map. Here is a different thing. It tries to calculate the bitmap_size. The bitmap is used to track the private/shared status of the page. So if the size is less than PAGE_SIZE, we still use the one bit to track this small-size range.+ + gmm->mr = mr; + gmm->bitmap_size = bitmap_size; + gmm->bitmap = bitmap_new(bitmap_size); + + memory_region_set_ram_discard_manager(gmm->mr, RAM_DISCARD_MANAGER(gmm)); +}This belongs to 2/7.+ +static void guest_memfd_manager_unrealizefn(GuestMemfdManager *gmm) +{ + memory_region_set_ram_discard_manager(gmm->mr, NULL); + + g_free(gmm->bitmap); + gmm->bitmap = NULL; + gmm->bitmap_size = 0; + gmm->mr = NULL;@gmm is being destroyed here, why bother zeroing?OK, will remove it.+} +This function belongs to 2/7.Will move both realizefn() and unrealizefn().Yes.static void guest_memfd_manager_init(Object *obj) { GuestMemfdManager *gmm = GUEST_MEMFD_MANAGER(obj); @@ -375,7 +400,6 @@ static void guest_memfd_manager_init(Object *obj) static void guest_memfd_manager_finalize(Object *obj) { - g_free(GUEST_MEMFD_MANAGER(obj)->bitmap); }static void guest_memfd_manager_class_init(ObjectClass *oc, void*data) @@ -384,6 +408,8 @@ static void guest_memfd_manager_class_init(ObjectClass *oc, void *data) RamDiscardManagerClass *rdmc = RAM_DISCARD_MANAGER_CLASS(oc); gmmc->state_change = guest_memfd_state_change; + gmmc->realize = guest_memfd_manager_realizefn; + gmmc->unrealize = guest_memfd_manager_unrealizefn; rdmc->get_min_granularity = guest_memfd_rdm_get_min_granularity; rdmc->register_listener = guest_memfd_rdm_register_listener; diff --git a/system/physmem.c b/system/physmem.c index dc1db3a384..532182a6dd 100644 --- a/system/physmem.c +++ b/system/physmem.c @@ -53,6 +53,7 @@ #include "sysemu/hostmem.h" #include "sysemu/hw_accel.h" #include "sysemu/xen-mapcache.h" +#include "sysemu/guest-memfd-manager.h" #include "trace.h" #ifdef CONFIG_FALLOCATE_PUNCH_HOLE @@ -1885,6 +1886,9 @@ static void ram_block_add(RAMBlock *new_block, Error **errp) qemu_mutex_unlock_ramlist(); goto out_free; } + + GuestMemfdManager *gmm = GUEST_MEMFD_MANAGER(object_new(TYPE_GUEST_MEMFD_MANAGER)); + guest_memfd_manager_realize(gmm, new_block->mr, new_block-mr->size);Wow. Quite invasive.Yeah... It creates a manager object no matter whether the user wants tous e shared passthru or not. We assume some fields like private/shared bitmap may also be helpful in other scenario for future usage, and if nopassthru device, the listener would just return, so it is acceptable.Explain these other scenarios in the commit log please as otherwise making this an interface of HostMemoryBackendMemfd looks way cleaner. Thanks,Thanks for the suggestion. Until now, I think making this an interface of HostMemoryBackend is cleaner. The potential future usage for non-HostMemoryBackend guest_memfd-backed memory region I can think of is the the TEE I/O for iommufd P2P support? when it tries to initialize RAM device memory region with the attribute of shared/private. But I think it would be a long term story and we are not sure what it will be like in future.As raised in #2, I'm don't think this belongs into HostMemoryBackend. It kind-of belongs to the RAMBlock, but we could have another object (similar to virtio-mem currently managing a single HostMemoryBackend->RAMBlock) that takes care of that for multiple memory backends.
The vBIOS thingy confused me and then I confused others :) There are 2 things: 1) an interface or new subclass of HostMemoryBackendClass which we need to advertise and implement ability to discard pages; 2) RamDiscardManagerClass which is MR/Ramblock and does not really belong to HostMemoryBackend (as it is in what was posted ages ago).
I suggest Chenyi post a new version using the current approach with the comments and commitlogs fixed. Makes sense? Thanks,
-- Alexey
[Prev in Thread] | Current Thread | [Next in Thread] |