Re: [PATCH 2/7] guest_memfd: Introduce an object to manage the guest-memfd with RamDiscardManager

From: Chenyi Qiang
Subject: Re: [PATCH 2/7] guest_memfd: Introduce an object to manage the guest-memfd with RamDiscardManager
Date: Wed, 15 Jan 2025 14:15:19 +0800
User-agent: Mozilla Thunderbird
On 1/15/2025 12:06 PM, Alexey Kardashevskiy wrote:
> On 10/1/25 17:38, Chenyi Qiang wrote:
>>
>>
>> On 1/10/2025 8:58 AM, Alexey Kardashevskiy wrote:
>>>
>>>
>>> On 9/1/25 15:29, Chenyi Qiang wrote:
>>>>
>>>>
>>>> On 1/9/2025 10:55 AM, Alexey Kardashevskiy wrote:
>>>>>
>>>>>
>>>>> On 9/1/25 13:11, Chenyi Qiang wrote:
>>>>>>
>>>>>>
>>>>>> On 1/8/2025 7:20 PM, Alexey Kardashevskiy wrote:
>>>>>>>
>>>>>>>
>>>>>>> On 8/1/25 21:56, Chenyi Qiang wrote:
>>>>>>>>
>>>>>>>>
>>>>>>>> On 1/8/2025 12:48 PM, Alexey Kardashevskiy wrote:
>>>>>>>>> On 13/12/24 18:08, Chenyi Qiang wrote:
>>>>>>>>>> As the commit 852f0048f3 ("RAMBlock: make guest_memfd require
>>>>>>>>>> uncoordinated discard") highlighted, some subsystems like VFIO
>>>>>>>>>> might disable ram block discard. However, guest_memfd relies on
>>>>>>>>>> the discard operation to perform page conversion between private
>>>>>>>>>> and shared memory. This can lead to a stale IOMMU mapping issue
>>>>>>>>>> when assigning a hardware device to a confidential VM via shared
>>>>>>>>>> memory (unprotected memory pages). Blocking shared page discard
>>>>>>>>>> can solve this problem, but it could cause guests to consume
>>>>>>>>>> twice the memory with VFIO, which is not acceptable in some
>>>>>>>>>> cases. An alternative solution is to notify other systems like
>>>>>>>>>> VFIO to refresh their outdated IOMMU mappings.
>>>>>>>>>>
>>>>>>>>>> RamDiscardManager is an existing concept (used by virtio-mem) to
>>>>>>>>>> adjust VFIO mappings in relation to VM page assignment.
>>>>>>>>>> Effectively, page conversion is similar to hot-removing a page in
>>>>>>>>>> one mode and adding it back in the other, so the same work that
>>>>>>>>>> needs to happen in response to virtio-mem changes needs to happen
>>>>>>>>>> for page conversion events. Introduce the RamDiscardManager to
>>>>>>>>>> guest_memfd to achieve it.
>>>>>>>>>>
>>>>>>>>>> However, guest_memfd is not an object, so it cannot directly
>>>>>>>>>> implement the RamDiscardManager interface.
>>>>>>>>>>
>>>>>>>>>> One solution is to implement the interface in HostMemoryBackend. Any
>>>>>>>>> This sounds about right.
>>>
>>> btw I have been using this for ages:
>>>
>>> https://github.com/aik/qemu/commit/3663f889883d4aebbeb0e4422f7be5e357e2ee46
>>>
>>> but I am not sure if this ever saw the light of day, did it?
>>> (ironically I am using it as a base for encrypted DMA :) )
>>
>> Yeah, we are doing the same work. I saw a solution from Michael a long
>> time ago (when there was still a dedicated hostmem-memfd-private backend
>> for restrictedmem/gmem):
>> https://github.com/AMDESE/qemu/commit/3bf5255fc48d648724d66410485081ace41d8ee6
>>
>> Your patch only implements the interface for HostMemoryBackendMemfd.
>> Maybe it is more appropriate to implement it for the parent object
>> HostMemoryBackend, because besides MEMORY_BACKEND_MEMFD, other backend
>> types like MEMORY_BACKEND_RAM and MEMORY_BACKEND_FILE can also be
>> guest_memfd-backed.
>>
>> Thinking more about where to implement this interface, it is still
>> uncertain to me. As I mentioned in another mail, a ram device memory
>> region might be backed by guest_memfd if we support TEE IO iommufd MMIO
>> in the future. Then a specific object would be more appropriate. What's
>> your opinion?
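
As a rough illustration of the HostMemoryBackend option (a sketch only, not
actual code): the generic backend type would simply declare that it
implements the interface, so every guest_memfd-capable backend inherits it:

/* backends/hostmem.c (sketch): let TYPE_MEMORY_BACKEND advertise the interface */
static const TypeInfo host_memory_backend_info = {
    .name = TYPE_MEMORY_BACKEND,
    .parent = TYPE_OBJECT,
    /* ... existing fields unchanged ... */
    .interfaces = (InterfaceInfo[]) {
        { TYPE_RAM_DISCARD_MANAGER },
        { }
    },
};

The RamDiscardManagerClass callbacks would then be filled in
host_memory_backend_class_init(), following the same pattern virtio-mem uses
for its own type today.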
>
> I do not know about this. Unlike RAM, MMIO can only do "in-place
> conversion", the interface to do so is not straightforward, and VFIO owns
> MMIO anyway, so the uAPI will be in iommufd. Here is a gist of it:
>
> https://github.com/aik/linux/commit/89e45c0404fa5006b2a4de33a4d582adf1ba9831
>
> "guest request" is a communication channel from the VM to the secure FW
> (AMD's "PSP") to make MMIO allow encrypted access.
It is still uncertain how to implement private MMIO. Our assumption is that
private MMIO would also create a memory region with a guest_memfd-like
backend. Its mr->ram is true and it should be managed by RamDiscardManager,
which can skip doing DMA_MAP in VFIO's region_add listener.
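
For reference, this skipping is roughly what VFIO already does when a
MemoryRegion has a RamDiscardManager (e.g. for virtio-mem). A simplified
sketch of the relevant branch in hw/vfio/common.c's
vfio_listener_region_add() (container/field names vary between QEMU
versions, so treat this as illustrative):

static void vfio_listener_region_add(MemoryListener *listener,
                                     MemoryRegionSection *section)
{
    VFIOContainer *container = container_of(listener, VFIOContainer, listener);

    /* ... checks that this is DMA-able RAM, alignment, etc. elided ... */

    if (memory_region_has_ram_discard_manager(section->mr)) {
        /*
         * Skip the up-front vfio_dma_map() of the whole section and
         * register a RamDiscardListener instead, so only populated
         * (shared) parts get mapped and later conversions update the
         * IOMMU mappings.
         */
        vfio_register_ram_discard_listener(container, section);
        return;
    }

    /* ... vfio_dma_map() of the full section for ordinary RAM ... */
}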
>
>
>>>
>>>>>>>>>
>>>>>>>>>> guest_memfd-backed host memory backend can register itself in
>>>>>>>>>> the target MemoryRegion. However, this solution doesn't cover
>>>>>>>>>> the scenario where a guest_memfd MemoryRegion doesn't belong to
>>>>>>>>>> the HostMemoryBackend, e.g. the virtual BIOS MemoryRegion.
>>>>>>>>>
>>>>>>>>> What is this virtual BIOS MemoryRegion exactly? What does it look
>>>>>>>>> like in "info mtree -f"? Do we really want this memory to be
>>>>>>>>> DMAable?
>>>>>>>>
>>>>>>>> The virtual BIOS shows up in a separate region:
>>>>>>>>
>>>>>>>> Root memory region: system
>>>>>>>> 0000000000000000-000000007fffffff (prio 0, ram): pc.ram KVM
>>>>>>>> ...
>>>>>>>> 00000000ffc00000-00000000ffffffff (prio 0, ram): pc.bios KVM
>>>>>>>
>>>>>>> Looks like a normal MR which can be backed by guest_memfd.
>>>>>>
>>>>>> Yes, the virtual BIOS memory region is initialized by
>>>>>> memory_region_init_ram_guest_memfd(), which will be backed by a
>>>>>> guest_memfd.
>>>>>>
>>>>>> The tricky thing is, for Intel TDX (not sure about AMD SEV), the
>>>>>> virtual BIOS image will be loaded and then copied to a private
>>>>>> region. After that, the loaded image will be discarded and this
>>>>>> region becomes useless.
>>>>>
>>>>> I'd think it is loaded as "struct Rom" and then copied to the
>>>>> MR-ram_guest_memfd() which does not leave MR useless - we still see
>>>>> "pc.bios" in the list so it is not discarded. What piece of code are
>>>>> you referring to exactly?
>>>>
>>>> Sorry for the confusion, maybe the vBIOS handling is different between
>>>> TDX and SEV-SNP.
>>>>
>>>> In x86_bios_rom_init(), it initializes a guest_memfd-backed MR and
>>>> loads the vBIOS image into the shared part of the guest_memfd MR. For
>>>> TDX, it will copy the image to a private region (not the private part
>>>> of the vBIOS guest_memfd MR) and discard the shared part. So, although
>>>> the memory region still exists, it seems useless.
>>>> It is different for SEV-SNP, correct? Does SEV-SNP manage the vBIOS in
>>>> the vBIOS guest_memfd private memory?
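
For context, the pc.bios MR ends up guest_memfd-backed along these lines; a
simplified sketch modeled on x86_bios_rom_init() (exact arguments and the
machine check may differ by QEMU version):

MemoryRegion *bios = g_new(MemoryRegion, 1);

if (machine_require_guest_memfd(MACHINE(qdev_get_machine()))) {
    /* creates the pc.bios RAMBlock with an associated guest_memfd */
    memory_region_init_ram_guest_memfd(bios, NULL, "pc.bios",
                                       bios_size, &error_fatal);
} else {
    memory_region_init_ram(bios, NULL, "pc.bios", bios_size, &error_fatal);
}
/* the firmware image is then loaded into the shared part; for TDX the
 * image is afterwards copied into private memory and the shared copy
 * discarded, as described above */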
>>>
>>> This is what it looks like on my SNP VM (which, I suspect, is the same
>>> as yours as hw/i386/pc.c does not distinguish Intel/AMD for this
>>> matter):
>>
>> Yes, the memory region object is created on both TDX and SEV-SNP.
>>
>>>
>>> Root memory region: system
>>> 0000000000000000-00000000000bffff (prio 0, ram): ram1 KVM gmemfd=20
>>> 00000000000c0000-00000000000dffff (prio 1, ram): pc.rom KVM gmemfd=27
>>> 00000000000e0000-000000001fffffff (prio 0, ram): ram1 @00000000000e0000 KVM gmemfd=20
>>> ...
>>> 00000000ffc00000-00000000ffffffff (prio 0, ram): pc.bios KVM gmemfd=26
>>>
>>> So the pc.bios MR exists and is in use (hence its appearance in "info
>>> mtree -f").
>>>
>>>
>>> I added the gmemfd dumping:
>>>
>>> --- a/system/memory.c
>>> +++ b/system/memory.c
>>> @@ -3446,6 +3446,9 @@ static void mtree_print_flatview(gpointer key, gpointer value,
>>>                  }
>>>              }
>>>          }
>>> +        if (mr->ram_block && mr->ram_block->guest_memfd >= 0) {
>>> +            qemu_printf(" gmemfd=%d", mr->ram_block->guest_memfd);
>>> +        }
>>
>> Then I think the virtual BIOS is another case not belonging to
>> HostMemoryBackend, which convinces us to implement the interface in a
>> specific object, no?
>
> TBH I have no idea why pc.rom and pc.bios are separate memory regions,
> but in any case, why do these two areas need to be treated any
> differently than the rest of RAM? Thanks,
I think there is no difference. That's why I suggest implementing the RDM
interface in a specific object, to cover both cases instead of only the
HostMemoryBackend.
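
To make the idea concrete, a minimal skeleton of such an object could look
like the following (the type name, fields and the guest_memfd_rdm_*
callbacks are illustrative placeholders, not the final patch; the wiring
mirrors how virtio-mem implements RamDiscardManagerClass):

#define TYPE_GUEST_MEMFD_MANAGER "guest-memfd-manager"   /* illustrative name */
OBJECT_DECLARE_SIMPLE_TYPE(GuestMemfdManager, GUEST_MEMFD_MANAGER)

struct GuestMemfdManager {
    Object parent_obj;

    /* the guest_memfd-backed MemoryRegion this object manages */
    MemoryRegion *mr;

    /* conversion granularity and bitmap of currently shared blocks */
    uint64_t block_size;
    unsigned long *shared_bitmap;

    /* listeners (e.g. VFIO) to notify on shared <-> private conversion */
    QLIST_HEAD(, RamDiscardListener) rdl_list;
};

static void guest_memfd_manager_class_init(ObjectClass *oc, void *data)
{
    RamDiscardManagerClass *rdmc = RAM_DISCARD_MANAGER_CLASS(oc);

    /* placeholder callbacks that would consult shared_bitmap */
    rdmc->get_min_granularity = guest_memfd_rdm_get_min_granularity;
    rdmc->is_populated = guest_memfd_rdm_is_populated;
    rdmc->replay_populated = guest_memfd_rdm_replay_populated;
    rdmc->replay_discarded = guest_memfd_rdm_replay_discarded;
    rdmc->register_listener = guest_memfd_rdm_register_listener;
    rdmc->unregister_listener = guest_memfd_rdm_unregister_listener;
}

static const TypeInfo guest_memfd_manager_info = {
    .name = TYPE_GUEST_MEMFD_MANAGER,
    .parent = TYPE_OBJECT,
    .instance_size = sizeof(GuestMemfdManager),
    .class_init = guest_memfd_manager_class_init,
    .interfaces = (InterfaceInfo[]) {
        { TYPE_RAM_DISCARD_MANAGER },
        { }
    },
};

Such an object would then be instantiated per guest_memfd-backed RAMBlock
(pc.ram, pc.bios, pc.rom, ...) and attached to the MemoryRegion with
memory_region_set_ram_discard_manager(), so page-conversion events can be
forwarded to the registered listeners.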
>
>