Re: [PATCH v2 6/7] vdpa: move iova_tree allocation to net_vhost_vdpa_init
From: Eugenio Perez Martin
Subject: Re: [PATCH v2 6/7] vdpa: move iova_tree allocation to net_vhost_vdpa_init
Date: Tue, 2 Apr 2024 14:01:31 +0200
On Tue, Apr 2, 2024 at 8:19 AM Si-Wei Liu <si-wei.liu@oracle.com> wrote:
>
>
>
> On 2/14/2024 11:11 AM, Eugenio Perez Martin wrote:
> > On Wed, Feb 14, 2024 at 7:29 PM Si-Wei Liu <si-wei.liu@oracle.com> wrote:
> >> Hi Michael,
> >>
> >> On 2/13/2024 2:22 AM, Michael S. Tsirkin wrote:
> >>> On Mon, Feb 05, 2024 at 05:10:36PM -0800, Si-Wei Liu wrote:
> >>>> Hi Eugenio,
> >>>>
> >>>> I thought this new code looked good to me and the original issue I saw
> >>>> with x-svq=on should be gone. However, after rebasing my tree on top of
> >>>> this, there's a new failure I found around setting up guest mappings at
> >>>> early boot; please see attached the specific QEMU config and the
> >>>> corresponding event traces. I haven't checked into the details yet, but
> >>>> thought you would want to be aware of it ahead of time.
> >>>>
> >>>> Regards,
> >>>> -Siwei
> >>> Eugenio, were you able to reproduce? Siwei, did you have time to
> >>> look into this?
> >> I didn't get a chance to look into the details in the past week, but I
> >> thought it may have something to do with the (internals of the) iova tree
> >> range allocation and the lookup routine. It started to fall apart at the
> >> first vhost_vdpa_dma_unmap call showing up in the trace events, where it
> >> should've gotten IOVA=0x2000001000, but an incorrect IOVA address 0x1000
> >> ended up being returned from the iova tree lookup routine.
> >>
> >> HVA                                GPA                          IOVA
> >> --------------------------------------------------------------------------------------
> >> Map
> >> [0x7f7903e00000, 0x7f7983e00000)   [0x0, 0x80000000)            [0x1000, 0x80000000)
> >> [0x7f7983e00000, 0x7f9903e00000)   [0x100000000, 0x2080000000)  [0x80001000, 0x2000001000)
> >> [0x7f7903ea0000, 0x7f7903ec0000)   [0xfeda0000, 0xfedc0000)     [0x2000001000, 0x2000021000)
> >>
> >> Unmap
> >> [0x7f7903ea0000, 0x7f7903ec0000)   [0xfeda0000, 0xfedc0000)     [0x1000, 0x20000) ???
> >>                                    shouldn't it be [0x2000001000, 0x2000021000) ???
> >>
> It looks like the SVQ iova tree lookup routine vhost_iova_tree_find_iova(),
> which is called from vhost_vdpa_listener_region_del(), can't properly
> deal with overlapped regions. Specifically, q35's mch_realize() has the
> following:
>
> 579     memory_region_init_alias(&mch->open_high_smram, OBJECT(mch), "smram-open-high",
> 580                              mch->ram_memory, MCH_HOST_BRIDGE_SMRAM_C_BASE,
> 581                              MCH_HOST_BRIDGE_SMRAM_C_SIZE);
> 582     memory_region_add_subregion_overlap(mch->system_memory, 0xfeda0000,
> 583                                         &mch->open_high_smram, 1);
> 584     memory_region_set_enabled(&mch->open_high_smram, false);
>
> #0 0x0000564c30bf6980 in iova_tree_find_address_iterator
> (key=0x564c331cf8e0, value=0x564c331cf8e0, data=0x7fffb6d749b0) at
> ../util/iova-tree.c:96
> #1 0x00007f5f66479654 in g_tree_foreach () at /lib64/libglib-2.0.so.0
> #2 0x0000564c30bf6b53 in iova_tree_find_iova (tree=<optimized out>,
> map=map@entry=0x7fffb6d74a00) at ../util/iova-tree.c:114
> #3 0x0000564c309da0a9 in vhost_iova_tree_find_iova (tree=<optimized
> out>, map=map@entry=0x7fffb6d74a00) at ../hw/virtio/vhost-iova-tree.c:70
> #4 0x0000564c3085e49d in vhost_vdpa_listener_region_del
> (listener=0x564c331024c8, section=0x7fffb6d74aa0) at
> ../hw/virtio/vhost-vdpa.c:444
> #5 0x0000564c309f4931 in address_space_update_topology_pass
> (as=as@entry=0x564c31ab1840 <address_space_memory>,
> old_view=old_view@entry=0x564c33364cc0,
> new_view=new_view@entry=0x564c333640f0, adding=adding@entry=false) at
> ../system/memory.c:977
> #6 0x0000564c309f4dcd in address_space_set_flatview (as=0x564c31ab1840
> <address_space_memory>) at ../system/memory.c:1079
> #7 0x0000564c309f86d0 in memory_region_transaction_commit () at
> ../system/memory.c:1132
> #8 0x0000564c309f86d0 in memory_region_transaction_commit () at
> ../system/memory.c:1117
> #9 0x0000564c307cce64 in mch_realize (d=<optimized out>,
> errp=<optimized out>) at ../hw/pci-host/q35.c:584
>
> However, it looks like iova_tree_find_address_iterator() only checks
> whether the translated address (HVA) falls into the range when trying to
> locate the desired IOVA, causing the first DMAMap that happens to overlap
> in the translated address (HVA) space to be returned prematurely:
>
> 89 static gboolean iova_tree_find_address_iterator(gpointer key, gpointer value,
> 90                                                 gpointer data)
> 91 {
>  :
>  :
> 99     if (map->translated_addr + map->size < needle->translated_addr ||
> 100         needle->translated_addr + needle->size < map->translated_addr) {
> 101         return false;
> 102     }
> 103
> 104     args->result = map;
> 105     return true;
> 106 }
>
> In the QEMU trace file, it reveals that the first DMAMap below gets
> returned incorrectly instead of the second, the latter of which is what
> the actual IOVA corresponds to:
>
> HVA                                GPA                        IOVA
> [0x7f7903e00000, 0x7f7983e00000)   [0x0, 0x80000000)          [0x1000, 0x80001000)
> [0x7f7903ea0000, 0x7f7903ec0000)   [0xfeda0000, 0xfedc0000)   [0x2000001000, 0x2000021000)
>
I think the analysis is totally accurate as no code expects to unmap /
map overlapping regions. In particular, vdpa kernel does not expect it
either.
Since it is issued at _realize, it should be OK to unmap the whole
region range and then map the right ranges again, even if that implies
a lot of unpin / pin.
>
> Maybe other than checking the HVA range, we should also match the GPA,
> or at least require the size to match exactly?
>
The safe action here would be to unmap the whole memory chunk and then
map the overlapping memory again? Or am I missing something?
Another thing I don't get: is this reproducible in previous versions?
As far as I understand, this bug was never found before. I guess this
path through q35's mch_realize is recent?
Thanks!
> > Yes, I'm still not able to reproduce. In particular, I don't know how
> > the memory listener adds a region and then releases a region with a
> > different size. I'm talking about these log entries:
> >
> > 1706854838.154394:vhost_vdpa_listener_region_add vdpa: 0x556d45c75140
> > iova 0x0 llend 0x80000000 vaddr: 0x7f7903e00000 read-only: 0
> > 452:vhost_vdpa_listener_region_del vdpa: 0x556d45c75140 iova 0x0 llend
> > 0x7fffffff
> I didn't see a different size here, though if you were referring to the
> discrepancy in the traces around llend, I thought the two values between
> _add() and _del() would have to be interpreted differently due to:
>
> 3d1e4d34 "vhost_vdpa: fix the input in
> trace_vhost_vdpa_listener_region_del()"
>
> Regards,
> -Siwei
> > Is it possible for you to also trace the skipped regions? We should
> > add a debug trace there too...
> >
> > Thanks!
> >
> >> PS, I will be taking off from today and for the next two weeks. Will try
> >> to help out looking more closely after I get back.
> >>
> >> -Siwei
> >>> Can't merge patches which are known to break things ...
>