On 2/14/2024 11:11 AM, Eugenio Perez Martin wrote:
On Wed, Feb 14, 2024 at 7:29 PM Si-Wei Liu <si-wei.liu@oracle.com> wrote:
Hi Michael,
On 2/13/2024 2:22 AM, Michael S. Tsirkin wrote:
On Mon, Feb 05, 2024 at 05:10:36PM -0800, Si-Wei Liu wrote:
Hi Eugenio,
I thought this new code looks good to me and the original issue I saw with
x-svq=on should be gone. However, after rebase my tree on top of this,
there's a new failure I found around setting up guest mappings at early
boot, please see attached the specific QEMU config and corresponding event
traces. Haven't checked into the detail yet, thinking you would need to be
aware of ahead.
Regards,
-Siwei
Eugenio were you able to reproduce? Siwei did you have time to
look into this?
Didn't get a chance to look into the detail yet in the past week, but
thought it may have something to do with the (internals of) iova tree
range allocation and the lookup routine. It started to fall apart at the
first vhost_vdpa_dma_unmap call showing up in the trace events, where it
should've gotten IOVA=0x2000001000, but an incorrect IOVA address
0x1000 was ended up returning from the iova tree lookup routine.
HVA GPA IOVA
-------------------------------------------------------------------------------------------------------------------------
Map
[0x7f7903e00000, 0x7f7983e00000) [0x0, 0x80000000) [0x1000, 0x80000000)
[0x7f7983e00000, 0x7f9903e00000) [0x100000000, 0x2080000000)
[0x80001000, 0x2000001000)
[0x7f7903ea0000, 0x7f7903ec0000) [0xfeda0000, 0xfedc0000)
[0x2000001000, 0x2000021000)
Unmap
[0x7f7903ea0000, 0x7f7903ec0000) [0xfeda0000, 0xfedc0000) [0x1000,
0x20000) ???
shouldn't it be [0x2000001000,
0x2000021000) ???
It looks the SVQ iova tree lookup routine vhost_iova_tree_find_iova(),
which is called from vhost_vdpa_listener_region_del(), can't properly
deal with overlapped region. Specifically, q35's mch_realize() has the
following:
579 memory_region_init_alias(&mch->open_high_smram, OBJECT(mch),
"smram-open-high",
580 mch->ram_memory,
MCH_HOST_BRIDGE_SMRAM_C_BASE,
581 MCH_HOST_BRIDGE_SMRAM_C_SIZE);
582 memory_region_add_subregion_overlap(mch->system_memory, 0xfeda0000,
583 &mch->open_high_smram, 1);
584 memory_region_set_enabled(&mch->open_high_smram, false);
#0 0x0000564c30bf6980 in iova_tree_find_address_iterator
(key=0x564c331cf8e0, value=0x564c331cf8e0, data=0x7fffb6d749b0) at
../util/iova-tree.c:96
#1 0x00007f5f66479654 in g_tree_foreach () at /lib64/libglib-2.0.so.0
#2 0x0000564c30bf6b53 in iova_tree_find_iova (tree=<optimized out>,
map=map@entry=0x7fffb6d74a00) at ../util/iova-tree.c:114
#3 0x0000564c309da0a9 in vhost_iova_tree_find_iova (tree=<optimized
out>, map=map@entry=0x7fffb6d74a00) at ../hw/virtio/vhost-iova-tree.c:70
#4 0x0000564c3085e49d in vhost_vdpa_listener_region_del
(listener=0x564c331024c8, section=0x7fffb6d74aa0) at
../hw/virtio/vhost-vdpa.c:444
#5 0x0000564c309f4931 in address_space_update_topology_pass
(as=as@entry=0x564c31ab1840 <address_space_memory>,
old_view=old_view@entry=0x564c33364cc0,
new_view=new_view@entry=0x564c333640f0, adding=adding@entry=false) at
../system/memory.c:977
#6 0x0000564c309f4dcd in address_space_set_flatview (as=0x564c31ab1840
<address_space_memory>) at ../system/memory.c:1079
#7 0x0000564c309f86d0 in memory_region_transaction_commit () at
../system/memory.c:1132
#8 0x0000564c309f86d0 in memory_region_transaction_commit () at
../system/memory.c:1117
#9 0x0000564c307cce64 in mch_realize (d=<optimized out>,
errp=<optimized out>) at ../hw/pci-host/q35.c:584
However, it looks like iova_tree_find_address_iterator() only check if
the translated address (HVA) falls in to the range when trying to locate
the desired IOVA, causing the first DMAMap that happens to overlap in
the translated address (HVA) space to be returned prematurely:
89 static gboolean iova_tree_find_address_iterator(gpointer key,
gpointer value,
90 gpointer data)
91 {
:
:
99 if (map->translated_addr + map->size < needle->translated_addr ||
100 needle->translated_addr + needle->size < map->translated_addr) {
101 return false;
102 }
103
104 args->result = map;
105 return true;
106 }
In the QEMU trace file, it reveals that the first DMAMap as below gets
returned incorrectly instead the second, the latter of which is what the
actual IOVA corresponds to:
HVA GPA
IOVA
[0x7f7903e00000, 0x7f7983e00000) [0x0, 0x80000000)
[0x1000, 0x80001000)
[0x7f7903ea0000, 0x7f7903ec0000) [0xfeda0000, 0xfedc0000)
[0x2000001000, 0x2000021000)