Hi list,
I upgraded the following qemu packages on my Debian bullseye server:
qemu-system-common_1%3a6.2+dfsg-2~bpo11+1_amd64.deb qemu-system-data_1%3a6.2+dfsg-2~bpo11+1_all.deb qemu-system-x86_1%3a6.2+dfsg-2~bpo11+1_amd64.deb
to the following Debian bullseye backports packages (no other updates performed; the kernel and everything else is the same):
qemu-system-common_1%3a7.0+dfsg-2~bpo11+2_amd64.deb qemu-system-data_1%3a7.0+dfsg-2~bpo11+2_all.deb qemu-system-x86_1%3a7.0+dfsg-2~bpo11+2_amd64.deb
and suddenly my server runs out of memory (virtual machines are killed immediately after boot) - more precisely, the VMs now use 2-7x more RSS than with qemu 6.2 (according to "virsh dommemstat").
My configuration is (commands are executed with qemu 6.2 running):
Host Memory:
64GiB RAM
root:~# grep HugePages_ /proc/meminfo
HugePages_Total:   21504
HugePages_Free:     9728
HugePages_Rsvd:        0
HugePages_Surp:        0
root:~# free
               total        used        free      shared  buff/cache   available
Mem:        65460392    50713128     7500228        7320     7247036    13986392
Swap:      124997628        1816   124995812
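As a quick sanity check on the figures above (a minimal sketch of the arithmetic; awk assumed available), the hugepage pool alone pins HugePages_Total x 2 MiB of host RAM:

```shell
# Memory pinned by the hugepage pool, in GiB:
# 21504 pages (HugePages_Total) x 2048 KiB page size, from /proc/meminfo above.
awk 'BEGIN { printf "%.0f GiB\n", 21504 * 2048 / 1024 / 1024 }'
# -> 42 GiB of the 64 GiB host RAM is reserved for hugepages
```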
Running virtual machines: 14
sum of "memory" (= "currentMemory") elements (for all of them): 23GiB
sum of "memtune -> hard_limit" elements (for all of them): 47GiB
Because I use AMD-SEV, the memory of the VMs is locked (and hugepages are used) - example of the memory configuration:

<memory dumpCore='off' unit='KiB'>2097152</memory>
<currentMemory unit='KiB'>2097152</currentMemory>
<memtune>
  <hard_limit unit='KiB'>4194304</hard_limit>
</memtune>
<memoryBacking>
  <hugepages>
    <page size='2048' unit='KiB'/>
  </hugepages>
  <nosharepages/>
  <locked/>
  <source type='memfd'/>
  <access mode='shared'/>
  <allocation mode='immediate'/>
  <discard/>
</memoryBacking>
other (maybe related) configuration fragments:

<vcpu placement='static'>8</vcpu>
<iothreads>1</iothreads>
<os firmware='efi'>
  <type arch='x86_64' machine='pc-q35-6.2'>hvm</type>
  <firmware>
    <feature enabled='yes' name='enrolled-keys'/>
    <feature enabled='yes' name='secure-boot'/>
  </firmware>
  <loader secure='yes'/>
  <nvram>/var/lib/libvirt/qemu/nvram/vm01_VARS.fd</nvram>
  <bootmenu enable='yes' timeout='5000'/>
  <bios useserial='yes'/>
  <smbios mode='emulate'/>
</os>
<features>
  <acpi/>
  <apic/>
  <pae/>
  <hap state='on'/>
  <privnet/>
  <kvm>
    <hidden state='off'/>
    <hint-dedicated state='on'/>
    <poll-control state='on'/>
  </kvm>
  <pvspinlock state='on'/>
  <smm state='on'/>
  <ioapic driver='kvm'/>
</features>
and the AMD-SEV related fragment (all controllers also use the required "<driver iommu='on'/>"):
<launchSecurity type='sev' kernelHashes='yes'>
  <cbitpos>51</cbitpos>
  <reducedPhysBits>1</reducedPhysBits>
  <policy>0x0033</policy>
</launchSecurity>
Here is the comparison of the RSS usage ("virsh dommemstat") before and after the upgrade, with no other software or configuration changes. The output format is:

vm_name [ memory / currentMemory / memtune-hard_limit ]: rss qemu_6.2 => qemu_7.0
vm01 [  512MiB /  512MiB / 2048MiB ]: rss 376176 => 1069644 (2.84 x)
vm02 [  512MiB /  512MiB / 2560MiB ]: rss 374468 => 1081008 (2.88 x)
vm03 [  512MiB /  512MiB / 2560MiB ]: rss 321964 => 1086200 (3.37 x)
vm04 [  512MiB /  512MiB / 2560MiB ]: rss 281308 => 1087332 (3.87 x)
vm05 [  512MiB /  512MiB / 2560MiB ]: rss 282096 => 1101924 (3.91 x)
vm06 [ 1024MiB / 1024MiB / 3072MiB ]: rss 319052 => 1103500 (3.46 x)
vm07 [ 2048MiB / 2048MiB / 4096MiB ]: rss 445104 => 1985368 (4.46 x)
vm08 [ 2560MiB / 2560MiB / 4096MiB ]: rss 292232 => 1970060 (6.74 x)
vm09 [ 4096MiB / 4096MiB / 6144MiB ]: rss 280268 =>  806096 (2.87 x)
vm10 [ 4096MiB / 4096MiB / 6144MiB ]: rss 410100 => 1695748 (4.13 x)
vm11 [ 4096MiB / 4096MiB / 6144MiB ]: rss 339064 => 1136592 (3.35 x)
vm12 [  512MiB /  512MiB / 1024MiB ]: rss 299784 => killed  (?.?? x)
vm13 [  512MiB /  512MiB / 1024MiB ]: rss 290816 => killed  (?.?? x)
vm14 [ 2048MiB / 2048MiB / 4096MiB ]: rss 453668 => killed  (?.?? x)
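For reference, the ratios above come from the "rss" field of "virsh dommemstat". A minimal sketch of the arithmetic, using the vm01 sample values from the table (on a live host the two numbers would come from dommemstat under each qemu version instead):

```shell
#!/bin/sh
# RSS growth ratio for one VM between qemu 6.2 and 7.0.
# Sample values for vm01 taken from the table above; on a live host:
#   virsh dommemstat "$dom" | awk '/^rss/ {print $2}'
rss_62=376176    # rss under qemu 6.2, in KiB
rss_70=1069644   # rss under qemu 7.0, in KiB
awk -v a="$rss_62" -v b="$rss_70" 'BEGIN { printf "%.2f x\n", b / a }'
# -> 2.84 x
```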
Can someone explain to me what causes the extreme memory usage, given that no configuration (or other software) changes were made (only the qemu 6.2 to 7.0 upgrade)?
In case it helps, I can privately share the full XML definitions of my VMs ...
Thanks!
JM