Re: [PATCH v4 0/4] overcommit: introduce mem-lock-onfault
From: Peter Xu
Subject: Re: [PATCH v4 0/4] overcommit: introduce mem-lock-onfault
Date: Thu, 23 Jan 2025 11:31:13 -0500

On Thu, Jan 23, 2025 at 04:19:40PM +0300, Daniil Tatianin wrote:
> Currently, passing mem-lock=on to QEMU causes memory usage to grow by
> huge amounts:
>
> no memlock:
> $ ./qemu-system-x86_64 -overcommit mem-lock=off
> $ ps -p $(pidof ./qemu-system-x86_64) -o rss=
> 45652
>
> $ ./qemu-system-x86_64 -overcommit mem-lock=off -enable-kvm
> $ ps -p $(pidof ./qemu-system-x86_64) -o rss=
> 39756
>
> memlock:
> $ ./qemu-system-x86_64 -overcommit mem-lock=on
> $ ps -p $(pidof ./qemu-system-x86_64) -o rss=
> 1309876
>
> $ ./qemu-system-x86_64 -overcommit mem-lock=on -enable-kvm
> $ ps -p $(pidof ./qemu-system-x86_64) -o rss=
> 259956
>
> This happens because mlockall(2) automatically write-faults every
> existing and future anonymous mapping in the process right away.
>
> One of the reasons to enable mem-lock is to protect a QEMU process's
> pages from being compacted and migrated by kcompactd, which does so
> by rewriting a live process's page tables. That causes thousands of
> TLB flush IPIs per second, basically stealing all guest time while
> kcompactd is active.
>
> mem-lock=on helps against this (given compact_unevictable_allowed is 0),
> but the memory overhead it introduces is an undesirable side effect.
> We can avoid that overhead completely by passing MCL_ONFAULT to
> mlockall(2), which this series makes possible through a new mem-lock
> value called on-fault.
>
> memlock-onfault:
> $ ./qemu-system-x86_64 -overcommit mem-lock=on-fault
> $ ps -p $(pidof ./qemu-system-x86_64) -o rss=
> 54004
>
> $ ./qemu-system-x86_64 -overcommit mem-lock=on-fault -enable-kvm
> $ ps -p $(pidof ./qemu-system-x86_64) -o rss=
> 47772
>
> You may notice the memory usage is still slightly higher, in this case
> by a few megabytes over the mem-lock=off case. I was able to trace this
> down to a bug in the Linux kernel: MCL_ONFAULT is not honored for the
> early process heap (set up with brk(2) etc.), so that heap is still
> write-faulted. It's still far less than with plain mem-lock=on.
>
> Changes since v1:
> - Don't make a separate mem-lock-onfault, add an on-fault option to
> mem-lock instead
>
> Changes since v2:
> - Move overcommit option parsing out of line
> - Make enable_mlock an enum instead
>
> Changes since v3:
> - Rebase to latest master due to the recent sysemu -> system renames
>
> Daniil Tatianin (4):
> os: add an ability to lock memory on_fault
> system/vl: extract overcommit option parsing into a helper
> system: introduce a new MlockState enum
> overcommit: introduce mem-lock=on-fault
>
> hw/virtio/virtio-mem.c | 2 +-
> include/system/os-posix.h | 2 +-
> include/system/os-win32.h | 3 ++-
> include/system/system.h | 12 ++++++++-
> migration/postcopy-ram.c | 4 +--
> os-posix.c | 10 ++++++--
> qemu-options.hx | 14 +++++++----
> system/globals.c | 12 ++++++++-
> system/vl.c | 52 +++++++++++++++++++++++++++++++--------
> 9 files changed, 87 insertions(+), 24 deletions(-)

Considering it's a very memory-relevant change and looks pretty benign, I
can pick this up if nobody disagrees (or beats me to it, which I'd
appreciate). I'll also wait at least one week to give people a chance to
stop me.
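
For anyone following along, here is a minimal sketch of the difference the
cover letter describes at the mlockall(2) level. It is not the code from
the series: the enum and the two helper names (memlock_mode, memlock_flags,
memlock_apply) are hypothetical and exist only for this illustration. The
only assumption about the platform is that <sys/mman.h> defines MCL_ONFAULT
(Linux 4.4 and later); without it the on-fault mode is reported as
unsupported.

```c
#include <assert.h>
#include <errno.h>
#include <sys/mman.h>

/* Hypothetical names for this sketch; not taken from the series itself. */
enum memlock_mode {
    MEMLOCK_OFF,
    MEMLOCK_ON,
    MEMLOCK_ON_FAULT,
};

/* Translate a mode into mlockall(2) flags; 0 means "do not lock". */
int memlock_flags(enum memlock_mode mode)
{
    int flags;

    assert(mode <= MEMLOCK_ON_FAULT);
    if (mode == MEMLOCK_OFF) {
        return 0;
    }

    /* Lock what is mapped now and whatever gets mapped later. */
    flags = MCL_CURRENT | MCL_FUTURE;
#ifdef MCL_ONFAULT
    if (mode == MEMLOCK_ON_FAULT) {
        /* Lock pages only once they are faulted in; do not prefault. */
        flags |= MCL_ONFAULT;
    }
#endif
    return flags;
}

/* Apply the mode. Returns 0 on success, -1 with errno set on failure. */
int memlock_apply(enum memlock_mode mode)
{
    int flags = memlock_flags(mode);

    if (flags == 0) {
        return 0;
    }
#ifndef MCL_ONFAULT
    if (mode == MEMLOCK_ON_FAULT) {
        /* Headers too old to know about MCL_ONFAULT. */
        errno = ENOTSUP;
        return -1;
    }
#endif
    return mlockall(flags);
}
```

Note that memlock_apply() can still fail with ENOMEM or EPERM depending on
RLIMIT_MEMLOCK and CAP_IPC_LOCK, so a caller would want to check the return
value and report errno.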
Thanks,
--
Peter Xu