Re: [qemu-s390x] [PATCH v1] s390x/kvm: Configure page size after memory
From: Igor Mammedov
Subject: Re: [qemu-s390x] [PATCH v1] s390x/kvm: Configure page size after memory has actually been initialized
Date: Thu, 28 Mar 2019 12:39:24 +0100
On Thu, 28 Mar 2019 11:18:02 +0100
David Hildenbrand <address@hidden> wrote:
> On 28.03.19 01:24, David Gibson wrote:
> > On Wed, Mar 27, 2019 at 09:06:53PM +0100, David Hildenbrand wrote:
> >> On 27.03.19 17:45, Igor Mammedov wrote:
> >>> On Wed, 27 Mar 2019 14:59:44 +0100
> >>> David Hildenbrand <address@hidden> wrote:
> >>>
> >>>> Right now we configure the page size quite early, when initializing KVM.
> >>>> This is long before system memory is actually allocated via
> >>>> memory_region_allocate_system_memory(), and therefore before memory
> >>>> backends are marked as mapped.
> >>>>
> >>>> Instead, let's configure the maximum page size after initializing
> >>>> memory in s390_memory_init(). cap_hpage_1m is still properly
> >>>> configured before creating any CPUs, and therefore before configuring
> >>>> the CPU model and eventually enabling CMMA.
> >>>>
> >>>> We might later want to replace qemu_getrampagesize() with another
> >>>> detection mechanism, e.g. one that only looks at mapped, initial memory.
> >>>> We don't support any memory devices yet, and once we do, we can always
> >>>> reject devices with a page size bigger than the initial page size when
> >>>> hotplugging. qemu_getrampagesize() should work for now, especially when
> >>>> converted to only look at mapped backends.
> >>>>
> >>>> Signed-off-by: David Hildenbrand <address@hidden>
> >>>
> >>> Acked-by: Igor Mammedov <address@hidden>
> >>
> >> BTW, do we want
> >>
> >> qemu_getmaxrampagesize()
> >> qemu_getminrampagesize()
> >
> > That could work.
> >
> >> or similar. qemu_getrampagesize() in its current form is really far from
> >> beautiful.
> >
> > Yeah, and AFAICT the way it's used here remains incorrect.
> >
>
> Soooooo,
>
> this is all a big mess :)
>
>
> 1. We have to decide on the page size before initializing the CPU model
> (due to CMMA) and before creating any CPUs -> We have to do it
> early/before machine init.
>
> 2. Memory devices (if ever supported) are created + realized (+ backends
> mapped) after machine init.
>
> 3. Memory backends are created delayed, so even after creating devices.
>
> All we can do for s390x is
>
> a) Use the page size of initial memory when configuring the page size in
> KVM. (AFAICS what would be done right now)
>
> b) When cold/hotplugging memory devices, reject devices with a page size
> bigger than the page size of initial memory. Not an issue now. In the
> future it might be "unfortunate" but at least we can print a nice error
> from the machine hotplug handler "page size not supported in this
> configuration".
>
> I double checked: "-numa" is not supported on s390x, so "-numa
> node,mempath=..." does not apply. However, this might change and we
> could easily forget about this. Also, getting rid of -mem-path might
> be an issue. So the right thing to do would be
>
> a) this patch
> b) introduce and use qemu_getmaxrampagesize()
>
> Then, we would properly detect the maximum page size *for initial
> memory*. Memory devices we will have to worry about in the future, but
> they should be easy to handle (remember the maximum configured page
> size and check against it).
The problem David tries to solve is:
1: the user hotplugs a backend with the wrong page size
2: hotplug of the associated 'pc-dimm' device cleanly fails with an error
3: the guest reboots and QEMU crashes with a fatal error
The issue is that qemu_getmaxrampagesize() iterates over backends which
might be used for guest RAM, or theoretically might be used for
something else.
Using the 'mapped' attribute is a hack that currently works around issue [3],
but it is prone to being incorrect depending on call order.
Maybe we should prevent [1] in the first place (but making backends
depend on machine/accel doesn't look good either)?
> Thoughts?
>