qemu-s390x
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [qemu-s390x] [PATCH v1] s390x/kvm: Configure page size after memory


From: Igor Mammedov
Subject: Re: [qemu-s390x] [PATCH v1] s390x/kvm: Configure page size after memory has actually been initialized
Date: Thu, 28 Mar 2019 12:39:24 +0100

On Thu, 28 Mar 2019 11:18:02 +0100
David Hildenbrand <address@hidden> wrote:

> On 28.03.19 01:24, David Gibson wrote:
> > On Wed, Mar 27, 2019 at 09:06:53PM +0100, David Hildenbrand wrote:  
> >> On 27.03.19 17:45, Igor Mammedov wrote:  
> >>> On Wed, 27 Mar 2019 14:59:44 +0100
> >>> David Hildenbrand <address@hidden> wrote:
> >>>  
> >>>> Right now we configure the pagesize quite early, when initializing KVM.
> >>>> This is long before system memory is actually allocated via
> >>>> memory_region_allocate_system_memory(), and therefore memory backends
> >>>> marked as mapped.
> >>>>
> >>>> Instead, let's configure the maximum page size after initializing
> >>>> memory in s390_memory_init(). cap_hpage_1m is still properly
> >>>> configured before creating any CPUs, and therefore before configuring
> >>>> the CPU model and eventually enabling CMMA.
> >>>>
> >>>> We might later want to replace qemu_getrampagesize() by another
> >>>> detection mechanism, e.g. only looking at mapped, initial memory.
> >>>> We don't support any memory devices yet, and if so, we can always reject
> >>>> devices with a page size bigger than the initial page size when
> >>>> hotplugging. qemu_getrampagesize() should work for now, especially when
> >>>> converting it to only look at mapped backends.
> >>>>
> >>>> Signed-off-by: David Hildenbrand <address@hidden>  
> >>>
> >>> Acked-by: Igor Mammedov <address@hidden>  
> >>
> >> BTW, do we want
> >>
> >> qemu_getmaxrampagesize()
> >> qemu_getminrampagesize()  
> > 
> > That could work.
> >   
> >> or similar. qemu_getrampagesize() in its current form is really far from
> >> beautiful.  
> > 
> > Yeah, and AFAICT the way it's used here remains incorrect.
> >   
> 
> Soooooo,
> 
> this is all a big mess :)
> 
> 
> 1. We have to decide on the page size before initializing the CPU model
> (due to CMMA) and before creating any CPUs -> We have to do it
> early/before machine init.
> 
> 2. Memory devices (if ever supported) are created + realized (+ backends
> mapped) after machine init.
> 
> 3. Memory backends are created delayed, so even after creating devices.
> 
> All we can do for s390x is
> 
> a) Use the page size of initial memory when configuring the page size in
> KVM. (AFAICS what would be done right now)
> 
> b) When cold/hotplugging memory devices, reject devices with a page size
> bigger than the page size of initial memory. Not an issue now. In the
> future it might be "unfortunate" but at least we can print a nice error
> from the machine hotplug handler "page size not supported in this
> configuration".
> 
> I double checked, "-numa" is not supported in s390x. So "-numa
> node,mempath=..." does not apply. However this might change and we can
> easily forget about this. Also, getting rid of -mem-path might be one
> issue. So the right think to do would be
> 
> a) this patch
> b) introduce and use qemu_getmaxrampagesize()
> 
> Then, we would properly detect the maximum page size *for initial
> memory*. Memory devices, we will have to worry about in the future, but
> should be easy to handle (remember and check against maximum configured
>  page size).

Problem David tries to solve is that 
 1: user hotplugged backend with wrong page size
 2: hotplug of associated device 'pc-dimm' cleanly fails with error
 3: guest does reboot and QEMU crashes with fatal error

Issue is that qemu_getmaxrampagesize() iterates over backends which
might be used for guest RAM or theoretically might be used for
something else.

using 'mapped' attribute is a hack that currently works around issue [3]
but it's prone to being incorrect depending on call order.

maybe we should prevent [1] in the first place (but making backends
depend on machine/accel doesn't look good either)?

> Thoughts?
> 




reply via email to

[Prev in Thread] Current Thread [Next in Thread]