qemu-s390x
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [qemu-s390x] [PATCH for-4.0?] exec: Only count mapped memory backend


From: Igor Mammedov
Subject: Re: [qemu-s390x] [PATCH for-4.0?] exec: Only count mapped memory backends for qemu_getrampagesize()
Date: Wed, 27 Mar 2019 10:38:38 +0100

On Wed, 27 Mar 2019 20:07:57 +1100
David Gibson <address@hidden> wrote:

> On Wed, Mar 27, 2019 at 09:57:45AM +0100, Igor Mammedov wrote:
> > On Wed, 27 Mar 2019 11:11:46 +1100
> > David Gibson <address@hidden> wrote:
> >   
> > > On Tue, Mar 26, 2019 at 03:08:20PM +0100, Igor Mammedov wrote:  
> > > > On Tue, 26 Mar 2019 14:50:58 +1100
> > > > David Gibson <address@hidden> wrote:
> > > >     
> > > > > qemu_getrampagesize() works out the minimum host page size backing 
> > > > > any of
> > > > > guest RAM.  This is required in a few places, such as for POWER8 PAPR 
> > > > > KVM
> > > > > guests, because limitations of the hardware virtualization mean the 
> > > > > guest
> > > > > can't use pagesizes larger than the host pages backing its memory.
> > > > > 
> > > > > However, it currently checks against *every* memory backend, whether 
> > > > > or not
> > > > > it is actually mapped into guest memory at the moment.  This is 
> > > > > incorrect.
> > > > > 
> > > > > This can cause a problem attempting to add memory to a POWER8 pseries 
> > > > > KVM
> > > > > guest which is configured to allow hugepages in the guest (e.g.
> > > > > -machine cap-hpt-max-page-size=16m).  If you attempt to add 
> > > > > non-hugepage,
> > > > > you can (correctly) create a memory backend, however it (correctly) 
> > > > > will
> > > > > throw an error when you attempt to map that memory into the guest by
> > > > > 'device_add'ing a pc-dimm.
> > > > > 
> > > > > What's not correct is that if you then reset the guest a startup check
> > > > > against qemu_getrampagesize() will cause a fatal error because of the 
> > > > > new
> > > > > memory object, even though it's not mapped into the guest.    
> > > > I'd say that backend should be remove by mgmt app since device_add 
> > > > failed
> > > > instead of leaving it to hang around. (but fatal error either not a nice
> > > > behavior on QEMU part)    
> > > 
> > > I agree, but reset could be guest initiated, so even if management is
> > > doing the right thing there's a window where it could break.  
> > 
> > We could (log term) get rid of qemu_getrampagesize() (it's not really good
> > to push machine/target specific condition into generic function) and move
> > pagesize calculation closer to machine. In this case machine (spapr) knows
> > exactly when and what is mapped and can enumerate NOT backends but mapped
> > memory regions and/or pc-dimms (lets say we have memory_region_page_size()
> > and apply whatever policy to the results.  
> 
> So.. it used to be in the machine specific code, but was made generic
> because it's used in the vfio code.  Where it is ppc specific, but not
> keyed into the machine specific code in a way that we can really
> re-use it there.
It probably was the easiest way to hack things together, but then API gets
generalized and misused and then tweaked to specific machine usecase
and it gets only worse over time.

> 
> > > > > This patch corrects the problem by adjusting 
> > > > > find_max_supported_pagesize()
> > > > > (called from qemu_getrampagesize() via object_child_foreach) to 
> > > > > exclude
> > > > > non-mapped memory backends.    
> > > > I'm not sure about if it's ok do so. It depends on where from
> > > > qemu_getrampagesize() is called. For example s390 calls it rather early
> > > > when initializing KVM, so there isn't anything mapped yet.    
> > > 
> > > Oh.  Drat.
> > >   
> > > > And once I replace -mem-path with hostmem backend and drop
> > > > qemu_mempath_getpagesize(mem_path) /which btw aren't guarantied to be 
> > > > mapped either/
> > > > this patch might lead to incorrect results for initial memory as
> > > > well.    
> > > 
> > > Uh.. I don't immediately see why.  
> > It depends on when qemu_getrampagesize() is called with this patch.
> > 
> > Maybe qemu_getrampagesize() is not a good interface anymore?
> > I also see it being called directly from device side hw/vfio/spapr.c, which
> > in my opinion should be propagated from machine,  
> 
> Um.. I'm not sure what you mean by that.
It would be better if machine would set a page size property on vfio device
when it's created.

> 
> > plus adhoc call
> > from kvm parts.  
> 




reply via email to

[Prev in Thread] Current Thread [Next in Thread]