qemu-ppc
Re: [PATCH for-6.0 2/8] spapr/xive: Introduce spapr_xive_nr_ends()


From: Greg Kurz
Subject: Re: [PATCH for-6.0 2/8] spapr/xive: Introduce spapr_xive_nr_ends()
Date: Wed, 25 Nov 2020 10:33:37 +0100

On Tue, 24 Nov 2020 18:56:02 +0100
Cédric Le Goater <clg@kaod.org> wrote:

[...]

> > 
> > I guess you're talking about KVM_DEV_XIVE_NR_SERVERS in
> > kvmppc_xive_connect() actually. We're currently passing
> > spapr_max_server_number() (vCPU id) but you might be
> > right.
> > 
> > I need to re-read the story around VSMT and XIVE.
> 
> ok. What we care about here, is a size to allocate the NVT block
> representing the vCPUs in HW. NVT ids are pushed in the thread 
> contexts when the vCPUs are scheduled to run and looked for by 
> the presenter when an interrupt is to be delivered.
> 

Yeah, looking at the code again, I realize there was some confusion
when we added the possibility to size the NVT. This should not
depend on the vCPU id limit as it is done today, it should just
be the maximum number of possible vCPUs (i.e. smp.max_cpus).
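
To make the difference concrete, here is a minimal standalone sketch
(not QEMU code). It assumes the vCPU id bound is computed as
max_cpus * VSMT / threads rounded up, which is my reading of what
spapr_max_server_number() does; the helper names below are made up
for illustration.

```c
#include <assert.h>

/* Hypothetical helper: integer division rounding up. */
static unsigned sizing_div_round_up(unsigned n, unsigned d)
{
    assert(d != 0);
    return (n + d - 1) / d;
}

/*
 * Upper bound on vCPU ids, ASSUMING ids are spaced by VSMT per core.
 * This is what gets passed as the server count today.
 */
static unsigned sizing_vcpu_id_bound(unsigned max_cpus, unsigned threads,
                                     unsigned vsmt)
{
    return sizing_div_round_up(max_cpus * vsmt, threads);
}

/* What the NVT block really needs: one entry per possible vCPU. */
static unsigned sizing_nr_nvts(unsigned max_cpus)
{
    return max_cpus;
}
```

With max_cpus=4, threads=1 and VSMT=8, the id bound is 32 while only
4 NVTs are ever needed. The two values only coincide when VSMT equals
the number of threads per core.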

[...]
> > 
> > The difference here is that the guest doesn't claim IPIs. They are
> > supposedly pre-claimed in "ibm,xive-lisn-ranges". And this is actually
> > the case in QEMU.
> 
> yes. That's what I want to change (for performance)
> 

I understand the purpose of claiming the IPI from its associated
vCPU context. But this can only be done on a path where we have
both the vCPU id and the IPI; kvmppc_xive_set_source_config()
looks like a good candidate to handle this for runtime and
post-load.
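
Roughly, the idea would look like the toy model below. This is only a
sketch of "claim lazily on the configuration path": the structure and
function names are hypothetical and none of this is the actual QEMU or
KVM API.

```c
#include <assert.h>
#include <stdbool.h>

#define LAZY_NR_IPIS 8

/* Toy state: which IPIs were claimed, and from which vCPU context. */
struct lazy_ipi_state {
    bool claimed[LAZY_NR_IPIS];
    int  claimed_on[LAZY_NR_IPIS];   /* vCPU id, or -1 if unclaimed */
};

static void lazy_ipi_init(struct lazy_ipi_state *s)
{
    assert(s);
    for (int i = 0; i < LAZY_NR_IPIS; i++) {
        s->claimed[i] = false;
        s->claimed_on[i] = -1;
    }
}

/*
 * Stand-in for the set-source-config path: both the IPI number and the
 * target vCPU id are known here, so the claim can happen on first use.
 * Returns 0 on success, -1 on invalid arguments.
 */
static int lazy_ipi_set_source_config(struct lazy_ipi_state *s,
                                      unsigned lisn, int vcpu_id)
{
    assert(s);
    if (lisn >= LAZY_NR_IPIS || vcpu_id < 0) {
        return -1;
    }
    if (!s->claimed[lisn]) {
        /* First configuration: claim from the vCPU being targeted. */
        s->claimed[lisn] = true;
        s->claimed_on[lisn] = vcpu_id;
    }
    return 0;
}
```

A later reconfiguration to another vCPU leaves the existing claim in
place in this sketch; whether that is the right policy is a separate
question.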

> > The IPI setup sequence in the guest is basically:
> > 1) grab a free irq from the bitmap, ie. "ibm,xive-lisn-ranges"
> > 2) calls H_INT_GET_SOURCE_INFO, ie. populate_irq_data()
> > 3) calls H_INT_SET_SOURCE_CONFIG, i.e. configure_irq()
> > 
> > If we want an IPI to be claimed by the appropriate vCPU, we
> > can only do this from under H_INT_SET_SOURCE_CONFIG. And
> > until the guest eventually configures the IPI, KVM and QEMU
> > are out of sync.
> 
> Well, KVM doesn't know either how many PCI MSIs will be claimed.
> It all depends on the guest OS. 
> 

Yes but QEMU and KVM are always in sync for them. When the
guest calls the "ibm,change-msi" RTAS interface to get some
MSIs for a device, they are immediately claimed both in
QEMU and KVM.

> I don't think this is a problem to expose unclaimed interrupt
> numbers to the guest if they are IPIs. We can detect that
> easily with the range and claim the source at KVM level when 
> it's configured or in h_int_get_source_info(). Talking of which, 

We cannot claim the source at the KVM level from
H_INT_GET_SOURCE_INFO because we don't know about
the vCPU id here => we can't do the run_on_cpu()
optimization.

> it might be good to have a KVM command to query the source 
> characteristics on the host. I sent a patchset a while ago in 
> that sense.
> 
> > This complicates migration because we have to guess at
> > post load if we should claim the IPI in KVM or not. The
> > simple presence of the vCPU isn't enough: we need to
> > guess if the guest actually configured the IPI or not.
> 
> The EAT will be transferred from the source and the call to 
> kvmppc_xive_source_reset_one() should initialize the KVM 
> device correctly on the target for all interrupts.
> 

Except that the EAS appears as valid for all IPIs, even
though the source didn't claim them at the KVM level. It
looks wrong to blindly restore all of them in post-load.
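
To illustrate the concern with a toy model (not the real EAT layout):
the 'configured' flag below is purely hypothetical, it stands for
whatever information would let the target tell a configured IPI from
one that is merely valid. Nothing like it exists in the EAS today as
far as this discussion goes.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy EAT entry. 'configured' is a HYPOTHETICAL flag, not an EAS field. */
struct sketch_eas {
    bool valid;
    bool configured;
};

/* Blind post-load: claim every valid entry. Returns the claim count. */
static size_t sketch_restore_blind(const struct sketch_eas *eat, size_t n,
                                   bool *kvm_claimed)
{
    size_t count = 0;

    assert(eat && kvm_claimed);
    for (size_t i = 0; i < n; i++) {
        kvm_claimed[i] = eat[i].valid;
        if (kvm_claimed[i]) {
            count++;
        }
    }
    return count;
}

/* Selective post-load: claim only what the source had configured. */
static size_t sketch_restore_configured(const struct sketch_eas *eat,
                                        size_t n, bool *kvm_claimed)
{
    size_t count = 0;

    assert(eat && kvm_claimed);
    for (size_t i = 0; i < n; i++) {
        kvm_claimed[i] = eat[i].valid && eat[i].configured;
        if (kvm_claimed[i]) {
            count++;
        }
    }
    return count;
}
```

With three valid IPIs of which only one was configured by the guest,
the blind variant claims three sources on the target while the source
side had claimed a single one.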

> >> All this to say, that we need to size better the range in the 
> >> "ibm,xive-lisn-ranges" property if that's broken for vSMT. 
> >>
> > 
> > Sizing the range to smp.max_cpus as proposed in this series
> > is fine, no matter what the VSMT is.
> 
> ok. That's a fix for spapr_irq_dt() then. And possibly, there 
> is a similar one for KVM_DEV_XIVE_NR_SERVERS.
>  

Yup.

> >> Then, I think the IPIs can be treated just like the PCI MSIs
> >> but they need to be claimed first. That's the ugly part. 
> >>
> > 
> > Yeah that's the big difference. For PCI MSIs, QEMU owns the
> > bitmap and the guest can claim (or release) a number of
> > MSIs through the "ibm,change-msi" RTAS interface. There's no
> > such thing for IPIs: they are supposedly already claimed.
> 
> IPIs are a bit special because there are no I/O devices to
> claim them. We could consider the vCPU as being the device. 
> That was my first attempt but it was wrong since the OS is 
> in charge of choosing an interrupt number for the IPI. 
> 
> >> Should we add a special check in h_int_set_source_config to
> >> deal with unclaimed IPIs that are being configured ?
> >>
> > 
> > This is what my tentative fix does.
> 
> I didn't understand the complexity of it, maybe due to my
> patchset.
> 
> You should try again :)
> 

Will do when I've fixed the misuses of spapr_max_server_number().

> C.



