[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Re: [PATCH] Revert "intel_iommu: Fix irqchip / X2APIC configuration chec
From: |
Igor Mammedov |
Subject: |
Re: [PATCH] Revert "intel_iommu: Fix irqchip / X2APIC configuration checks" |
Date: |
Fri, 23 Sep 2022 10:20:34 +0200 |
On Thu, 22 Sep 2022 12:40:01 -0400
Peter Xu <peterx@redhat.com> wrote:
> On Thu, Sep 22, 2022 at 03:46:17PM +0200, Igor Mammedov wrote:
> > On Wed, 21 Sep 2022 12:12:27 -0400
> > Peter Xu <peterx@redhat.com> wrote:
> >
> > > It's true that when vcpus<=255 we don't require the length of 32bit APIC
> > > IDs. However here since we already have EIM=ON it means the hypervisor
> > > will declare the VM as x2apic supported (e.g. VT-d ECAP register will have
> > > EIM bit 4 set), so the guest should assume the APIC IDs are 32bits width
> > > even if vcpus<=255. In short, commit 77250171bdc breaks any simple
> > > cmdline
> > > that wants to boot a VM with >=9 but <=255 vcpus with:
> > >
> > > -device intel-iommu,intremap=on
> > >
> > > For anyone who does not want to enable x2apic, we can use eim=off in the
> > > intel-iommu parameters to skip enabling KVM x2apic.
> > >
> > > This partly reverts commit 77250171bdc02aee106083fd2a068147befa1a38, while
> > > keeping the valid bit on checking split irqchip, but revert the other
> > > change.
> > >
> > > Cc: David Woodhouse <dwmw2@infradead.org>
> > > Cc: Claudio Fontana <cfontana@suse.de>
> > > Cc: Igor Mammedov <imammedo@redhat.com>
> > > Signed-off-by: Peter Xu <peterx@redhat.com>
> > > ---
> > > hw/i386/intel_iommu.c | 5 +++++
> > > 1 file changed, 5 insertions(+)
> > >
> > > diff --git a/hw/i386/intel_iommu.c b/hw/i386/intel_iommu.c
> > > index 05d53a1aa9..6524c2ee32 100644
> > > --- a/hw/i386/intel_iommu.c
> > > +++ b/hw/i386/intel_iommu.c
> > > @@ -3818,6 +3818,11 @@ static bool vtd_decide_config(IntelIOMMUState *s,
> > > Error **errp)
> > > error_setg(errp, "eim=on requires
> > > accel=kvm,kernel-irqchip=split");
> > > return false;
> > > }
> > > + if (!kvm_enable_x2apic()) {
> >
> > above 'check' has side-effects
> > if it's supposed to be a check it would be better to use
> > kvm_has_x2apic_api()
> > instead.
>
> A check is not what I wanted.
>
> As stated in the commit message, since for some reason EIM is enabled on
> the VT-d device already, we need to enable x2apic for the whole guest
> (including KVM) to match with EIM=on being declared even if vcpus<255.
>
> >
> > Also 77250171bdc says:
> > "
> > The check on kvm_enable_x2apic() needs to happen *anyway* in order to
> > allow CPUs above 254 even without an IOMMU, so allow that to happen
> > elsewhere.
> > "
> >
> > Looking for that elsewhere, it looks like commit dc89f32d92b was supposed
> > to take care of removed chunk, but that is not reachable because of > 255
> > vCPUs"
> >
> > Likely 77250171bdc just exposed a bug in dc89f32d92b, where
> > the later removed kvm_enable_x2apic() always called (with split irqchip)
> > and made it called only when > 255 vCPUs.
> >
> > So migration wise it looks like all version with it and less than 255 cpus
> > are broken.
> >
> > Wait earlier c1bb5418e3, introduced that
> > kvm_irqchip_is_split() && kvm_enable_x2apic()
> > 'condition', also without any compat machinery to keep old behavior.
>
> There's actually some attempt to be compatible, with:
>
> GlobalProperty pc_compat_5_1[] = {
> { "ICH9-LPC", "x-smi-cpu-hotplug", "off" },
> + { TYPE_X86_CPU, "kvm-msi-ext-dest-id", "off" },
> };
>
> But I don't think that's strictly correct, because afaict that only
> controls whether guest enables it or not (I can only see Linux does it that
> way; no idea whether that's detected to other OSes from the PV interfaces).
> The KVM x2apic will still be enabled even on old machines I think, as you
> mentioned.
yep, that compat affects only kvm-msi-ext-dest-id, the kvm_enable_x2apic()
was still called regardless of that.
> > And before that kvm_enable_x2apic() was affecting only configuration
> > with intel_iommu (fb506e701e9b).
>
> Right, afaict that's what we "officially" support.
>
> >
> > I'm not sure if anything could be salvaged here migration wise
>
> This whole thing is indeed very unfortunate. For easier reference of
> future, here are the list of commits that are relevant in time order:
>
> fb506e701e ("intel_iommu: reject broken EIM", 2016-10-17)
> c1bb5418e3 ("target/i386: Support up to 32768 CPUs without IRQ remapping",
> 2020-12-10)
> 77250171bd ("intel_iommu: Fix irqchip / X2APIC configuration checks",
> 2022-05-16)
> dc89f32d92 ("target/i386: Fix sanity check on max APIC ID / X2APIC
> enablement", 2022-05-16)
>
> So regarding compatibility I'm wondering whether we should loose it for
> this case, depending on whether vendors (like RH, or QEMU community in
> general) should support "allowing 32K vcpu without vIOMMU" as an "official"
> feature, or treat it as "experimental only".
If I recall correctly that's PV only feature that also requires special
tailored guest. i.e. it's possible to deliver IPIs in such configuration
but devices could interact only with a fraction of CPUs (irq-wise) or
something else should take care of IRQ remapping.
I don't think mainstream distributions care about this case much.
> IMO it's more important to always make the officially supported bits
> compatible and work as expected. Here IIUC the (only) official way to
> support >255 vcpus should still be using vIOMMU with EIM enabled so far.
> But I'm happy to be corrected..
>
> If so, I would still suggest having such a patch because it should fix one
> of the basic use cases with vIOMMU and currently upstream is broken on it.
> This patch will definitely change the behavior again, but the old was
> simply broken and we don't really have much choice out of it, IMHO.
The thing I worry about is calling kvm_enable_x2apic() from multiple places.
It would be better to fix x86_cpus_init() part of dc89f32d92 and have a check
in IOMMU. But if that's impossible/too ugly, I won't object to this patch.
> Thanks,
>
> >
> > PS:
> > I'd keep kvm_enable_x2apic() only in corrected x86_cpus_init()
> > and use kvm_has_x2apic_api() elsewhere for checks and bailing out.
> >
> >
> > > + error_setg(errp, "eim=on requires support on the KVM side"
> > > + "(X2APIC_API, first shipped in v4.7)");
> > > + return false;
> > > + }
> >
> >
> >
> > > }
> > >
> > > /* Currently only address widths supported are 39 and 48 bits */
> >
>
- [PATCH] Revert "intel_iommu: Fix irqchip / X2APIC configuration checks", Peter Xu, 2022/09/21
- Re: [PATCH] Revert "intel_iommu: Fix irqchip / X2APIC configuration checks", Jason Wang, 2022/09/21
- Re: [PATCH] Revert "intel_iommu: Fix irqchip / X2APIC configuration checks", Igor Mammedov, 2022/09/22
- Re: [PATCH] Revert "intel_iommu: Fix irqchip / X2APIC configuration checks", Peter Xu, 2022/09/22
- Re: [PATCH] Revert "intel_iommu: Fix irqchip / X2APIC configuration checks",
Igor Mammedov <=
- Re: [PATCH] Revert "intel_iommu: Fix irqchip / X2APIC configuration checks", Igor Mammedov, 2022/09/23
- Re: [PATCH] Revert "intel_iommu: Fix irqchip / X2APIC configuration checks", Peter Xu, 2022/09/23
- Re: [PATCH] Revert "intel_iommu: Fix irqchip / X2APIC configuration checks", Peter Xu, 2022/09/23
- Re: [PATCH] Revert "intel_iommu: Fix irqchip / X2APIC configuration checks", Igor Mammedov, 2022/09/26
- Re: [PATCH] Revert "intel_iommu: Fix irqchip / X2APIC configuration checks", Peter Xu, 2022/09/26