Re: [PATCH v5 0/8] Remove EPYC mode apicid decode and use generic decode
From: Eduardo Habkost
Subject: Re: [PATCH v5 0/8] Remove EPYC mode apicid decode and use generic decode
Date: Thu, 27 Aug 2020 15:07:52 -0400
On Thu, Aug 27, 2020 at 07:03:14PM +0200, Igor Mammedov wrote:
> On Wed, 26 Aug 2020 16:03:40 +0100
> Daniel P. Berrangé <berrange@redhat.com> wrote:
>
> > On Wed, Aug 26, 2020 at 04:02:58PM +0200, Igor Mammedov wrote:
> > > On Wed, 26 Aug 2020 14:36:38 +0100
> > > Daniel P. Berrangé <berrange@redhat.com> wrote:
> > >
> > > > On Wed, Aug 26, 2020 at 03:30:34PM +0200, Igor Mammedov wrote:
> > > > > On Wed, 26 Aug 2020 13:50:59 +0100
> > > > > Daniel P. Berrangé <berrange@redhat.com> wrote:
> > > > >
> > > > > > On Wed, Aug 26, 2020 at 02:38:49PM +0200, Igor Mammedov wrote:
> > > > > > > On Fri, 21 Aug 2020 17:12:19 -0500
> > > > > > > Babu Moger <babu.moger@amd.com> wrote:
> > > > > > >
> > > > > > > > To support some of the complex topology, we introduced EPYC
> > > > > > > > mode apicid decode. But, EPYC mode decode is running into
> > > > > > > > problems. Also it can become quite a maintenance problem in
> > > > > > > > the future. So, it was decided to remove that code and use
> > > > > > > > the generic decode which works for majority of the topology.
> > > > > > > > Most of the SPECed configuration would work just fine. With
> > > > > > > > some non-SPECed user inputs, it will create some sub-optimal
> > > > > > > > configuration.
> > > > > > > > Here is the discussion thread.
> > > > > > > > https://lore.kernel.org/qemu-devel/c0bcc1a6-1d84-a6e7-e468-d5b437c1b254@amd.com/
> > > > > > > >
> > > > > > > > This series removes all the EPYC mode specific apicid changes
> > > > > > > > and uses the generic apicid decode.
> > > > > > >
> > > > > > > the main difference between EPYC and all other CPUs is that
> > > > > > > it requires numa configuration (it's not optional)
> > > > > > > so we need an extra patch on top of this series to enforce that,
> > > > > > > i.e.:
> > > > > > >
> > > > > > > if (epyc && !numa)
> > > > > > > error("EPYC cpu requires numa to be configured")
> > > > > >
> > > > > > Please no. This will break 90% of current usage of the EPYC CPU in
> > > > > > real world QEMU deployments. That is way too user hostile to
> > > > > > introduce as a requirement.
> > > > > >
> > > > > > Why do we need to force this? People have been successfully using
> > > > > > EPYC CPUs without NUMA in QEMU for years now.
> > > > > >
> > > > > > It might not match behaviour of bare metal silicon, but that hasn't
> > > > > > obviously caused the world to come crashing down.
> > > > > So far it produces warning in linux kernel (RHBZ1728166),
> > > > > (resulting performance might be suboptimal), but I haven't seen
> > > > > anyone reporting crashes yet.
> > > > >
> > > > >
> > > > > What other options do we have?
> > > > > Perhaps we can turn on strict check for new machine types only,
> > > > > so old configs can keep broken topology (CPUID),
> > > > > while new ones would require -numa and produce correct topology.
> > > >
> > > > No, tying this to machine types is not viable either. That is still
> > > > going to break essentially every single management application that
> > > > exists today using QEMU.
> > > for that we have deprecation process, so users could switch to new CLI
> > > that would be required.
> >
> > We could, but I don't find the cost/benefit tradeoff is compelling.
> >
> > There are so many places where we diverge from what bare metal would
> > do, that I don't see a good reason to introduce this breakage, even
> > if we notify users via a deprecation message.
> I find (3) and (4) good enough reasons to use deprecation.
>
> > If QEMU wants to require NUMA for EPYC, then QEMU could internally
> > create a single NUMA node if none was specified for new machine
> > types, such that there is no visible change or breakage to any
> > mgmt apps.
>
> (1) for configs that started without -numa &&|| without -smp dies>1,
> QEMU can do just that (enable auto_enable_numa).
Why exactly do we need auto_enable_numa with dies=1?
If I understand correctly, Babu said earlier in this thread[1]
that we don't need auto_enable_numa.
[1] https://lore.kernel.org/qemu-devel/11489e5f-2285-ddb4-9c35-c9f522d603a0@amd.com/
>
> (2) As for configs that are out of spec, I do not care much (junk in - junk
> out) (though not having to spend time on bug reports and debug issues, just
> to say it's not supported in the end, makes deprecation sound like a
> reasonable choice)
>
> (3) However if config matches bare metal i.e. CPU has more than 1 die and
> within dies limits (spec wise), QEMU has to produce valid CPUs.
> In this case QEMU can't make up multiple numa nodes and mappings of RAM/CPUs
> on user's behalf. That's where we have to error out and ask for explicit
> numa configuration.
>
> For such configs, current code (since 5.0), will produce in the best case
> performance issues due to mismatching data in APICID, CPUID and ACPI tables,
> in the worst case issues might be related to invalid APIC ID if running on
> EPYC host and HW takes into account subfields of APIC ID (according to Babu
> real CPU uses die_id (aka node_id) internally).
> I'd rather error out on nonsense configs earlier than debug such issues
> and then error out anyways later (upsetting more users).
>
The requirements are not clear to me. Is this just about making
CPU die_id match the NUMA node ID, or are there additional
constraints?
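
To make the narrow reading of that question concrete, here is a minimal
sketch in Python (not QEMU code) of a check that only compares each CPU's
die_id with its NUMA node ID. The `cpus` list, its dictionary keys, and the
function name are hypothetical, and the sketch assumes a single socket so
that die_id can be compared with the node ID directly. Whether this is the
whole requirement is exactly what is being asked.

```python
def die_node_mismatches(cpus):
    """Return the CPUs whose die_id differs from their NUMA node_id.

    `cpus` is a hypothetical list of dicts with 'die_id' and 'node_id'
    keys; it does not correspond to any QEMU data structure.
    Assumes a single socket, so die_id values are unique per node.
    """
    return [cpu for cpu in cpus if cpu["die_id"] != cpu["node_id"]]


# Illustrative input: the second CPU sits on die 1 but is mapped to node 0.
example_cpus = [
    {"cpu": 0, "die_id": 0, "node_id": 0},
    {"cpu": 1, "die_id": 1, "node_id": 0},
]
mismatched = die_node_mismatches(example_cpus)
```

Any additional constraint would have to be added as a further condition on
top of this comparison.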
> (4)
> If I were non hobby user, I'd hate if QEMU allowed me to start invalid config,
> that I'd have to spend time on debugging issues (including performance ones),
> instead of clearly telling me what's wrong and how config should be corrected.
> I'd probably jump to another hypervisor that does the job right,
> instead of digging into QEMU codebase and CPU specs to figure out how
> to hack and configure it.
>
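
For reference, cases (1) and (3) quoted above can be written down as a
small decision table. This is a minimal sketch in Python of the proposal
as I read it, not of anything QEMU does today; the function name, its
parameters, and the returned strings are all hypothetical. Case (2)
(out-of-spec configs) is left out because the thread does not define the
spec limits, and whether the auto-enable branch is needed at all is the
open question raised earlier in this message.

```python
def numa_policy(is_epyc: bool, numa_configured: bool, dies: int) -> str:
    """Hypothetical decision table for the proposal quoted above."""
    if not is_epyc or numa_configured:
        # Other CPU models, or an explicit -numa config: nothing to enforce.
        return "accept"
    if dies <= 1:
        # Case (1): QEMU could create a single implicit NUMA node
        # (auto_enable_numa) without any visible change for mgmt apps.
        return "auto-single-node"
    # Case (3): QEMU cannot make up multiple nodes and RAM/CPU mappings
    # on the user's behalf, so it would ask for explicit configuration.
    return "error: explicit numa configuration required"
```

The sketch does not check that an explicit NUMA config actually matches
the dies; that would be a separate validation step.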
--
Eduardo