qemu-stable
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH v6] i386/cpu: fixup number of addressable IDs for logical pro


From: Xiaoyao Li
Subject: Re: [PATCH v6] i386/cpu: fixup number of addressable IDs for logical processors in the physical package
Date: Mon, 14 Oct 2024 09:32:45 +0800
User-agent: Mozilla Thunderbird

On 10/14/2024 8:36 AM, Xiaoyao Li wrote:
On 10/12/2024 5:35 PM, Chuang Xu wrote:

On 10/12/24 下午4:21, Xiaoyao Li wrote:
On 10/9/2024 11:56 AM, Chuang Xu wrote:
When QEMU is started with:
-cpu host,migratable=on,host-cache-info=on,l3-cache=off
-smp 180,sockets=2,dies=1,cores=45,threads=2

On Intel platform:
CPUID.01H.EBX[23:16] is defined as "max number of addressable IDs for
logical processors in the physical package".

When executing "cpuid -1 -l 1 -r" in the guest, we obtain a value of 90 for
CPUID.01H.EBX[23:16], whereas the expected value is 128. Additionally,
executing "cpuid -1 -l 4 -r" in the guest yields a value of 63 for
CPUID.04H.EAX[31:26], which matches the expected result.

As (1+CPUID.04H.EAX[31:26]) rounds up to the nearest power-of-2 integer,
we'd beter round up CPUID.01H.EBX[23:16] to the nearest power-of-2
integer too. Otherwise we may encounter unexpected results in guest.

For example, when QEMU is started with CLI above and xtopology is disabled, guest kernel 5.15.120 uses CPUID.01H.EBX[23:16]/ (1+CPUID.04H.EAX[31:26]) to calculate threads-per-core in detect_ht(). Then guest will get "90/ (1+63)=1"
as the result, even though threads-per-core should actually be 2.

It's kernel's bug instead.

In 1.5.3 "Sub ID Extraction Parameters for initial APIC ID" of "Intel 64 Architecture Processor Topology Enumeration" [1], it is

  - SMT_Mask_Width = Log2(RoundToNearestPof2(CPUID.1:EBX[23:16])/ (CPUID.(EAX=4,ECX=0):EAX[31:26]) + 1))

The value of CPUID.1:EBX[23:16] needs to be *rounded* to the neartest power-of-two integer instead of itself being the power-of-two.

This also is consistency with the SDM, where the comment for bit 23-16 of CPUID.1:EBX is:

  The nearest power-of-2 integer that is not smaller than EBX[23:16] is
  the number of unique initial APIC IDs reserved for addressing
  different logical processors in a physical package.

What I read from this is, the nearest power-of-2 integer that is not smaller than EBX[23:16] is a different thing than EBX[23:16]. i.e.,

Yes, when I read sdm, I also thought it was a kernel bug. But on my 192ht spr host, the value of CPUID.1:EBX[23:16] was indeed rounded up

to the nearest power of 2 by the hardware. After communicating with Intel technical support staff, we thought that perhaps the description in sdm

is not so accurate, and rounding up CPUID.1:EBX[23:16] to the power of 2 in qemu may be more in line with the hardware behavior.

I think above justification is important. We need to justify our changes with the fact and correct reason.

I somehow agree to set EBX[23:16] to a value of power-of-2, because the 1.5.3 "Sub ID Extraction Parameters for initial APIC ID" of "Intel 64 Architecture Processor Topology Enumeration" spec says

     CPUID.1:EBX[23:16] represents the maximum number of addressable IDs
     (initial APIC ID) that can be assigned to logical processors in a
     physical package. The value may not be the same as the number of
     logical processors that are present in the hardware of a physical
     package.

It uses the word "may not".

However, the justification of the change cannot be "it leads to unexpected results in guest" because the guest implementation is not correct.

FYI, latest linux already fix the issue, it calculates the shift via

  tscan->ebx1_nproc_shift = get_count_order(ebx.nproc);


- EBX[23:16]: Maximum number of addressable IDs for logical processors
  in this physical package

- pow2ceil(EBX[23:16]): the number of unique initial APIC IDs reserved
  for addressing different logical processors in a physical package.

[1] https://cdrdv2-public.intel.com/759067/intel-64-architecture- processor-topology-enumeration.pdf

And on AMD platform:
CPUID.01H.EBX[23:16] is defined as "Logical processor count". Current
result meets our expectation.

So let us round up CPUID.01H.EBX[23:16] to the nearest power-of-2 integer
only for Intel platform to solve the unexpected result.

Reviewed-by: Zhao Liu <zhao1.liu@intel.com>
Acked-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: Guixiong Wei <weiguixiong@bytedance.com>
Signed-off-by: Yipeng Yin <yinyipeng@bytedance.com>
Signed-off-by: Chuang Xu <xuchuangxclwt@bytedance.com>
---
  target/i386/cpu.c | 10 +++++++++-
  1 file changed, 9 insertions(+), 1 deletion(-)

diff --git a/target/i386/cpu.c b/target/i386/cpu.c
index ff227a8c5c..641d4577b0 100644
--- a/target/i386/cpu.c
+++ b/target/i386/cpu.c
@@ -6462,7 +6462,15 @@ void cpu_x86_cpuid(CPUX86State *env, uint32_t index, uint32_t count,
          }
          *edx = env->features[FEAT_1_EDX];
          if (threads_per_pkg > 1) {
-            *ebx |= threads_per_pkg << 16;
+            /*
+             * AMD requires logical processor count, but Intel needs maximum +             * number of addressable IDs for logical processors per package.
+             */
+            if (cpu->vendor_cpuid_only && IS_AMD_CPU(env)) {
+                *ebx |= threads_per_pkg << 16;
+            } else {
+                *ebx |= 1 << apicid_pkg_offset(&topo_info) << 16;
+            }
              *edx |= CPUID_HT;
          }
          if (!cpu->enable_pmu) {







reply via email to

[Prev in Thread] Current Thread [Next in Thread]