|
From: | Xiaoyao Li |
Subject: | Re: [PATCH v6] i386/cpu: fixup number of addressable IDs for logical processors in the physical package |
Date: | Mon, 14 Oct 2024 08:36:13 +0800 |
User-agent: | Mozilla Thunderbird |
On 10/12/2024 5:35 PM, Chuang Xu wrote:
On 10/12/24 下午4:21, Xiaoyao Li wrote:On 10/9/2024 11:56 AM, Chuang Xu wrote:When QEMU is started with: -cpu host,migratable=on,host-cache-info=on,l3-cache=off -smp 180,sockets=2,dies=1,cores=45,threads=2 On Intel platform: CPUID.01H.EBX[23:16] is defined as "max number of addressable IDs for logical processors in the physical package".When executing "cpuid -1 -l 1 -r" in the guest, we obtain a value of 90 forCPUID.01H.EBX[23:16], whereas the expected value is 128. Additionally, executing "cpuid -1 -l 4 -r" in the guest yields a value of 63 for CPUID.04H.EAX[31:26], which matches the expected result. As (1+CPUID.04H.EAX[31:26]) rounds up to the nearest power-of-2 integer, we'd beter round up CPUID.01H.EBX[23:16] to the nearest power-of-2 integer too. Otherwise we may encounter unexpected results in guest.For example, when QEMU is started with CLI above and xtopology is disabled, guest kernel 5.15.120 uses CPUID.01H.EBX[23:16]/ (1+CPUID.04H.EAX[31:26]) to calculate threads-per-core in detect_ht(). Then guest will get "90/ (1+63)=1"as the result, even though threads-per-core should actually be 2.It's kernel's bug instead.In 1.5.3 "Sub ID Extraction Parameters for initial APIC ID" of "Intel 64 Architecture Processor Topology Enumeration" [1], it is- SMT_Mask_Width = Log2(RoundToNearestPof2(CPUID.1:EBX[23:16])/ (CPUID.(EAX=4,ECX=0):EAX[31:26]) + 1))The value of CPUID.1:EBX[23:16] needs to be *rounded* to the neartest power-of-two integer instead of itself being the power-of-two.This also is consistency with the SDM, where the comment for bit 23-16 of CPUID.1:EBX is:The nearest power-of-2 integer that is not smaller than EBX[23:16] is the number of unique initial APIC IDs reserved for addressing different logical processors in a physical package.What I read from this is, the nearest power-of-2 integer that is not smaller than EBX[23:16] is a different thing than EBX[23:16]. i.e.,Yes, when I read sdm, I also thought it was a kernel bug. But on my 192ht spr host, the value of CPUID.1:EBX[23:16] was indeed rounded upto the nearest power of 2 by the hardware. After communicating with Intel technical support staff, we thought that perhaps the description in sdmis not so accurate, and rounding up CPUID.1:EBX[23:16] to the power of 2 in qemu may be more in line with the hardware behavior.
I think above justification is important. We need to justify our changes with the fact and correct reason.
I somehow agree to set EBX[23:16] to a value of power-of-2, because the 1.5.3 "Sub ID Extraction Parameters for initial APIC ID" of "Intel 64 Architecture Processor Topology Enumeration" spec says
CPUID.1:EBX[23:16] represents the maximum number of addressable IDs (initial APIC ID) that can be assigned to logical processors in a physical package. The value may not be the same as the number of logical processors that are present in the hardware of a physical package. It uses the word "may not".However, the justification of the change cannot be "it leads to unexpected results in guest" because the guest implementation is not correct.
- EBX[23:16]: Maximum number of addressable IDs for logical processors in this physical package - pow2ceil(EBX[23:16]): the number of unique initial APIC IDs reserved for addressing different logical processors in a physical package.[1] https://cdrdv2-public.intel.com/759067/intel-64-architecture- processor-topology-enumeration.pdfAnd on AMD platform: CPUID.01H.EBX[23:16] is defined as "Logical processor count". Current result meets our expectation.So let us round up CPUID.01H.EBX[23:16] to the nearest power-of-2 integeronly for Intel platform to solve the unexpected result. Reviewed-by: Zhao Liu <zhao1.liu@intel.com> Acked-by: Igor Mammedov <imammedo@redhat.com> Signed-off-by: Guixiong Wei <weiguixiong@bytedance.com> Signed-off-by: Yipeng Yin <yinyipeng@bytedance.com> Signed-off-by: Chuang Xu <xuchuangxclwt@bytedance.com> --- target/i386/cpu.c | 10 +++++++++- 1 file changed, 9 insertions(+), 1 deletion(-) diff --git a/target/i386/cpu.c b/target/i386/cpu.c index ff227a8c5c..641d4577b0 100644 --- a/target/i386/cpu.c +++ b/target/i386/cpu.c@@ -6462,7 +6462,15 @@ void cpu_x86_cpuid(CPUX86State *env, uint32_t index, uint32_t count,} *edx = env->features[FEAT_1_EDX]; if (threads_per_pkg > 1) { - *ebx |= threads_per_pkg << 16; + /*+ * AMD requires logical processor count, but Intel needs maximum + * number of addressable IDs for logical processors per package.+ */ + if (cpu->vendor_cpuid_only && IS_AMD_CPU(env)) { + *ebx |= threads_per_pkg << 16; + } else { + *ebx |= 1 << apicid_pkg_offset(&topo_info) << 16; + } *edx |= CPUID_HT; } if (!cpu->enable_pmu) {
[Prev in Thread] | Current Thread | [Next in Thread] |