From: Xiaoyao Li
Subject: Re: [PATCH v6] i386/cpu: fixup number of addressable IDs for logical processors in the physical package
Date: Sat, 12 Oct 2024 16:32:15 +0800
User-agent: Mozilla Thunderbird

On 10/12/2024 4:10 PM, Chuang Xu wrote:
> Hi, Xiaoyao
>
> On 10/12/24 3:13 PM, Xiaoyao Li wrote:
>> On 10/9/2024 11:56 AM, Chuang Xu wrote:
>>> When QEMU is started with:
>>> -cpu host,migratable=on,host-cache-info=on,l3-cache=off
>>> -smp 180,sockets=2,dies=1,cores=45,threads=2
>>>
>>> On Intel platform:
>>> CPUID.01H.EBX[23:16] is defined as "max number of addressable IDs for
>>> logical processors in the physical package".
>>>
>>> When executing "cpuid -1 -l 1 -r" in the guest, we obtain a value of 90 for
>>> CPUID.01H.EBX[23:16], whereas the expected value is 128. Additionally,
>>> executing "cpuid -1 -l 4 -r" in the guest yields a value of 63 for
>>> CPUID.04H.EAX[31:26], which matches the expected result.
>>>
>>> As (1+CPUID.04H.EAX[31:26]) rounds up to the nearest power-of-2 integer,
>>> we'd better round up CPUID.01H.EBX[23:16] to the nearest power-of-2
>>> integer too. Otherwise we may encounter unexpected results in guest.
>>>
>>> For example, when QEMU is started with CLI above and xtopology is disabled,
>>> guest kernel 5.15.120 uses CPUID.01H.EBX[23:16]/(1+CPUID.04H.EAX[31:26]) to
>>> calculate threads-per-core in detect_ht(). Then guest will get "90/(1+63)=1"
>>> as the result, even though threads-per-core should actually be 2.
>>>
>>> And on AMD platform:
>>> CPUID.01H.EBX[23:16] is defined as "Logical processor count". Current
>>> result meets our expectation.
>>
>> So for AMD platform, what's the result for the same situation with xtopology
>> disabled? Does AMD use an algorithm other than
>> CPUID.01H.EBX[23:16]/(1+CPUID.04H.EAX[31:26]) for the calculation?
>
> For AMD platform, CPUID.04H is reserved, so it uses CPUID.8000001E.EAX[15:8]
> (field ThreadsPerComputeUnit) to obtain the result.

Did AMD support leaf 8000001E from the beginning, when it first started to support multi-threads/multi-cores? (just my curiosity)

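To make the arithmetic in the quoted patch description concrete, here is a minimal sketch in C. It is not QEMU or kernel code: `id_width()`, `pkg_offset()` and `threads_per_core()` are hypothetical helpers, and the topology is simplified to thread, core and die levels only. It assumes each level's ID width is ceil(log2(count)), which is what makes the addressable-ID count a power of 2, and it models the guest calculation as the plain division described above.

```c
#include <stdint.h>

/* Bits needed to encode `count` distinct IDs, i.e. ceil(log2(count)). */
static unsigned id_width(unsigned count)
{
    unsigned bits = 0;

    while ((1u << bits) < count) {
        bits++;
    }
    return bits;
}

/*
 * Hypothetical stand-in for the package offset within the APIC ID:
 * the sum of the thread, core and die ID widths (simplified topology).
 */
static unsigned pkg_offset(unsigned dies, unsigned cores, unsigned threads)
{
    return id_width(threads) + id_width(cores) + id_width(dies);
}

/*
 * Simplified form of the legacy guest calculation from the patch
 * description: CPUID.01H.EBX[23:16] / (1 + CPUID.04H.EAX[31:26]).
 */
static unsigned threads_per_core(unsigned ebx_23_16, unsigned eax_31_26)
{
    return ebx_23_16 / (1 + eax_31_26);
}
```

With dies=1, cores=45, threads=2 per socket, `pkg_offset(1, 45, 2)` is 0 + 6 + 1 = 7, so the rounded value is `1u << 7` = 128. `threads_per_core(90, 63)` gives 1 (the reported problem), while `threads_per_core(128, 63)` gives 2.
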
>>> So let us round up CPUID.01H.EBX[23:16] to the nearest power-of-2 integer
>>> only for Intel platform to solve the unexpected result.
>>>
>>> Reviewed-by: Zhao Liu <zhao1.liu@intel.com>
>>> Acked-by: Igor Mammedov <imammedo@redhat.com>
>>> Signed-off-by: Guixiong Wei <weiguixiong@bytedance.com>
>>> Signed-off-by: Yipeng Yin <yinyipeng@bytedance.com>
>>> Signed-off-by: Chuang Xu <xuchuangxclwt@bytedance.com>
>>> ---
>>>  target/i386/cpu.c | 10 +++++++++-
>>>  1 file changed, 9 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/target/i386/cpu.c b/target/i386/cpu.c
>>> index ff227a8c5c..641d4577b0 100644
>>> --- a/target/i386/cpu.c
>>> +++ b/target/i386/cpu.c
>>> @@ -6462,7 +6462,15 @@ void cpu_x86_cpuid(CPUX86State *env, uint32_t index, uint32_t count,
>>>          }
>>>          *edx = env->features[FEAT_1_EDX];
>>>          if (threads_per_pkg > 1) {
>>> -            *ebx |= threads_per_pkg << 16;
>>> +            /*
>>> +             * AMD requires logical processor count, but Intel needs maximum
>>> +             * number of addressable IDs for logical processors per package.
>>> +             */
>>> +            if (cpu->vendor_cpuid_only && IS_AMD_CPU(env)) {
>>> +                *ebx |= threads_per_pkg << 16;
>>> +            } else {
>>> +                *ebx |= 1 << apicid_pkg_offset(&topo_info) << 16;
>>> +            }
>>
>> you need to handle the overflow case when the number of logical
>> processors > 255.
>
> It seems other cpuid cases of bit shifting don't consider the overflow case
> either.
> Since Intel only reserves 8 bits for this field, do you have any suggestions
> to make sure this field is emulated correctly?

The usual option is to mask the value to only 8 bits before shifting, like

    ((1 << apicid_pkg_offset(&topo_info)) & 0xff) << 16

but when the value is greater than 255, it will be truncated, so we need something like below to reflect the hardware behavior:

    MIN((1 << apicid_pkg_offset(&topo_info)), 255) << 16

This is what Qian's patch [1] wanted to fix last year, but that patch never got merged.

[1] https://lore.kernel.org/qemu-devel/20230829042405.932523-2-qian.wen@intel.com/
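
The difference between the two expressions above can be sketched as follows. This is an illustration only, not the eventual QEMU change: `ebx_field_masked()` and `ebx_field_clamped()` are hypothetical names, the offset is passed in directly instead of coming from `apicid_pkg_offset(&topo_info)`, and the offset is assumed to be below 32 so the shift is well defined.

```c
#include <stdint.h>

/* Masking: keep only the low 8 bits, so 256 wraps around to 0. */
static uint32_t ebx_field_masked(unsigned pkg_offset)
{
    uint32_t ids = 1u << pkg_offset;    /* assumes pkg_offset < 32 */

    return (ids & 0xff) << 16;
}

/* Clamping: saturate at 255, the largest value the 8-bit field can hold. */
static uint32_t ebx_field_clamped(unsigned pkg_offset)
{
    uint32_t ids = 1u << pkg_offset;    /* assumes pkg_offset < 32 */

    if (ids > 255) {
        ids = 255;
    }
    return ids << 16;
}
```

For an offset of 7 both helpers return 128 in bits 23:16. For an offset of 8 the masked variant returns 0 in that field, while the clamped variant returns 255 and leaves the neighbouring EBX bits 31:24 untouched.
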

>>>              *edx |= CPUID_HT;
>>>          }
>>>          if (!cpu->enable_pmu) {