From: wangyanan (Y)
Subject: Re: [PATCH v6 3/7] hw/acpi/aml-build: Improve scalability of PPTT generation
Date: Tue, 4 Jan 2022 10:05:43 +0800
User-agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:78.0) Gecko/20100101 Thunderbird/78.4.0

Hi Drew,
Thanks for your review.
On 2022/1/3 19:24, Andrew Jones wrote:
On Mon, Jan 03, 2022 at 04:46:32PM +0800, Yanan Wang wrote:
Currently we generate a PPTT table for an n-level processor hierarchy
with n levels of nested loops in build_pptt(). This works fine for now,
as there are only three CPU topology parameters, but the code will
become less scalable as more processor hierarchy levels are added.

This patch only improves the scalability of build_pptt() by reducing
the loops; it intends to make no functional change.

Signed-off-by: Yanan Wang <wangyanan55@huawei.com>
---
  hw/acpi/aml-build.c | 50 +++++++++++++++++++++++++++++----------------
  1 file changed, 32 insertions(+), 18 deletions(-)

diff --git a/hw/acpi/aml-build.c b/hw/acpi/aml-build.c
index b3b3310df3..be3851be36 100644
--- a/hw/acpi/aml-build.c
+++ b/hw/acpi/aml-build.c
@@ -2001,7 +2001,10 @@ static void build_processor_hierarchy_node(GArray *tbl, uint32_t flags,
  void build_pptt(GArray *table_data, BIOSLinker *linker, MachineState *ms,
                  const char *oem_id, const char *oem_table_id)
  {
-    int pptt_start = table_data->len;
+    GQueue *list = g_queue_new();
+    guint pptt_start = table_data->len;
+    guint father_offset;
"parent_offset" would be more conventional.
Apparently... I will rename it.
+    guint length, i;
      int uid = 0;
      int socket;
      AcpiTable table = { .sig = "PPTT", .rev = 2,
@@ -2010,9 +2013,8 @@ void build_pptt(GArray *table_data, BIOSLinker *linker, MachineState *ms,
      acpi_table_begin(&table, table_data);
 
     for (socket = 0; socket < ms->smp.sockets; socket++) {
-        uint32_t socket_offset = table_data->len - pptt_start;
-        int core;
-
+        g_queue_push_tail(list,
+            GUINT_TO_POINTER(table_data->len - pptt_start));
          build_processor_hierarchy_node(
              table_data,
              /*
@@ -2021,35 +2023,47 @@ void build_pptt(GArray *table_data, BIOSLinker *linker, MachineState *ms,
               */
              (1 << 0),
              0, socket, NULL, 0);
+    }
 
-        for (core = 0; core < ms->smp.cores; core++) {
-            uint32_t core_offset = table_data->len - pptt_start;
-            int thread;
+    length = g_queue_get_length(list);
+    for (i = 0; i < length; i++) {
+        int core;
+
+        father_offset = GPOINTER_TO_UINT(g_queue_pop_head(list));
+        for (core = 0; core < ms->smp.cores; core++) {
              if (ms->smp.threads > 1) {
+                g_queue_push_tail(list,
+                    GUINT_TO_POINTER(table_data->len - pptt_start));
                  build_processor_hierarchy_node(
                      table_data,
                      (0 << 0), /* not a physical package */
-                    socket_offset, core, NULL, 0);
-
-                for (thread = 0; thread < ms->smp.threads; thread++) {
-                    build_processor_hierarchy_node(
-                        table_data,
-                        (1 << 1) | /* ACPI Processor ID valid */
-                        (1 << 2) | /* Processor is a Thread */
-                        (1 << 3),  /* Node is a Leaf */
-                        core_offset, uid++, NULL, 0);
-                }
+                    father_offset, core, NULL, 0);
              } else {
                  build_processor_hierarchy_node(
                      table_data,
                      (1 << 1) | /* ACPI Processor ID valid */
                      (1 << 3),  /* Node is a Leaf */
-                    socket_offset, uid++, NULL, 0);
+                    father_offset, uid++, NULL, 0);
              }
          }
      }
+
+    length = g_queue_get_length(list);
+    for (i = 0; i < length; i++) {
+        int thread;
+
+        father_offset = GPOINTER_TO_UINT(g_queue_pop_head(list));
+        for (thread = 0; thread < ms->smp.threads; thread++) {
+            build_processor_hierarchy_node(
+                table_data,
+                (1 << 1) | /* ACPI Processor ID valid */
+                (1 << 2) | /* Processor is a Thread */
+                (1 << 3),  /* Node is a Leaf */
+                father_offset, uid++, NULL, 0);
+        }
+    }
+
+    g_queue_free(list);
      acpi_table_end(linker, &table);
  }
> This patch actually increases the number of loops, since we need to visit
> higher hierarchical nodes twice (once to enqueue and once to dequeue).
Yes, we do need to access each higher hierarchical node's offset twice.
But that should not be a problem, since the topology parameters never get
large enough for the extra queue operations to matter for performance.
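For example, on a hypothetical machine with 4 sockets, 8 cores per socket
and 2 threads per core, both versions emit exactly the same
4 + 32 + 64 = 100 hierarchy nodes; the queue-based version merely adds
4 + 32 = 36 extra enqueue/dequeue pairs, which is negligible.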
> We do reduce code indentation and it looks like we could more easily skip
> hierarchy levels we don't want, though.
Yes, just as you said. The commit message doesn't describe the motivation
well. This patch aims to avoid ever-deeper code indentation as nested loops
are added, and consequently it becomes a bit easier to extend the code with
a new topology level.
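To make that concrete: with the queue-based layout, a new level such as
clusters would just be one more flat pass between the socket and core
passes, not a deeper nest. A rough sketch only, assuming a hypothetical
ms->smp.clusters parameter (and the parent_offset rename from above):

    length = g_queue_get_length(list);
    for (i = 0; i < length; i++) {
        int cluster;

        parent_offset = GPOINTER_TO_UINT(g_queue_pop_head(list));
        for (cluster = 0; cluster < ms->smp.clusters; cluster++) {
            /* Queue each cluster's offset so that the following core
             * pass can use it as the parent reference. */
            g_queue_push_tail(list,
                GUINT_TO_POINTER(table_data->len - pptt_start));
            build_processor_hierarchy_node(
                table_data,
                (0 << 0), /* not a physical package */
                parent_offset, cluster, NULL, 0);
        }
    }

A level could also be skipped wholesale by guarding its pass with a
condition such as if (ms->smp.clusters > 1), which is what makes the flat
structure easier to extend than another nesting level.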
> While my impulse is to say we should just keep this simple and add another
> nested loop for clusters, I guess I'm OK with this too.
Thank you!
> Reviewed-by: Andrew Jones <drjones@redhat.com>

Thanks,
Yanan



