From: Xiao Guangrong
Subject: Re: [Qemu-devel] [PATCH v2 3/3] migration: introduce adaptive model for waiting thread
Date: Mon, 18 Feb 2019 17:01:09 +0800
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:60.0) Gecko/20100101 Thunderbird/60.2.1



On 1/16/19 2:40 PM, Peter Xu wrote:
On Fri, Jan 11, 2019 at 02:37:32PM +0800, address@hidden wrote:

+
+static void update_compress_wait_thread(MigrationState *s)
+{
+    s->compress_wait_thread = get_compress_wait_thread(&s->parameters);
+    assert(s->compress_wait_thread != COMPRESS_WAIT_THREAD_ERR);
+}

We can probably deprecate these chunk of codes if you're going to use
alternative structs or enum as suggested by Markus...


Yes, indeed.

I think libvirt is not using this parameter, right?  And I believe the
parameter "compress-wait-thread" was only introduced in QEMU 3.1.
I'm not sure whether we can directly change it to an enum, on the
assumption that no one is really using it in production yet, so the
change would break nobody.

I did a check in libvirt's code:
$ git grep compress-wait-thread
tests/qemucapabilitiesdata/caps_3.1.0.ppc64.replies:          "name": "compress-wait-thread",
tests/qemucapabilitiesdata/caps_3.1.0.ppc64.replies:          "name": "compress-wait-thread",
tests/qemucapabilitiesdata/caps_3.1.0.x86_64.replies:          "name": "compress-wait-thread",
tests/qemucapabilitiesdata/caps_3.1.0.x86_64.replies:          "name": "compress-wait-thread",
tests/qemucapabilitiesdata/caps_4.0.0.x86_64.replies:          "name": "compress-wait-thread",
tests/qemucapabilitiesdata/caps_4.0.0.x86_64.replies:          "name": "compress-wait-thread",

It seems changing this parameter will break libvirt's self-test?


Maybe we still have chance to quickly switch back to the name
"x-compress-wait-thread" just like the -global interface then we don't
need to worry much on QAPI breakage so far until the parameter proves
itself to remove the "x-".


Er... I am not sure I can follow that clearly. :(

[...]

@@ -1917,40 +2000,40 @@ bool migrate_postcopy_blocktime(void)
      return s->enabled_capabilities[MIGRATION_CAPABILITY_POSTCOPY_BLOCKTIME];
  }
-bool migrate_use_compression(void)
+int64_t migrate_max_bandwidth(void)
  {
      MigrationState *s;
 
      s = migrate_get_current();
 
-    return s->enabled_capabilities[MIGRATION_CAPABILITY_COMPRESS];
+    return s->parameters.max_bandwidth;
  }
 
-int migrate_compress_level(void)
+bool migrate_use_compression(void)
  {
      MigrationState *s;
 
      s = migrate_get_current();
 
-    return s->parameters.compress_level;
+    return s->enabled_capabilities[MIGRATION_CAPABILITY_COMPRESS];
  }
 
-int migrate_compress_threads(void)
+int migrate_compress_level(void)
  {
      MigrationState *s;
 
      s = migrate_get_current();
 
-    return s->parameters.compress_threads;
+    return s->parameters.compress_level;
  }
 
-int migrate_compress_wait_thread(void)
+int migrate_compress_threads(void)
  {
      MigrationState *s;
 
      s = migrate_get_current();
 
-    return s->parameters.compress_wait_thread;
+    return s->parameters.compress_threads;

I'm a bit confused by this diff... are you only adding
migrate_max_bandwidth() and not changing anything else?

I guess so.

 I'm curious
how this chunk was generated since it looks really weird...

Looks weird to me too. :(


[...]

  /* State of RAM for migration */
  struct RAMState {
      /* QEMUFile used for this migration */
@@ -292,6 +294,19 @@ struct RAMState {
      bool ram_bulk_stage;
      /* How many times we have dirty too many pages */
      int dirty_rate_high_cnt;
+
+    /* used by by compress-wait-thread-adaptive */

compress-wait-thread-adaptive is gone?

It's a typo, will fix.

+
+    max_bw = (max_bw >> 20) * 8;
+    remain_bw = abs(max_bw - (int64_t)(mbps));
+    if (remain_bw <= (max_bw / 20)) {
+        /* if we have used all the bandwidth, let's compress more. */
+        if (compression_counters.compress_no_wait_weight) {
+            compression_counters.compress_no_wait_weight--;
+        }
+        goto exit;
+    }
+
+    /* have enough bandwidth left, do not need to compress so aggressively */
+    if (compression_counters.compress_no_wait_weight !=
+        COMPRESS_BUSY_COUNT_PERIOD) {
+        compression_counters.compress_no_wait_weight++;
+    }
+
+exit:
+    ram_state->compress_busy_count = 0;
+    ram_state->compress_no_wait_left =
+                            compression_counters.compress_no_wait_weight;

The "goto" and this chunk seem awkward to me...  How about this?

   if (not_enough_remain_bw && weight)
     weight--;
   else if (weight <= MAX)
     weight++;

   (do the rest...)


Okay, will address your style.


Also, would you like to add some documentation to these compression
features into docs/devel/migration.rst?  It'll be good, but it's your
call. :)

It's useful indeed. I will do it.

  static void migration_update_rates(RAMState *rs, int64_t end_time)
  {
      uint64_t page_count = rs->target_page_count - rs->target_page_count_prev;
@@ -1975,7 +2057,11 @@ static int compress_page_with_multi_thread(RAMState *rs, RAMBlock *block,
                                            ram_addr_t offset)
  {
      int idx, thread_count, bytes_xmit = -1, pages = -1;
-    bool wait = migrate_compress_wait_thread();
+    int compress_wait_thread = migrate_compress_wait_thread();
+    bool wait, adaptive;
+
+    wait = (adaptive == COMPRESS_WAIT_THREAD_ON);
+    adaptive = (adaptive == COMPRESS_WAIT_THREAD_ADAPTIVE);

Should s/adaptive/compress_wait_thread/ for both lines on the right?


Oops! I made the patches based on the wrong code base. :(

It seems that you'll probably want to update the performance numbers
too in the next post, since the current test numbers might depend on a
random stack variable. :)


Yes, indeed, will update the number.

      thread_count = migrate_compress_threads();
 
      qemu_mutex_lock(&comp_done_lock);
@@ -1990,20 +2076,29 @@ retry:
              qemu_mutex_unlock(&comp_param[idx].mutex);
              pages = 1;
              update_compress_thread_counts(&comp_param[idx], bytes_xmit);
-            break;
+            goto exit;
          }
      }
+
+    if (adaptive && !wait) {
+        /* it is the first time go to the loop */
+        wait = compress_adaptive_need_wait();
+    }

IIUC if adaptive==true then wait must be false.

I would really make this simpler like:

   if (compress_wait_thread == ON)
     wait = on;
   else if (compress_wait_thread == OFF)
     wait = off;
   else
     wait = compress_adaptive_need_wait();


I am afraid that is not what we want. :(

We do not always call compress_adaptive_need_wait() for 'adaptive';
instead, we do it only the first time through the loop:

    if (adaptive && !wait) {
        /* it is the first time go to the loop */
        wait = compress_adaptive_need_wait();
    }

Thanks!


