qemu-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [RFC] migration: Remove old compression code


From: Markus Armbruster
Subject: Re: [RFC] migration: Remove old compression code
Date: Sat, 25 Jan 2020 09:42:04 +0100
User-agent: Gnus/5.13 (Gnus v5.13) Emacs/26.3 (gnu/linux)

Cc'ing like David did, plus Eric and Mike for additional QAPI expertise.

Juan Quintela <address@hidden> writes:

> Hi
>
> There are several problems with the old compression code:
> * it compress one page at a time (i.e. 4KB)
>
> * it has to copy all the page to the compression thread, the
>   compression thread compress it, it copies back the compressed data
>   to the main thread and it sends it there (equivalent for decompress).
>
> * It is really slow (both the setup of it, and the compression of each page)
>
> You can see that in the previous series, I added compression to multifd, its 
> advantages are:
> * We compress "chunks" of 128 pages at a time
> * We don't copy anything, we just compress everything that is sent through 
> the channel
> * We don't send data over the main channel, so we have less overhead
> * I added zstd support, that
>   o  Compress much more
>   o  It is way faster
>
> * Using the migration-test values (best option for compressors, as we
>   only change one byte on each page):
>   * zstd is around the same latency than no compression (i.e. it takes the 
> same time)
>   * zlib needs 2500 bytes to send 128 pages compressed (remember 1
>     byte changed in each page, rest are zeros)
>   * zstd needs 52 bytes for the same packets (yes, I checked several
>     times, it appears that its use of dictionaries is good here)
>
> This is an RFC, you will ask why?
> There are several questions:
> * How to proceed to remove this?
>    - Deprecate it into the future?
>      Has the advantage that customers can continue using it.
>      But it is not so stable to use it.
>    - Remove it now?
>      Has the advantage of much less code, but ....
> * Statistics:
>   Several of the statistics are already only printed when compression is
>   being used. My understanding is that we ase safe removing this ones.
> * But whan about the migration_parameters?  Should we leave them
>   alone, and just don't care about them? Should we remove them?
> * What to do with migration.json changes?  I will apprecciate comments here.
>
> Thanks, Juan.
>
> PD.  You can see from the previous series that putting compression on
>      top of multifd + splitting the code in different files + making it 
> optional:
>      around 1040 new lines of code.
>      Removing this ~ 960 lines of code.
>
> Signed-off-by: Juan Quintela <address@hidden>
[...]

Let's have a look at the changes to the stable external interface.

> diff --git a/qapi/migration.json b/qapi/migration.json
> index 65db85970e..5477d4d20b 100644
> --- a/qapi/migration.json
> +++ b/qapi/migration.json
> @@ -79,27 +79,6 @@
>             'cache-miss': 'int', 'cache-miss-rate': 'number',
>             'overflow': 'int' } }
>  
> -##
> -# @CompressionStats:
> -#
> -# Detailed migration compression statistics
> -#
> -# @pages: amount of pages compressed and transferred to the target VM
> -#
> -# @busy: count of times that no free thread was available to compress data
> -#
> -# @busy-rate: rate of thread busy
> -#
> -# @compressed-size: amount of bytes after compression
> -#
> -# @compression-rate: rate of compressed size
> -#
> -# Since: 3.1
> -##
> -{ 'struct': 'CompressionStats',
> -  'data': {'pages': 'int', 'busy': 'int', 'busy-rate': 'number',
> -        'compressed-size': 'int', 'compression-rate': 'number' } }
> -

Only user is MigrationInfo, which is next.

>  ##
>  # @MigrationStatus:
>  #
> @@ -200,9 +179,6 @@
   ##
   # @MigrationInfo:
[...]
>  #           only present when the postcopy-blocktime migration capability
>  #           is enabled. (Since 3.0)
>  #
> -# @compression: migration compression statistics, only returned if 
> compression
> -#           feature is on and status is 'active' or 'completed' (Since 3.1)
> -#
>  # @socket-address: Only used for tcp, to know what the real port is (Since 
> 4.0)
>  #
>  # Since: 0.14.0
> @@ -219,7 +195,6 @@
>             '*error-desc': 'str',
>             '*postcopy-blocktime' : 'uint32',
>             '*postcopy-vcpu-blocktime': ['uint32'],
> -           '*compression': 'CompressionStats',
>             '*socket-address': ['SocketAddress'] } }

MigrationInfo is returned by  query-migrate.  No other users.

Doc comment gives us wiggle room: "only returned if compression feature
is on".  Can it be on after this patch?

If no, the member can go without breaking query-migrate compatibility.

query-qmp-schema shows the change, though.  Could conceivably affect
users, but it seems unlikely.

Eric, Mike, we might want to grant ourselves explicit license to change
query-qmp-schema in such ways, by having qapi-code-gen.txt state
optional members may be removed when they can't be present anymore.
What do you think?

Alternatively, keep the member, hardcode value to whatever is returned
before the patch when compression is off, deprecate, remove after grace
period.  A bit more work, but safer.  Worthwhile?  Not for me to decide.

>  
>  ##
> @@ -364,14 +339,6 @@
   ##
   # @MigrationCapability:
[...]
>  #          to enable the capability on the source VM. The feature is 
> disabled by
>  #          default. (since 1.6)
>  #
> -# @compress: Use multiple compression threads to accelerate live migration.
> -#          This feature can help to reduce the migration traffic, by sending
> -#          compressed pages. Please note that if compress and xbzrle are both
> -#          on, compress only takes effect in the ram bulk stage, after that,
> -#          it will be disabled and only xbzrle takes effect, this can help to
> -#          minimize migration traffic. The feature is disabled by default.
> -#          (since 2.4 )
> -#
>  # @events: generate events for each migration state change
>  #          (since 2.4 )
>  #
> @@ -425,7 +392,7 @@
>  ##
>  { 'enum': 'MigrationCapability',
>    'data': ['xbzrle', 'rdma-pin-all', 'auto-converge', 'zero-blocks',
> -           'compress', 'events', 'postcopy-ram', 'x-colo', 'release-ram',
> +           'events', 'postcopy-ram', 'x-colo', 'release-ram',
>             'block', 'return-path', 'pause-before-switchover', 'multifd',
>             'dirty-bitmaps', 'postcopy-blocktime', 'late-block-activate',
>             'x-ignore-shared', 'validate-uuid' ] }

Only users are migrate-set-capabilities and query-migrate-capabilities,
via MigrationCapabilityStatus, which are next.

> @@ -479,7 +446,6 @@
   ##
   # @MigrationCapabilityStatus:
   #
   # Migration capability information
   #
   # @capability: capability enum
   #
   # @state: capability state bool
   #
   # Since: 1.2
   ##
   { 'struct': 'MigrationCapabilityStatus',
     'data': { 'capability' : 'MigrationCapability', 'state' : 'bool' } }

   ##
   # @migrate-set-capabilities:
   #
   # Enable/Disable the following migration capabilities (like xbzrle)
   #
   # @capabilities: json array of capability modifications to make
   #
   # Since: 1.2
   #
   # Example:
   #
   # -> { "execute": "migrate-set-capabilities" , "arguments":
   #      { "capabilities": [ { "capability": "xbzrle", "state": true } ] } }
   #
   ##
   { 'command': 'migrate-set-capabilities',
     'data': { 'capabilities': ['MigrationCapabilityStatus'] } }

With MigrationCapability @compress gone, attempting to set it with
migrate-set-capabilities fails with GenericError, "desc": "Invalid
parameter 'compress'".

Setting capability "compress" can fail before this patch, but only when
you do it in the middle of a migration, so this is actually a new error.

Adding errors is explicitly permitted in docs/interop/qmp-spec.txt
section "Compatibility Considerations":

    However, Clients must not assume any particular:
    [...]
    - Amount of errors generated by a command, that is, new errors can be added
      to any existing command in newer versions of the Server

We're good.

   ##
   # @query-migrate-capabilities:
   #
   # Returns information about the current migration capabilities status
   #
   # Returns: @MigrationCapabilitiesStatus
   #
   # Since: 1.2
   #
   # Example:
   #
   # -> { "execute": "query-migrate-capabilities" }
   # <- { "return": [
   #       {"state": false, "capability": "xbzrle"},
>  #       {"state": false, "capability": "rdma-pin-all"},
>  #       {"state": false, "capability": "auto-converge"},
>  #       {"state": false, "capability": "zero-blocks"},
> -#       {"state": false, "capability": "compress"},
>  #       {"state": true, "capability": "events"},
>  #       {"state": false, "capability": "postcopy-ram"},
>  #       {"state": false, "capability": "x-colo"}
   #    ]}
   #
   ##
   { 'command': 'query-migrate-capabilities', 'returns':   
['MigrationCapabilityStatus']}

What capabilties are returned when is not specified.

Aside: it should be, shouldn't it?

qmp_query_migrate_capabilities() returns them all, except it skips
"block" when !defined CONFIG_LIVE_BLOCK_MIGRATION.  De facto ABI due to
the gap in the documented ABI.

Removing capability "compress" is a break of this de facto ABI.
Acceptable when we're confident the ABI's users don't care.

Alternatively, keep them, hardcode value to false, deprecate, remove
after grace period.

Aside: you might want to make MigrationCapability @block conditional like

   { 'enum': 'MigrationCapability',
     'data': ['xbzrle', 'rdma-pin-all', 'auto-converge', 'zero-blocks',
              'compress', 'events', 'postcopy-ram', 'x-colo', 'release-ram',
              { 'name': 'block',
                'if': 'defined(CONFIG_LIVE_BLOCK_MIGRATION)' },
              'return-path', 'pause-before-switchover', 'multifd',
              'dirty-bitmaps', 'postcopy-blocktime', 'late-block-activate',
              'x-ignore-shared', 'validate-uuid' ] }

> @@ -523,20 +489,6 @@
   ##
   # @MigrationParameter:
[...]
>  #          no compression, 1 means the best compression speed, and 9 means 
> best
>  #          compression ratio which will consume more CPU.
>  #
> -# @compress-threads: Set compression thread count to be used in live 
> migration,
> -#          the compression thread count is an integer between 1 and 255.
> -#
> -# @compress-wait-thread: Controls behavior when all compression threads are
> -#                        currently busy. If true (default), wait for a free
> -#                        compression thread to become available; otherwise,
> -#                        send the page uncompressed. (Since 3.1)
> -#
> -# @decompress-threads: Set decompression thread count to be used in live
> -#          migration, the decompression thread count is an integer between 1
> -#          and 255. Usually, decompression is at least 4 times as fast as
> -#          compression, so set the decompress-threads to the number about 1/4
> -#          of compress-threads is adequate.
> -#
>  # @cpu-throttle-initial: Initial percentage of time guest cpus are throttled
>  #                        when migration auto-converge is activated. The
>  #                        default value is 20. (Since 2.7)
> @@ -607,8 +559,7 @@
>  { 'enum': 'MigrationParameter',
>    'data': ['announce-initial', 'announce-max',
>             'announce-rounds', 'announce-step',
> -           'compress-level', 'compress-threads', 'decompress-threads',
> -           'compress-wait-thread',
> +           'compress-level',
>             'cpu-throttle-initial', 'cpu-throttle-increment',
>             'tls-creds', 'tls-hostname', 'tls-authz', 'max-bandwidth',
>             'downtime-limit', 'x-checkpoint-delay', 'block-incremental',

Used internally; not part of the external interface.  Okay.

> @@ -633,15 +584,6 @@
   ##
   # @MigrateSetParameters:
[...]
>  #
>  # @compress-level: compression level
>  #
> -# @compress-threads: compression thread count
> -#
> -# @compress-wait-thread: Controls behavior when all compression threads are
> -#                        currently busy. If true (default), wait for a free
> -#                        compression thread to become available; otherwise,
> -#                        send the page uncompressed. (Since 3.1)
> -#
> -# @decompress-threads: decompression thread count
> -#
>  # @cpu-throttle-initial: Initial percentage of time guest cpus are
>  #                        throttled when migration auto-converge is activated.
>  #                        The default value is 20. (Since 2.7)
> @@ -717,9 +659,6 @@
>              '*announce-rounds': 'size',
>              '*announce-step': 'size',
>              '*compress-level': 'int',
> -            '*compress-threads': 'int',
> -            '*compress-wait-thread': 'bool',
> -            '*decompress-threads': 'int',
>              '*cpu-throttle-initial': 'int',
>              '*cpu-throttle-increment': 'int',
>              '*tls-creds': 'StrOrNull',

MigrateSetParameters is the (boxed) argument of migrate-set-parameters.
No other users.

With these optional members, attempting to use them with
migrate-set-parameters fails with GenericError, "desc": "Invalid
parameter 'NAME'".

Same argument as for migrate-set-capabilities above.  Good.

> @@ -770,15 +709,6 @@
   ##
   # @MigrationParameters:
   #
   # The optional members aren't actually optional.
[...]
>  # @compress-level: compression level
>  #
> -# @compress-threads: compression thread count
> -#
> -# @compress-wait-thread: Controls behavior when all compression threads are
> -#                        currently busy. If true (default), wait for a free
> -#                        compression thread to become available; otherwise,
> -#                        send the page uncompressed. (Since 3.1)
> -#
> -# @decompress-threads: decompression thread count
> -#
>  # @cpu-throttle-initial: Initial percentage of time guest cpus are
>  #                        throttled when migration auto-converge is activated.
>  #                        (Since 2.7)
> @@ -854,9 +784,6 @@
>              '*announce-rounds': 'size',
>              '*announce-step': 'size',
>              '*compress-level': 'uint8',
> -            '*compress-threads': 'uint8',
> -            '*compress-wait-thread': 'bool',
> -            '*decompress-threads': 'uint8',
>              '*cpu-throttle-initial': 'uint8',
>              '*cpu-throttle-increment': 'uint8',
>              '*tls-creds': 'str',
> @@ -885,9 +812,7 @@
>  #
>  # -> { "execute": "query-migrate-parameters" }
>  # <- { "return": {
> -#          "decompress-threads": 2,
>  #          "cpu-throttle-increment": 10,
> -#          "compress-threads": 8,
>  #          "compress-level": 1,
>  #          "cpu-throttle-initial": 20,
>  #          "max-bandwidth": 33554432,

MigrateSetParameters is returned by query-migrate-parameters.  No other
users.

Even thought the members are optional, the doc comment specifies all are
returned.  qmp_query_migrate_parameters() does.

Removing the parameters breaks this promise.  Acceptable when we're
confident the ABI's users don't care.

Alternatively, keep them, hardcode value to whatever is returned before
the patch when compression is off, deprecate, remove after grace period.

Questions?




reply via email to

[Prev in Thread] Current Thread [Next in Thread]