qemu-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH v5 10/10] migration: introduce snapshot-{save, load, delete}


From: Markus Armbruster
Subject: Re: [PATCH v5 10/10] migration: introduce snapshot-{save, load, delete} QMP commands
Date: Mon, 05 Oct 2020 09:26:54 +0200
User-agent: Gnus/5.13 (Gnus v5.13) Emacs/27.1 (gnu/linux)

Eric Blake <eblake@redhat.com> writes:

> On 10/2/20 11:27 AM, Daniel P. Berrangé wrote:
>> savevm, loadvm and delvm are some of the few HMP commands that have never
>> been converted to use QMP. The reasons for the lack of conversion are
>> that they blocked execution of the event thread, and the semantics
>> around choice of disks were ill-defined.
>> 
>> Despite this downside, however, libvirt and applications using libvirt
>> have used these commands for as long as QMP has existed, via the
>> "human-monitor-command" passthrough command. IOW, while it is clearly
>> desirable to be able to fix the problems, they are not a blocker to
>> all real world usage.
>> 
>> Meanwhile there is a need for other features which involve adding new
>> parameters to the commands. This is possible with HMP passthrough, but
>> it provides no reliable way for apps to introspect features, so using
>> QAPI modelling is highly desirable.
>> 
>> This patch thus introduces new snapshot-{load,save,delete} commands to
>> QMP that are intended to replace the old HMP counterparts. The new
>> commands are given different names, because they will be using the new
>> QEMU job framework and thus will have diverging behaviour from the HMP
>> originals. It would thus be misleading to keep the same name.
>> 
>> While this design uses the generic job framework, the current impl is
>> still blocking. The intention that the blocking problem is fixed later.
>> None the less applications using these new commands should assume that
>> they are asynchronous and thus wait for the job status change event to
>> indicate completion.
>> 
>> In addition to using the job framework, the new commands require the
>> caller to be explicit about all the block device nodes used in the
>> snapshot operations, with no built-in default heuristics in use.
>> 
>> Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
>> ---
>
>> +++ b/qapi/job.json
>> @@ -22,10 +22,17 @@
>>  #
>>  # @amend: image options amend job type, see "x-blockdev-amend" (since 5.1)
>>  #
>> +# @snapshot-load: snapshot load job type, see "snapshot-load" (since 5.2)
>> +#
>> +# @snapshot-save: snapshot save job type, see "snapshot-save" (since 5.2)
>> +#
>> +# @snapshot-delete: snapshot delete job type, see "snapshot-delete" (since 
>> 5.2)
>> +#
>>  # Since: 1.7
>>  ##
>>  { 'enum': 'JobType',
>> -  'data': ['commit', 'stream', 'mirror', 'backup', 'create', 'amend'] }
>> +  'data': ['commit', 'stream', 'mirror', 'backup', 'create', 'amend',
>> +           'snapshot-load', 'snapshot-save', 'snapshot-delete'] }
>>  
>>  ##
>>  # @JobStatus:
>> diff --git a/qapi/migration.json b/qapi/migration.json
>> index 7f5e6fd681..d2bd551ad9 100644
>> --- a/qapi/migration.json
>> +++ b/qapi/migration.json
>> @@ -1787,3 +1787,123 @@
>>  # Since: 5.2
>>  ##
>>  { 'command': 'query-dirty-rate', 'returns': 'DirtyRateInfo' }
>> +
>> +##
>> +# @snapshot-save:
>> +#
>> +# Save a VM snapshot
>> +#
>> +# @job-id: identifier for the newly created job
>> +# @tag: name of the snapshot to create
>> +# @devices: list of block device node names to save a snapshot to
>> +# @vmstate: block device node name to save vmstate to
>
> Here, you document vmstate last,...
>
>> +#
>> +# Applications should not assume that the snapshot save is complete
>> +# when this command returns. The job commands / events must be used
>> +# to determine completion and to fetch details of any errors that arise.
>> +#
>> +# Note that the VM CPUs will be paused during the time it takes to
>> +# save the snapshot

End the sentence with a period, please.

> "will be", or "may be"?  As you stated above, we may be able to lift the
> synchronous limitations down the road, while still maintaining the
> present interface of using this command to start the job and waiting on
> the job id until it is finished, at which point the CPUs might not need
> to be paused as much.

Could also add a sentence like "This may change in the future".

>> +#
>> +# It is strongly recommended that @devices contain all writable
>> +# block device nodes if a consistent snapshot is required.
>> +#
>> +# If @tag already exists, an error will be reported
>> +#
>> +# Returns: nothing
>> +#
>> +# Example:
>> +#
>> +# -> { "execute": "snapshot-save",
>> +#      "data": {
>> +#         "job-id": "snapsave0",
>> +#         "tag": "my-snap",
>> +#         "vmstate": "disk0",
>> +#         "devices": ["disk0", "disk1"]
>
> ...here vmstate occurs before devices.  I don't know if our doc
> generator cares about inconsistent ordering.

It does not.  It's questionable style, though.  Easy enough to tidy up.

>> +#      }
>> +#    }
>> +# <- { "return": { } }
>> +#
>> +# Since: 5.2
>> +##
>> +{ 'command': 'snapshot-save',
>> +  'data': { 'job-id': 'str',
>> +            'tag': 'str',
>> +            'vmstate': 'str',
>> +            'devices': ['str'] } }
>> +
>> +##
>> +# @snapshot-load:
>> +#
>> +# Load a VM snapshot
>> +#
>> +# @job-id: identifier for the newly created job
>> +# @tag: name of the snapshot to load.
>> +# @devices: list of block device node names to load a snapshot from
>> +# @vmstate: block device node name to load vmstate from
>> +#
>> +# Applications should not assume that the snapshot save is complete
>> +# when this command returns. The job commands / events must be used
>> +# to determine completion and to fetch details of any errors that arise.
>
> s/save/load/
>
>> +#
>> +# Note that the VM CPUs will be paused during the time it takes to
>> +# save the snapshot

End the sentence with a period, please.

> s/save/load/
>
> But while pausing CPUs during save is annoying, pausing CPUs during
> restore makes sense (after all, executing on stale data that will still
> be updated during the restore is just wasted execution).
>
>
>> +#
>> +# It is strongly recommended that @devices contain all writable
>> +# block device nodes that can have changed since the original
>> +# @snapshot-save command execution.
>> +#
>> +# Returns: nothing
>> +#
>> +# Example:
>> +#
>> +# -> { "execute": "snapshot-load",
>> +#      "data": {
>> +#         "job-id": "snapload0",
>> +#         "tag": "my-snap",
>> +#         "vmstate": "disk0",
>> +#         "devices": ["disk0", "disk1"]
>> +#      }
>> +#    }
>> +# <- { "return": { } }
>> +#
>> +# Since: 5.2
>> +##
>> +{ 'command': 'snapshot-load',
>> +  'data': { 'job-id': 'str',
>> +            'tag': 'str',
>> +            'vmstate': 'str',
>> +            'devices': ['str'] } }
>> +
>> +##
>> +# @snapshot-delete:
>> +#
>> +# Delete a VM snapshot
>> +#
>> +# @job-id: identifier for the newly created job
>> +# @tag: name of the snapshot to delete.
>> +# @devices: list of block device node names to delete a snapshot from
>> +#
>> +# Applications should not assume that the snapshot save is complete
>> +# when this command returns. The job commands / events must be used
>> +# to determine completion and to fetch details of any errors that arise.
>
> Do we have a query- command handy to easily learn which snapshot names
> are even available to attempt deletion on?  If not, that's worth a
> separate patch.

Oh, I missed that one.  It's the QMP equivalent to "info snapshots", and
it is required to finish the job.  Since we're at v5, I'd be okay with a
follow-up patch, as long as it is done for 5.2.

>> +#
>> +# Returns: nothing
>> +#
>> +# Example:
>> +#
>> +# -> { "execute": "snapshot-delete",
>> +#      "data": {
>> +#         "job-id": "snapdelete0",
>> +#         "tag": "my-snap",
>> +#         "devices": ["disk0", "disk1"]
>> +#      }
>> +#    }
>> +# <- { "return": { } }
>> +#
>> +# Since: 5.2
>> +##

With the doc comment nits addressed, and with new command
query-snapshots either included or promised for later in 5.2:
Acked-by: Markus Armbruster <armbru@redhat.com>

>
>> +++ b/tests/qemu-iotests/group
>> @@ -291,6 +291,7 @@
>>  277 rw quick
>>  279 rw backing quick
>>  280 rw migration quick
>> +310 rw quick
>>  281 rw quick
>>  282 rw img quick
>>  283 auto quick
>
> What's wrong with sorted order? I get the renumbering to appease a merge
> conflict, but it also requires rearrangement ;)




reply via email to

[Prev in Thread] Current Thread [Next in Thread]