


From: John Snow
Subject: Re: [Qemu-block] [PATCH v2 2/2] docs/interop/bitmaps: rewrite and modernize doc
Date: Tue, 23 Apr 2019 14:03:00 -0400
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:60.0) Gecko/20100101 Thunderbird/60.6.1


On 4/23/19 1:58 PM, Vladimir Sementsov-Ogievskiy wrote:
> 23.04.2019 1:17, John Snow wrote:
>> This just about rewrites the entirety of the bitmaps.rst document to
>> make it consistent with the 4.0 release. I have added new features seen
>> in the 4.0 release, as well as tried to clarify some points that keep
>> coming up when discussing this feature both in-house and upstream.
>>
>> Yes, it's a lot longer, mostly due to examples. I get a bit chatty.
>> I could use a good editor to help rein in my chattiness.
>>
>> It does not yet cover pull backups or migration details, but I intend to
>> keep extending this document to cover those cases.
>>
>> Please try compiling it with sphinx and look at the rendered output, I
>> don't have hosting to share my copy at present. I think this new layout
>> reads nicer in the HTML format than the old one did, at the expense of
>> looking less readable in the source tree itself (though not completely
>> unmanagable. We did decide to convert it from Markdown to ReST, after
>> all, so I am going all-in on ReST.)
>>
>> Signed-off-by: John Snow <address@hidden>
>> ---
> 
> [..]
> 
>> +Bitmap Status
>> +-------------
>> +
>> +Dirty Bitmap objects can be queried with the QMP command `query-block
>> +<qemu-qmp-ref.html#index-query_002dblock>`_, and are visible via the
>> +`BlockDirtyInfo <qemu-qmp-ref.html#index-BlockDirtyInfo>`_ QAPI structure.
>> +
>> +This struct shows the name, granularity, and dirty byte count for each bitmap.
>> +Additionally, it shows several boolean status indicators:
>> +
>> +- ``recording``: This bitmap is recording writes.
>> +- ``busy``: This bitmap is in-use by an operation.
>> +- ``persistent``: This bitmap is a persistent type.
>> +- ``inconsistent``: This bitmap is corrupted and cannot be used.
>> +
>> +The ``+busy`` status prohibits you from deleting, clearing, or otherwise
>> +modifying a bitmap, and happens when the bitmap is being used for a backup
>> +operation or is in the process of being loaded from a migration. Many of the
>> +commands documented below will refuse to work on such bitmaps.
>> +
>> +The ``+inconsistent`` status similarly prohibits almost all operations,
>> +notably allowing only the ``block-dirty-bitmap-remove`` operation.
>> +
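
As an aside, the access rules in that hunk can be sketched roughly like this
(hypothetical Python, not QEMU code; `operation_permitted` is an invented
name used only for illustration):

```python
# Rough sketch of the rules described above: a +busy bitmap rejects all
# modifications, and a +inconsistent bitmap allows only removal.
# Illustrative only; this is not how QEMU implements the checks.

def operation_permitted(bitmap, operation):
    """Return True if `operation` may act on `bitmap` (toy model)."""
    if bitmap.get("busy"):
        return False  # busy bitmaps can't be deleted, cleared, or modified
    if bitmap.get("inconsistent"):
        # inconsistent bitmaps permit only removal
        return operation == "block-dirty-bitmap-remove"
    return True
```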
>> +There is also a deprecated "`DirtyBitmapStatus
>> +<qemu-qmp-ref.html#index-DirtyBitmapStatus>`_" field. A bitmap historically
> 
> I think, better s/DirtyBitmapStatus/status/, to use field name, not type for
> what to call "field".
> 
> 
>> +had five visible states:
>> +
>> +   #. ``Frozen``: This bitmap is currently in-use by an operation and is
>> +      immutable. It can't be deleted, renamed, reset, etc.
>> +
>> +      (This is now ``+busy``.)
>> +
>> +   #. ``Disabled``: This bitmap is not recording new writes.
>> +
> 
> [..]
> 
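
For the two legacy states visible in the hunk above, the correspondence to
the new boolean indicators could be sketched like so (hypothetical helper;
the remaining elided states are not covered here):

```python
# Toy mapping from the two deprecated status values quoted above to the new
# boolean indicators. Illustrative only; not QEMU code, and deliberately
# limited to the states shown in the quoted hunk.

def flags_from_legacy_status(status):
    mapping = {
        # Frozen: in-use by an operation and immutable -> now "+busy"
        "frozen":   {"recording": True,  "busy": True},
        # Disabled: not recording new writes
        "disabled": {"recording": False, "busy": False},
    }
    return mapping[status]
```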
>> +Example: Incremental Push Backups without Backing Files
>> +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
>> +
>> +Backup images are best kept off-site, so we often will not have the preceding
>> +backups in a chain available to link against. This is not a problem at backup
>> +time; we simply do not set the backing image when creating the destination
>> +image:
>> +
>> +#. Create a new destination image with no backing file set. We will need to
>> +   specify the size of the base image this time, because it isn't available
>> +   for QEMU to use to guess:
> 
> hmm, s/to use to guess/to guess/ ?
> 

Good point.
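
The "size is mandatory without a backing file" rule could be sketched like
this (hypothetical helper that merely assembles a ``qemu-img create``
command line; `qemu_img_create_argv` is an invented name):

```python
# Illustrative sketch: build a `qemu-img create` argv. When no backing file
# is given, an explicit size must be supplied, since QEMU cannot infer it
# from a backing image. Not a real tool; invented for illustration.

def qemu_img_create_argv(target, fmt="qcow2", backing=None, size=None):
    argv = ["qemu-img", "create", "-f", fmt, target]
    if backing is not None:
        # backing image supplies the virtual size
        argv += ["-b", backing, "-F", fmt]
    elif size is None:
        raise ValueError("size is required when no backing file is set")
    if size is not None:
        argv.append(size)
    return argv
```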

>>   
>>      .. code:: bash
>>   
> 
> [..]
> 
>> +Example: Individual Failures
>> +~~~~~~~~~~~~~~~~~~~~~~~~~~~~
>> +
>> +Backup jobs that fail individually behave simply as described above. This
>> +example shows the simplest case:
> 
> Hmm, I think the simplest would be full backup without mode=existing :),
> so suggest just s/This example.../Consider example below:/, or 
> s/Backup/Incremental backup/
> 
>> +
>> +#. Create a target image:
>> +
>> +   .. code:: bash
>> +
>> +       $ qemu-img create -f qcow2 drive0.inc0.qcow2 \
>> +         -b drive0.full.qcow2 -F qcow2
>> +
>> +#. Attempt to create an incremental backup via QMP:
>> +
>> +   .. code:: json
>> +
>> +    -> {
>> +         "execute": "drive-backup",
>> +         "arguments": {
>> +           "device": "drive0",
>> +           "bitmap": "bitmap0",
>> +           "target": "drive0.inc0.qcow2",
>> +           "format": "qcow2",
>> +           "sync": "incremental",
>> +           "mode": "existing"
>> +         }
>> +       }
>> +
>> +    <- { "return": {} }
>> +
>> +#. Receive a pair of events indicating failure:
>> +
>> +   .. code:: json
>> +
>> +    <- {
>> +         "timestamp": {...},
>> +         "data": {
>> +           "device": "drive0",
>> +           "action": "report",
>> +           "operation": "read"
>> +         },
>> +         "event": "BLOCK_JOB_ERROR"
>> +       }
>> +
>> +    <- {
>> +         "timestamp": {...},
>> +         "data": {
>> +           "speed": 0,
>> +           "offset": 0,
>> +           "len": 67108864,
>> +           "error": "No space left on device",
>> +           "device": "drive0",
>> +           "type": "backup"
>> +         },
>> +         "event": "BLOCK_JOB_COMPLETED"
>> +       }
> 
> a bit strange to get ENOSPC on read (understand that you use blkdebug),
> consider s/read/write/ just for beauty
> 

You are absolutely right, this doesn't make sense in the context of
something you'd see in the wild.
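
For what it's worth, the way a management application would interpret that
event pair can be sketched as (hypothetical Python; `job_succeeded` is an
invented name, and the only assumption is the one the doc already makes,
that a failed job's BLOCK_JOB_COMPLETED carries an "error" member):

```python
# Toy classifier for the events quoted above: a BLOCK_JOB_COMPLETED event
# whose "data" contains an "error" member means the job failed, so its
# dirty bitmap was not cleared. Illustrative only.

def job_succeeded(completed_event):
    """Return True if a BLOCK_JOB_COMPLETED event indicates success."""
    return "error" not in completed_event["data"]
```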

>>   
>> -4. Delete the failed incremental, and re-create the image.
>> +#. Delete the failed image, and re-create it.
>>   
>>      .. code:: bash
> 
> [..]
> 
>> +
>> +Example: Partial Transactional Failures
>> +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
>> +
>> +QMP commands like `drive-backup <qemu-qmp-ref.html#index-drive_002dbackup>`_
>> +conceptually only start a job, and so transactions containing these commands
>> +may succeed even if the jobs they create later fail. This might have surprising
>> +interactions with notions of how a "transaction" ought to behave.
>> +
>> +This distinction means that on occasion, a transaction containing such
>> +job-launching commands may appear to succeed and return success, but later
>> +individual jobs associated with the transaction may fail. It is possible that
>> +a management application may have to deal with a partial backup failure after
>> +a "successful" transaction.
>> +
>> +If multiple backup jobs are specified in a single transaction and one of those
>> +jobs fails, it will not interact with the other backup jobs in any way by
>> +default. The job(s) that succeeded will clear the dirty bitmap associated with
>> +the operation, but the job(s) that failed will not. It is therefore not safe
>> +to delete any incremental backups that were created successfully in this
>> +scenario, even though others failed.
>> +
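
The default per-job semantics described above can be modeled as a toy sketch
(hypothetical Python, not QEMU's implementation): each job clears its own
bitmap only if that job succeeds.

```python
# Toy model of non-grouped transaction semantics: jobs are independent, and
# only a successful job clears its associated dirty bitmap. Illustrative.

def run_transaction(jobs):
    """jobs: dict mapping bitmap name -> whether that job succeeds.
    Returns the set of bitmaps that end up cleared."""
    cleared = set()
    for bitmap, succeeds in jobs.items():
        if succeeds:
            cleared.add(bitmap)  # successful job clears its bitmap
        # a failing job leaves its bitmap intact, so the backup can be retried
    return cleared
```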
>> +This example illustrates a transaction with two backup jobs, where one fails
>> +and one succeeds:
>> +
>> +#. Issue the transaction to start a backup of both drives. Note that the
>> +   transaction is accepted, indicating that the jobs are started successfully.
> 
> And now this "note" looks strange, as it related not to this list item but to 
> the next.
> Actually, we can't "note" that in context of this list item.
> 
>>   
>>      .. code:: json
>>   
>> -       { "execute": "transaction",
>> +    -> {
>> +         "execute": "transaction",
>>            "arguments": {
>>              "actions": [
>> -             { "type": "drive-backup",
>> -               "data": { "device": "drive0", "bitmap": "bitmap0",
>> -                         "format": "qcow2", "mode": "existing",
>> -                         "sync": "incremental", "target": "d0-incr-1.qcow2" } },
>> -             { "type": "drive-backup",
>> -               "data": { "device": "drive1", "bitmap": "bitmap1",
>> -                         "format": "qcow2", "mode": "existing",
>> -                         "sync": "incremental", "target": "d1-incr-1.qcow2" } },
>> -           ]
>> +           {
>> +             "type": "drive-backup",
>> +             "data": {
>> +               "device": "drive0",
>> +               "bitmap": "bitmap0",
>> +               "format": "qcow2",
>> +               "mode": "existing",
>> +               "sync": "incremental",
>> +               "target": "drive0.inc0.qcow2"
>> +             }
>> +           },
>> +           {
>> +             "type": "drive-backup",
>> +             "data": {
>> +               "device": "drive1",
>> +               "bitmap": "bitmap0",
>> +               "format": "qcow2",
>> +               "mode": "existing",
>> +               "sync": "incremental",
>> +               "target": "drive1.inc0.qcow2"
>> +             }
>> +           }]
>>            }
>>          }
>>   
>> --  QMP example response, highlighting one success and one failure:
>> +#. Receive notice that the Transaction was accepted, and jobs were
>> +   launched:
>>   
>> -   -  Acknowledgement that the Transaction was accepted and jobs were
>> -      launched:
>> +   .. code:: json
>> +
>> +    <- { "return": {} }
>> +
>> +#. Receive notice that the first job has completed:
>>   
> 
> [..]
> 
>> -      .. code:: json
>> +At the conclusion of the above example, ``drive0.inc0.qcow2`` is valid and
>> +must be kept, but ``drive1.inc0.qcow2`` is incomplete and should be
>> +deleted. If a VM-wide incremental backup of all drives at a point-in-time is
>> +to be made, new backups for both drives will need to be made, taking into
>> +account that a new incremental backup for drive0 needs to be based on top of
>> +``drive0.inc0.qcow2``.
>>   
>> -          { "timestamp": { "seconds": 1447192399, "microseconds": 683015 },
>> -            "data": { "device": "drive1", "action": "report",
>> -                      "operation": "read" },
>> -            "event": "BLOCK_JOB_ERROR" }
>> +In other words, at the conclusion of the above example, we'd have made only an
>> +incremental backup for drive0 but not drive1. The last VM-wide
>> +crash-consistent backup we have access to in this case is the anchor point.
> 
> Not sure about "anchor point", can't be it applied to incremental backup point?
> May be just "full-backup point" or something like this?
> 

You're right, I don't really use this term too consistently. I always
refer to them as "anchors" because they're at "the bottom of the chain"
but I should at least explain this terminology if I am going to use it.
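
What I mean by "anchor" is just the bottom of the backing chain, which could
be sketched like this (hypothetical Python; `anchor_of` is an invented name):

```python
# Toy model of a backup chain: follow backing-file links downward until an
# image with no backing file is reached. That bottom image -- the full
# backup everything else is layered on -- is what I call the "anchor".
# Illustrative only.

def anchor_of(backing, image):
    """backing: dict mapping image -> its backing file (or None)."""
    while backing.get(image) is not None:
        image = backing[image]
    return image
```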

>>   
>> -      .. code:: json
>> +.. code:: text
>>   
>> -          { "timestamp": { "seconds": 1447192399, "microseconds":
>> -          685853 }, "data": { "speed": 0, "offset": 0, "len": 67108864,
>> -          "error": "Input/output error", "device": "drive1", "type":
>> -          "backup" }, "event": "BLOCK_JOB_COMPLETED" }
>> +          [drive0.full.qcow2] <-- [drive0.inc0.qcow2]
>> +          [drive1.full.qcow2]
>>   
>> --  In the above example, ``d0-incr-1.qcow2`` is valid and must be kept,
>> -   but ``d1-incr-1.qcow2`` is invalid and should be deleted. If a VM-wide
>> -   incremental backup of all drives at a point-in-time is to be made,
>> -   new backups for both drives will need to be made, taking into account
>> -   that a new incremental backup for drive0 needs to be based on top of
>> -   ``d0-incr-1.qcow2``.
>> +To repair this, issue a new incremental backup across both drives. The result
>> +will be backup chains that resemble the following:
>>   
> 
> [..]
> 
> With or without my nit-picking suggestions:
> 
> Reviewed-by: Vladimir Sementsov-Ogievskiy <address@hidden>
> 

They're good suggestions, so I will revise them. Thank you for reviewing
such a large document!


