From: Max Reitz
Subject: Re: [Qemu-devel] [PATCH v3 6/7] block/backup: teach backup_cow_with_bounce_buffer to copy more at once
Date: Mon, 12 Aug 2019 18:11:34 +0200
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:60.0) Gecko/20100101 Thunderbird/60.8.0

On 12.08.19 17:47, Vladimir Sementsov-Ogievskiy wrote:
> 12.08.2019 18:10, Max Reitz wrote:
>> On 10.08.19 21:31, Vladimir Sementsov-Ogievskiy wrote:
>>> backup_cow_with_offload can transfer more than one cluster. Let
>>> backup_cow_with_bounce_buffer behave similarly. It reduces the number
>>> of IO requests, since there is no need to copy cluster by cluster.
>>>
>>> Logic around bounce_buffer allocation changed: we can't just allocate a
>>> one-cluster-sized buffer to share for all iterations. We also can't
>>> allocate a buffer of full-request length, as it may be too large, so
>>> BACKUP_MAX_BOUNCE_BUFFER is introduced. Finally, the allocation logic is
>>> to allocate a buffer sufficient to handle all remaining iterations at the
>>> point where we need the buffer for the first time.
>>>
>>> Bonus: get rid of pointer-to-pointer.
>>>
>>> Signed-off-by: Vladimir Sementsov-Ogievskiy <address@hidden>
>>> ---
>>>   block/backup.c | 65 +++++++++++++++++++++++++++++++-------------------
>>>   1 file changed, 41 insertions(+), 24 deletions(-)
>>>
>>> diff --git a/block/backup.c b/block/backup.c
>>> index d482d93458..65f7212c85 100644
>>> --- a/block/backup.c
>>> +++ b/block/backup.c
>>> @@ -27,6 +27,7 @@
>>>   #include "qemu/error-report.h"
>>>   
>>>   #define BACKUP_CLUSTER_SIZE_DEFAULT (1 << 16)
>>> +#define BACKUP_MAX_BOUNCE_BUFFER (64 * 1024 * 1024)
>>>   
>>>   typedef struct CowRequest {
>>>       int64_t start_byte;
>>> @@ -98,44 +99,55 @@ static void cow_request_end(CowRequest *req)
>>>       qemu_co_queue_restart_all(&req->wait_queue);
>>>   }
>>>   
>>> -/* Copy range to target with a bounce buffer and return the bytes copied. If
>>> - * error occurred, return a negative error number */
>>> +/*
>>> + * Copy range to target with a bounce buffer and return the bytes copied. If
>>> + * error occurred, return a negative error number
>>> + *
>>> + * @bounce_buffer is assumed to enough to store
>>
>> s/to/to be/
>>
>>> + * MIN(BACKUP_MAX_BOUNCE_BUFFER, @end - @start) bytes
>>> + */
>>>   static int coroutine_fn backup_cow_with_bounce_buffer(BackupBlockJob *job,
>>>                                                         int64_t start,
>>>                                                         int64_t end,
>>>                                                         bool is_write_notifier,
>>>                                                         bool *error_is_read,
>>> -                                                      void **bounce_buffer)
>>> +                                                      void *bounce_buffer)
>>>   {
>>>       int ret;
>>>       BlockBackend *blk = job->common.blk;
>>> -    int nbytes;
>>> +    int nbytes, remaining_bytes;
>>>       int read_flags = is_write_notifier ? BDRV_REQ_NO_SERIALISING : 0;
>>>   
>>>       assert(QEMU_IS_ALIGNED(start, job->cluster_size));
>>> -    bdrv_reset_dirty_bitmap(job->copy_bitmap, start, job->cluster_size);
>>> -    nbytes = MIN(job->cluster_size, job->len - start);
>>> -    if (!*bounce_buffer) {
>>> -        *bounce_buffer = blk_blockalign(blk, job->cluster_size);
>>> -    }
>>> +    bdrv_reset_dirty_bitmap(job->copy_bitmap, start, end - start);
>>> +    nbytes = MIN(end - start, job->len - start);
>>>   
>>> -    ret = blk_co_pread(blk, start, nbytes, *bounce_buffer, read_flags);
>>> -    if (ret < 0) {
>>> -        trace_backup_do_cow_read_fail(job, start, ret);
>>> -        if (error_is_read) {
>>> -            *error_is_read = true;
>>> +
>>> +    remaining_bytes = nbytes;
>>> +    while (remaining_bytes) {
>>> +        int chunk = MIN(BACKUP_MAX_BOUNCE_BUFFER, remaining_bytes);
>>> +
>>> +        ret = blk_co_pread(blk, start, chunk, bounce_buffer, read_flags);
>>> +        if (ret < 0) {
>>> +            trace_backup_do_cow_read_fail(job, start, ret);
>>> +            if (error_is_read) {
>>> +                *error_is_read = true;
>>> +            }
>>> +            goto fail;
>>>           }
>>> -        goto fail;
>>> -    }
>>>   
>>> -    ret = blk_co_pwrite(job->target, start, nbytes, *bounce_buffer,
>>> -                        job->write_flags);
>>> -    if (ret < 0) {
>>> -        trace_backup_do_cow_write_fail(job, start, ret);
>>> -        if (error_is_read) {
>>> -            *error_is_read = false;
>>> +        ret = blk_co_pwrite(job->target, start, chunk, bounce_buffer,
>>> +                            job->write_flags);
>>> +        if (ret < 0) {
>>> +            trace_backup_do_cow_write_fail(job, start, ret);
>>> +            if (error_is_read) {
>>> +                *error_is_read = false;
>>> +            }
>>> +            goto fail;
>>>           }
>>> -        goto fail;
>>> +
>>> +        start += chunk;
>>> +        remaining_bytes -= chunk;
>>>       }
>>>   
>>>       return nbytes;
>>> @@ -301,9 +313,14 @@ static int coroutine_fn backup_do_cow(BackupBlockJob *job,
>>>               }
>>>           }
>>>           if (!job->use_copy_range) {
>>> +            if (!bounce_buffer) {
>>> +                size_t len = MIN(BACKUP_MAX_BOUNCE_BUFFER,
>>> +                                 MAX(dirty_end - start, end - dirty_end));
>>> +                bounce_buffer = blk_try_blockalign(job->common.blk, len);
>>> +            }
>>
>> If you use _try_, you should probably also check whether it succeeded.
> 
> Oops, you are right, of course.
> 
>>
>> Anyway, I wonder whether it’d be better to just allocate this buffer
>> once per job (the first time we get here, probably) to be of size
>> BACKUP_MAX_BOUNCE_BUFFER and put it into BackupBlockJob.  (And maybe add
>> a buf-size parameter similar to what the mirror jobs have.)
>>
> 
> Once per job will not work, as we may have several guest writes in parallel
> and therefore several parallel copy-before-write operations.

Hm.  I’m not quite happy with that, because if the guest just issues many
large discards in parallel, qemu will allocate a large amount of memory.

It would be nice if there were a simple way to keep track of the total
memory usage and let requests yield when they would exceed it.
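
Just to illustrate what I mean (a rough sketch only, not something I’m
asking for in this patch; the field and helper names below are made up):
each copy-before-write request could charge its bounce-buffer size against
a per-job counter and yield on a CoQueue until enough memory has been
released:

/*
 * Sketch: total_bounce_bytes, bounce_available and
 * BACKUP_TOTAL_BOUNCE_LIMIT are hypothetical additions to BackupBlockJob
 * (the CoQueue would be set up with qemu_co_queue_init() when the job is
 * created).  Callers are assumed to request at most
 * BACKUP_MAX_BOUNCE_BUFFER bytes at a time.
 */

#define BACKUP_TOTAL_BOUNCE_LIMIT (4 * BACKUP_MAX_BOUNCE_BUFFER)

static void *coroutine_fn backup_acquire_bounce_buffer(BackupBlockJob *job,
                                                       size_t len)
{
    /* Yield until this allocation fits under the job-wide cap */
    while (job->total_bounce_bytes + len > BACKUP_TOTAL_BOUNCE_LIMIT) {
        qemu_co_queue_wait(&job->bounce_available, NULL);
    }
    job->total_bounce_bytes += len;

    return blk_blockalign(job->common.blk, len);
}

static void coroutine_fn backup_release_bounce_buffer(BackupBlockJob *job,
                                                      void *buf, size_t len)
{
    qemu_vfree(buf);
    job->total_bounce_bytes -= len;

    /* Wake requests that are waiting for memory to be released */
    qemu_co_queue_restart_all(&job->bounce_available);
}

backup_do_cow() could then use such helpers instead of calling
blk_try_blockalign()/qemu_vfree() directly, but again, just a sketch.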

> Or, if you mean writing an allocator based on a once-allocated buffer like
> in mirror, I really dislike this idea, as we already have an allocator:
> memalign/malloc/free, and it works well; there is no reason to invent a new
> one and hardcode it into the block layer (see my answer to Eric on v2 of
> this patch for more info).

Well, at least it’d be something we can delay until blockdev-copy
arrives(TM).

Max

> Or, if you mean only the copy requests generated by backup_loop, yes, we
> may keep only one buffer for them, but:
> 1. that is not how it works now, so my patch is not a degradation in this case
> 2. I'm going to parallelize the backup loop too, like in my series "qcow2:
>    async handling of fragmented io", which will need several allocated
>    buffers anyway.
> 
>>
>>>               ret = backup_cow_with_bounce_buffer(job, start, dirty_end,
>>>                                                   is_write_notifier,
>>> -                                                error_is_read, &bounce_buffer);
>>> +                                                error_is_read, bounce_buffer);
>>>           }
>>>           if (ret < 0) {
>>>               break;
>>>
>>
>>
> 
> 

