Re: [Qemu-devel] [RFC] Generic image streaming
From: Zhi Yong Wu
Subject: Re: [Qemu-devel] [RFC] Generic image streaming
Date: Tue, 27 Sep 2011 11:26:45 +0800
On Fri, Sep 23, 2011 at 11:57 PM, Stefan Hajnoczi
<address@hidden> wrote:
> Here is my generic image streaming branch, which aims to provide a way
> to copy the contents of a backing file into an image file of a running
> guest without requiring specific support in the various block drivers
> (e.g. qcow2, qed, vmdk):
>
> http://repo.or.cz/w/qemu/stefanha.git/shortlog/refs/heads/image-streaming-api
>
> The tree does not provide full image streaming yet but I'd like to
> discuss the approach taken in the code. Here are the main points:
>
> The image streaming API is available through HMP and QMP commands. When
> streaming is started on a block device a coroutine is created to do the
> background I/O work. The coroutine can be cancelled.
>
> While the coroutine copies data from the backing file into the image
> file, the guest may be performing I/O to the image file. Guest reads do
> not conflict with streaming but guest writes require special handling.
> If the guest writes to a region of the image file that we are currently
> copying, then there is the potential to clobber the guest write with old
> data from the backing file.
>
> Previously I solved this in a QED-specific way by taking advantage of
> the serialization of allocating write requests. In order to do this
> generically we need to track in-flight requests and have the ability to
> queue I/O. Guest writes that affect an in-flight streaming copy
> operation must wait for that operation to complete before being issued.
> Streaming copy operations must skip overlapping regions of guest writes.
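The two rules in the quoted paragraph above can be illustrated with a small overlap check. This is only a sketch under assumptions: the TrackedRequest type, the ReqType enum and both helper functions are hypothetical names invented for illustration, not code from the branch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical types for illustration only; not taken from the QEMU tree. */
typedef enum { REQ_GUEST_WRITE, REQ_STREAM_COPY } ReqType;

typedef struct {
    ReqType type;
    int64_t sector_num;   /* first sector covered by the request */
    int nb_sectors;       /* length of the request in sectors */
} TrackedRequest;

/* True if the half-open ranges [a, a + an) and [b, b + bn) intersect. */
static bool ranges_overlap(int64_t a, int an, int64_t b, int bn)
{
    return a < b + bn && b < a + an;
}

/* True if any in-flight request of the given type overlaps the range. */
static bool overlaps_inflight(const TrackedRequest *inflight, size_t n,
                              ReqType type, int64_t sector_num,
                              int nb_sectors)
{
    for (size_t i = 0; i < n; i++) {
        if (inflight[i].type == type &&
            ranges_overlap(inflight[i].sector_num, inflight[i].nb_sectors,
                           sector_num, nb_sectors)) {
            return true;
        }
    }
    return false;
}

/* A guest write waits while a streaming copy of the same region is active. */
static bool guest_write_must_wait(const TrackedRequest *inflight, size_t n,
                                  int64_t sector_num, int nb_sectors)
{
    return overlaps_inflight(inflight, n, REQ_STREAM_COPY,
                             sector_num, nb_sectors);
}

/* A streaming copy skips regions that a guest write is currently touching. */
static bool stream_copy_must_skip(const TrackedRequest *inflight, size_t n,
                                  int64_t sector_num, int nb_sectors)
{
    return overlaps_inflight(inflight, n, REQ_GUEST_WRITE,
                             sector_num, nb_sectors);
}
```

A real implementation would also need the queuing side (putting the waiting write to sleep and waking it when the copy completes), which this sketch leaves out.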
>
> One big difference to the QED image streaming implementation is that
> this generic implementation is not based on copy-on-read operations.
> Instead we do a sequence of bdrv_is_allocated() to find regions for
> streaming, followed by bdrv_co_read() and bdrv_co_write() in order to
> populate the image file.

Why does the API not use bdrv_aio_readv/writev? In your branch, it seems
that only bdrv_read/write are modified. Does your branch currently
support only synchronous read/write?
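To make the is-allocated/read/write sequence from the quoted paragraph concrete, here is a toy in-memory model of it. It is a sketch under assumptions, not the branch's code: ToyImage, toy_is_allocated() and toy_stream() are hypothetical stand-ins, and the real bdrv_* functions are not called.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define SECTOR_SIZE 512
#define NB_SECTORS  8

/* Toy model for illustration: an image file with a per-sector allocation
 * map, layered on top of a backing file. */
typedef struct {
    uint8_t backing[NB_SECTORS][SECTOR_SIZE];
    uint8_t image[NB_SECTORS][SECTOR_SIZE];
    bool allocated[NB_SECTORS];
} ToyImage;

/* Hypothetical stand-in for an allocation query: returns whether `sector`
 * is allocated in the image and stores in *pnum how many consecutive
 * sectors (at most nb) share that state. */
static bool toy_is_allocated(const ToyImage *img, int sector, int nb,
                             int *pnum)
{
    bool state = img->allocated[sector];
    int n = 1;

    while (n < nb && img->allocated[sector + n] == state) {
        n++;
    }
    *pnum = n;
    return state;
}

/* Walk the image, copying every unallocated run from the backing file
 * into the image. Returns the number of sectors copied. */
static int toy_stream(ToyImage *img)
{
    int copied = 0;
    int sector = 0;

    while (sector < NB_SECTORS) {
        int n;

        if (!toy_is_allocated(img, sector, NB_SECTORS - sector, &n)) {
            for (int i = 0; i < n; i++) {
                /* "Read" from the backing file, "write" to the image. */
                memcpy(img->image[sector + i], img->backing[sector + i],
                       SECTOR_SIZE);
                img->allocated[sector + i] = true;
            }
            copied += n;
        }
        sector += n;   /* allocated runs are skipped untouched */
    }
    return copied;
}
```

Sectors that the image already owns are never overwritten, so a second pass over a fully streamed image copies nothing.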
>
> It turns out that generic copy-on-read is not an attractive operation
> because it requires using bounce buffers for every request. Kevin
> pointed out the case where a guest performs a read and pokes the data
> buffer before the read completes: without a bounce buffer, copy-on-read
> would write the modified memory out to the image file.
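The bounce-buffer point can be shown with a deliberately simplified, single-threaded model. Everything here is an assumption made for illustration: CorRequest, cor_read_start() and cor_read_finish() are hypothetical names, and the guest "poke" is simulated by the caller modifying its buffer between the two steps.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define LEN 16

/* Hypothetical copy-on-read request that owns a private bounce buffer. */
typedef struct {
    uint8_t bounce[LEN];
} CorRequest;

/* Read from the backing file into the bounce buffer first, then hand a
 * copy to the guest. The guest never gets a pointer to the bounce buffer. */
static void cor_read_start(CorRequest *req, const uint8_t *backing,
                           uint8_t *guest_buf)
{
    memcpy(req->bounce, backing, LEN);
    memcpy(guest_buf, req->bounce, LEN);
}

/* Populate the image from the bounce buffer, so whatever the guest did to
 * its own buffer in the meantime cannot reach the image file. */
static void cor_read_finish(const CorRequest *req, uint8_t *image)
{
    memcpy(image, req->bounce, LEN);
}
```

The cost is visible even in this toy: every request pays for an extra buffer and an extra copy, which is the overhead the quoted paragraph objects to.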
>
> There are a few pieces missing in my tree, which have mostly been solved
> in other places and just need to be reused:
> 1. Arbitration between guest and streaming requests (this is the only
> real new thing).
> 2. Efficient zero handling (skip writing those regions or mark them as
> zero clusters).
> 3. Queuing/dependencies when arbitration decides a request must wait.
> I'm taking a look at reusing Zhi Yong's block queue.
> 4. Rate-limiting to ensure streaming I/O does not impact the guest.
> Already exists in the QED-specific patches, it may make sense to
> extract common code that both migration and the block layer can use.
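For item 4 in the quoted list, a slice-based limiter is one possible shape for such common code. This is a sketch under assumptions: the RateLimit struct and ratelimit_delay() are hypothetical names, the time is passed in by the caller, and nothing here is taken from the QED patches or the migration code.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical slice-based rate limiter, for illustration only. */
typedef struct {
    int64_t slice_start_ns;   /* start time of the current slice */
    int64_t slice_ns;         /* length of one slice */
    uint64_t quota;           /* sectors allowed per slice */
    uint64_t dispatched;      /* sectors already issued in this slice */
} RateLimit;

/* Returns 0 if n sectors may be issued now (and accounts for them),
 * otherwise the number of nanoseconds to wait until the next slice. */
static int64_t ratelimit_delay(RateLimit *rl, int64_t now_ns, uint64_t n)
{
    if (now_ns - rl->slice_start_ns >= rl->slice_ns) {
        /* The previous slice has expired: start a fresh one. */
        rl->slice_start_ns = now_ns;
        rl->dispatched = 0;
    }

    /* The first request of a slice always passes, so a request larger
     * than the quota cannot stall forever. */
    if (rl->dispatched == 0 || rl->dispatched + n <= rl->quota) {
        rl->dispatched += n;
        return 0;
    }

    return rl->slice_start_ns + rl->slice_ns - now_ns;
}
```

A streaming coroutine would call this before each copy and sleep for the returned delay when it is nonzero, which keeps background I/O within the configured budget.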
>
> Ideas or questions?
>
> Stefan
>
>
--
Regards,
Zhi Yong Wu