Re: how to improve qcow performance?
From: Nir Soffer
Subject: Re: how to improve qcow performance?
Date: Wed, 21 Jul 2021 16:37:40 +0300
On Wed, Jul 21, 2021 at 3:20 PM Geraldo Netto <geraldonetto@gmail.com> wrote:
>
> Dear Nir/Friends,
>
> On Tue, 20 Jul 2021 at 11:34, Nir Soffer <nsoffer@redhat.com> wrote:
> >
> > On Thu, Jul 15, 2021 at 2:33 PM Geraldo Netto <geraldonetto@gmail.com>
> > wrote:
> > >
> > > Dear Friends,
> > >
> > > I beg your pardon for such a newbie question
> > > But I would like to better understand how to improve the qcow performance
> >
> > I guess you mean how to improve "qcow2" performance. If you use "qcow"
> > format the best way is to switch to "qcow2".
>
> I read here [1] there was a qcow3, but it seems that page is
> deprecated (last update on sept. 2016)
QCOW3 is qcow2 v3. There is a version field in the format, but for
some reason it is not exposed in qemu-img info.
You can inspect the headers with this minimal qcow2 parser:
https://github.com/nirs/qcow2-parser
$ python3 qcow2.py /var/tmp/fedora-32.qcow2
{
"backing_file_offset": 0,
"backing_file_size": 0,
"cluster_bits": 16,
"compatible_features": 0,
"crypt_method": 0,
"header_length": 0,
"incompatible_features": 0,
"l1_size": 12,
"l1_table_offset": 196608,
"magic": 1363560955,
"nb_snapshots": 0,
"refcount_order": 0,
"refcount_table_clusters": 1,
"refcount_table_offset": 65536,
"size": 6442450944,
"snapshots_offset": 0,
"version": 3
}
$ qemu-img info /var/tmp/fedora-32.qcow2
image: /var/tmp/fedora-32.qcow2
file format: qcow2
virtual size: 6 GiB (6442450944 bytes)
disk size: 1.55 GiB
cluster_size: 65536
Format specific information:
compat: 1.1
compression type: zlib
lazy refcounts: false
refcount bits: 16
corrupt: false
extended l2: false
You can use "compat: 1.1" to identify qcow2 v3. qcow2 v2 has "compat: 0.10".
I think there is a complete parser elsewhere that gives more info.
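
For illustration only, here is a minimal sketch of reading the version
field straight from the header. It is not the code of the parser linked
above; the function names and the returned keys are made up for this
example. It assumes the standard qcow2 layout, where the header starts
with big-endian magic, version, backing_file_offset, backing_file_size,
cluster_bits and size.

```python
import struct

QCOW2_MAGIC = 0x514649FB  # b"QFI\xfb", 1363560955 in decimal

# Map the header version field to the "compat" value shown by qemu-img info.
COMPAT_BY_VERSION = {2: "0.10", 3: "1.1"}


def parse_qcow2_header(data):
    """Parse the first 32 bytes of a qcow2 header (all fields big-endian).

    Hypothetical helper for this example, not part of any library.
    """
    magic, version, _backing_offset, _backing_size, cluster_bits, size = (
        struct.unpack(">IIQIIQ", data[:32]))
    if magic != QCOW2_MAGIC:
        raise ValueError("not a qcow2 image")
    return {
        "version": version,
        "compat": COMPAT_BY_VERSION.get(version, "unknown"),
        "cluster_size": 1 << cluster_bits,
        "size": size,
    }


def read_qcow2_header(path):
    """Read and parse the start of the header from an image file."""
    with open(path, "rb") as f:
        return parse_qcow2_header(f.read(32))
```

Calling read_qcow2_header() on an image path returns a small dict. The
"cluster_size" entry is 1 << cluster_bits, which is how cluster_bits 16
corresponds to cluster_size 65536 in the outputs above.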
> > > I was checking the qemu-img and it seems that the following parameters
> > > are the most relevant to optimise the performance, no?
> > >
> > > 'cache' is the cache mode used to write the output disk image, the valid
> > > options are: 'none', 'writeback' (default, except for convert),
> > > 'writethrough',
> > > 'directsync' and 'unsafe' (default for convert)
> > >
> > > Should I infer that directsync means bypass all the stack and write
> > > directly to the disk?
> >
> > 'directsync' is using direct I/O, but calls fsync() for every write. This is
> > the slowest way and does not make sense for converting images.
> >
> > 'none' uses direct I/O (O_DIRECT). This enables native async I/O (libaio)
> > which can give better performance in some cases.
> >
> > 'writeback' uses the page cache, considering the write complete when the
> > data is in the page cache, and reading data from the page cache. This is
> > likely to give the best performance, but is also likely to give inconsistent
> > performance and cause trouble for other applications.
> >
> > The kernel will write a huge amount of data to the page cache, and from time
> > to time try to flush a huge amount of data, which can cause long delays in
> > other processes accessing the same storage. It also pollutes the page cache
> > with data that may not be needed after the image is converted, for example
> > when you convert an image on one host, writing to shared storage, and the
> > image is used later on another host.
> >
> > 'writethrough' seems to use the page cache, but it reports writes only after
> > data is flushed, so it will be as slow as 'directsync' for writing, and can
> > cause the same issues with the page cache as 'writeback'.
> >
> > 'unsafe' (default for convert) means writes are never flushed to disk, which
> > is unsafe when used in the VM's -drive option, but completely safe when used
> > in qemu-img convert, since qemu-img completes the operation with fsync().
> >
> > The most important option for performance is -W (unordered writes).
> > For writing to block devices, it is up to 6 times faster. But it can cause
> > fragmentation so you may get faster copies but accessing the image
> > later may be slower.
>
> I see! Now I get it
>
> > Check this for example of -W usage:
> > https://bugzilla.redhat.com/1511891#c57
> >
> > Finally there is the -m option - the default value (8) gives good
> > performance, but using -m 16 can be a little faster.
> >
> > > 'src_cache' is the cache mode used to read input disk images, the valid
> > > options are the same as for the 'cache' option
> > >
> > > I didn't follow where I should look to check the 'cache' options :'(
> >
> > -t CACHE
> > Specifies the cache mode that should be used with the
> > (destination) file.
> > See the documentation of the emulator's -drive cache=...
> > option for allowed values.
> >
> > "See the documentation of the emulator's -drive cache=" means see qemu(1).
> >
> > > I guess that using smaller files gives better performance due to the
> > > reduced amount of metadata to handle?
> >
> > What do you mean by smaller files?
>
> I mean, reducing the size of each qcow image and distributing the images
> among different NAS devices would reduce the pressure of metadata updates
> on each qcow image, and that would result in better performance, no?
> (it's just an intuition)
I'm not sure how using different NAS will reduce metadata updates.
> Just to describe the scenario, we have an all-cloud environment using
> kubernetes with longhorn, and behind the scenes there are qcow images mapped
> to each block device exposed on kubernetes.
> We are studying ways to optimise it, and especially to replace the NFS
> architecture that we have now (too slow for our needs).
Adding Gal, he works on a similar project, hosting openshift on top of RHV.
I know they had performance issues with qcow2 images on NFS, and had
better performance with raw images served by RHV iSCSI storage.
In general if you don't need the features qcow2 adds, like snapshots and
incremental backup, using raw images will give better performance.
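
If you want to try raw, the options discussed earlier in this thread can
be put together roughly like this. This is only a sketch: the helper
name, its defaults and the paths are made up for this example, and the
best values depend on your storage. It uses only the qemu-img convert
options mentioned above (-t, -T for src_cache, -W, -m) plus -f/-O for
the source and destination formats.

```python
def build_convert_cmd(src, dst, src_format="qcow2", dst_format="raw",
                      cache="none", src_cache="none", coroutines=16,
                      unordered=True):
    """Build a qemu-img convert command line as a list of arguments.

    Hypothetical helper: the defaults are one possible choice, not a
    recommendation for every setup.
    """
    cmd = [
        "qemu-img", "convert",
        "-f", src_format,      # source image format
        "-O", dst_format,      # destination image format
        "-t", cache,           # cache mode for writing the destination
        "-T", src_cache,       # cache mode for reading the source
        "-m", str(coroutines)  # default is 8, 16 can be a little faster
    ]
    if unordered:
        # Unordered writes: faster copy, but may fragment the image.
        cmd.append("-W")
    cmd += [src, dst]
    return cmd
```

For example, build_convert_cmd("/var/tmp/src.qcow2", "/var/tmp/dst.raw")
returns a list you can pass to subprocess.run(cmd, check=True).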
Nir