On Wed, Mar 03, 2021 at 01:47:06PM -0500, Jason Dillaman wrote:
> On Wed, Mar 3, 2021 at 12:41 PM Stefano Garzarella <sgarzare@redhat.com> wrote:
> >
> > Hi Jason,
> > as reported in this BZ [1], when qemu-img creates a QCOW2 image on RBD
> > writing data is very slow compared to a raw file.
> >
> > Comparing raw vs QCOW2 image creation with RBD, I found that we use a
> > different object size: for the raw file I see '4 MiB objects', for QCOW2
> > I see '64 KiB objects', as reported in comment 14 [2].
> > This is probably the main cause of the slowness; indeed, forcing a 4 MiB
> > object size in the code for QCOW2 as well increased the speed a lot.
> >
> > Looking more closely, I discovered that for raw files we call
> > rbd_create() with obj_order = 0 (if the 'cluster_size' option is not
> > defined), so the default object size is used.
> > For QCOW2, instead, we use obj_order = 16, since the default
> > 'cluster_size' defined for QCOW2 is 64 KiB.
> >
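
To make the numbers above concrete, here is a minimal Python sketch (not
QEMU code) of how an object size maps to the order passed to rbd_create().
The helper name obj_order_for_size() is hypothetical; the sketch only
assumes that the order is log2 of the object size in bytes and that 0
means "use the default object size", as described above.

```python
KiB = 1024
MiB = 1024 * KiB


def obj_order_for_size(object_size: int) -> int:
    """Return log2(object_size), or 0 to request the default object size.

    Hypothetical helper for illustration only; it is not the QEMU code.
    """
    if object_size == 0:
        # 0 is passed through: the default object size is used.
        return 0
    if object_size < 0 or object_size & (object_size - 1):
        raise ValueError("object size must be a power of two")
    return object_size.bit_length() - 1


if __name__ == "__main__":
    print(obj_order_for_size(64 * KiB))  # qcow2 default cluster_size
    print(obj_order_for_size(4 * MiB))   # the '4 MiB objects' case
```

So the 64 KiB qcow2 cluster size gives order 16, while a 4 MiB object
size would correspond to order 22.
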
> > Using '-o cluster_size=2M' with qemu-img changed only the qcow2 cluster
> > size, since in qcow2_co_create_opts() we remove 'cluster_size' from
> > QemuOpts by calling qemu_opts_to_qdict_filtered().
> > For some reason that I have yet to understand, after this deletion the
> > default value of 'cluster_size' for qcow2 (64 KiB) still remains in
> > QemuOpts and is then used in qemu_rbd_co_create_opts().
> >
> > At this point my doubts are:
> > Does it make sense to use the qcow2 cluster_size as the object_size
> > in RBD?
>
> No, not really. But it also doesn't really make any sense to put a
> QCOW2 image within an RBD image. To clarify from the BZ, OpenStack
> does not put QCOW2 images on RBD, it converts QCOW2 images into raw
> images to store in RBD.
Yes, that was my doubt, thanks for the confirmation.
Daniel (+CC) also confirmed the same thing to me; for completeness, he
added that there is one case where OpenStack could use qcow2 on RBD, but
that case uses the in-kernel RBD, so the QEMU RBD driver is not involved.
>
> > If we want to keep the 2 options separated, how can it be done? Should
> > we rename the option in block/rbd.c?
>
> You can already pass overrides to the RBD block driver by just
> appending them after the
> "rbd:<filename>[:option1=value1[:option2=value2]]" portion, perhaps
> that could be re-used.
I see, we should extend qemu_rbd_parse_filename() to support it.
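
As a rough illustration of that filename syntax, a minimal Python sketch
of splitting it into the image part and its options could look like the
following. This is not the actual qemu_rbd_parse_filename() logic: it
ignores escaping, the function name parse_rbd_filename() is hypothetical,
and 'object_size' as an option name is only an assumption for the example.

```python
from typing import Dict, Tuple


def parse_rbd_filename(filename: str) -> Tuple[str, Dict[str, str]]:
    """Split "rbd:<filename>[:option1=value1[:option2=value2]]".

    Simplified sketch: no escape handling, no validation of option names.
    """
    prefix = "rbd:"
    if not filename.startswith(prefix):
        raise ValueError("not an rbd: filename")

    # The first field is the image; every following field is key=value.
    image, *pairs = filename[len(prefix):].split(":")

    options: Dict[str, str] = {}
    for pair in pairs:
        key, sep, value = pair.partition("=")
        if not sep or not key:
            raise ValueError("malformed option: %r" % pair)
        options[key] = value
    return image, options


if __name__ == "__main__":
    # 'object_size' is a hypothetical option used only for this example.
    print(parse_rbd_filename("rbd:pool/image:object_size=4194304"))
```

With something like this, an object size given in the filename could be
kept separate from the qcow2 'cluster_size' option.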