qemu-block

From: Damien Le Moal
Subject: Re: Outreachy project task: Adding QEMU block layer APIs resembling Linux ZBD ioctls.
Date: Mon, 30 May 2022 05:38:35 +0000

On 2022/05/30 14:09, Sam Li wrote:
> Hi everyone,

Hi Sam,

> I'm Sam Li, working on the Outreachy project which is to add zoned
> device support to QEMU's virtio-blk emulation.
> 
> For the first goal, adding QEMU block layer APIs resembling Linux ZBD
> ioctls, I think the naive approach would be to introduce a new stable
> struct zbd_zone descriptor for the library function interface. More
> specifically, what I'd like to add to the BlockDriver struct are:
> 1. zbd_info as zone block device information: includes numbers of
> zones, size of logical blocks, and physical blocks.

Virtio block devices only advertise a "block size" (blk_size field of struct
virtio_blk_config). So I do not think that you need to distinguish between
logical and physical blocks. However, for zoned devices, we need to add a "write
granularity" field which indicates the minimum write size and alignment. This is
to be able to handle 512e SMR disk drives as these have a 512 B logical block
size and 4096 B physical block size. And SMR only allows writing in units of
physical block size, regardless of the LBA size. For NVMe ZNS devices, there is
no logical/physical block size difference, so the write granularity will always
be equal to the block size.
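
To make the write granularity idea concrete, here is a minimal sketch. The
structure and function names are hypothetical (they are not existing QEMU or
virtio definitions); the only assumption is that both values are kept in bytes.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical per-device limits; names are illustrative only. */
struct zoned_blk_limits {
    uint32_t block_size;        /* bytes, what blk_size would advertise */
    uint32_t write_granularity; /* bytes, minimum write size and alignment */
};

/*
 * Return true if a write to a sequential zone respects the write
 * granularity: both the byte offset and the length must be multiples of it.
 */
static bool zoned_write_is_aligned(const struct zoned_blk_limits *lim,
                                   uint64_t offset, uint64_t len)
{
    assert(lim->write_granularity != 0);
    return len != 0 &&
           offset % lim->write_granularity == 0 &&
           len % lim->write_granularity == 0;
}
```

With a 512e SMR drive (block_size = 512, write_granularity = 4096), a 512 B
write passes a plain block-size check but is rejected here. With ZNS, both
fields hold the same value and the check reduces to block alignment.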

> 2. zbd_zone_type and zbd_zone_state

As a first step, I would recommend having only the zone type. That will allow
you to not issue a zone ioctl that you know will fail, e.g. if the user tries to
reset a conventional zone, we know this will fail, so no point in executing the
BLKRESETZONE ioctl for that zone. With the zone type cached, you can easily
catch such cases. But even that is actually optional as a first step. You can
rely on the host device failing any invalid operation and return the errors back
to the guest.
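
As a rough illustration of the zone type cache, something like the following
would be enough to filter out resets that are known to fail. All names are
hypothetical, and the cache is assumed to be filled once from a zone report
when the device is opened.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical zone types; values are illustrative, not the kernel's. */
enum zone_type {
    ZONE_CONVENTIONAL,
    ZONE_SEQ_WRITE_REQUIRED,
    ZONE_SEQ_WRITE_PREFERRED,
};

/* One cached type per zone, indexed by zone number. */
struct zone_type_cache {
    const uint8_t *types;
    size_t nr_zones;
};

/*
 * Return false when a reset is known to fail (conventional zone or zone
 * index out of range), so that no ioctl needs to be issued for it.
 */
static bool zone_reset_worth_issuing(const struct zone_type_cache *c,
                                     size_t zone_idx)
{
    assert(c->types != NULL);
    if (zone_idx >= c->nr_zones) {
        return false;
    }
    return c->types[zone_idx] != ZONE_CONVENTIONAL;
}
```

Anything this check lets through is still sent to the host device, which
remains the final authority on whether the operation succeeds.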

Once you have an API working and the ability to execute all zone operations from
a guest, you can then work on adding the more difficult bits: supporting the
zone append operation will require tracking the write pointer position and state
of the device sequential zones. For that, the zone information will need the
zone capacity and write pointer position of all zones. The zone state may not be
necessary as you can infer the empty and full states from the zone capacity and
zone write pointer position. States such as explicit/implicit open, closed,
read-only and offline do not need to be tracked. If an operation cannot be
executed, the device will fail the I/O on the host side and you can simply
propagate that error to the guest.

See the Linux kernel sd_zbc driver and its emulation of zone append operations
for inspiration: drivers/scsi/sd_zbc.c. Looking at that code (e.g.
sd_zbc_prepare_zone_append()), you will see that the only thing being tracked is
the write pointer position of zones (relative to the zone start sector). The
state is inferred from that value, indicating if the zone can be written (it is
not full) or not (the zone is full).
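
A simplified sketch of that idea, with hypothetical names and not a
transcription of the kernel code: each sequential zone keeps its start sector,
its capacity and a write pointer offset relative to the start, and the empty
and full states are derived from the offset. For brevity, the sketch advances
the write pointer as soon as the append is prepared; a real implementation
would also have to deal with completion and failed writes.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-zone tracking data, all values in sectors. */
struct zone_wp {
    uint64_t start;    /* zone start sector */
    uint64_t capacity; /* number of writable sectors in the zone */
    uint64_t wp_ofst;  /* write pointer offset relative to start */
};

/* The zone state is inferred from the write pointer offset alone. */
static bool zone_is_empty(const struct zone_wp *z)
{
    return z->wp_ofst == 0;
}

static bool zone_is_full(const struct zone_wp *z)
{
    return z->wp_ofst >= z->capacity;
}

/*
 * Emulate zone append: choose the sector where the data will be written.
 * Returns false if the zone is full or the data does not fit.
 */
static bool zone_append_prepare(struct zone_wp *z, uint64_t nr_sectors,
                                uint64_t *sector)
{
    assert(sector != NULL);
    if (nr_sectors == 0 || zone_is_full(z) ||
        nr_sectors > z->capacity - z->wp_ofst) {
        return false;
    }
    *sector = z->start + z->wp_ofst; /* regular write lands at the wp */
    z->wp_ofst += nr_sectors;
    return true;
}
```

The sector returned through the last argument is what would be reported back
to the guest as the location of the appended data.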

> 3. zbd_dev_model: host-managed zbd, host-aware zbd

Yes. The current draft of the virtio specification for zoned block device
support adds struct virtio_blk_zoned_characteristics. Most, if not all, of the
fields in that structure can be kept as part of the device zone information.

> With those basic structs, we can start to implement new functions as
> bdrv*() APIs for BLOCK*ZONE ioctls.

BLK*ZONE :)
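
For reference, the zone ioctls defined in the kernel's linux/blkzoned.h are
BLKREPORTZONE, BLKRESETZONE, BLKOPENZONE, BLKCLOSEZONE and BLKFINISHZONE. A
possible way to relate block layer operations to them is sketched below; the
enum and the helper are hypothetical, and names are returned as strings only
so that the example does not depend on the kernel headers.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical block layer zone operations; not existing QEMU definitions. */
enum zone_op {
    ZONE_OP_REPORT,
    ZONE_OP_RESET,
    ZONE_OP_OPEN,
    ZONE_OP_CLOSE,
    ZONE_OP_FINISH,
    ZONE_OP_COUNT,
};

/* Return the name of the matching Linux ioctl, or NULL for an unknown op. */
static const char *zone_op_ioctl_name(enum zone_op op)
{
    static const char *const names[ZONE_OP_COUNT] = {
        [ZONE_OP_REPORT] = "BLKREPORTZONE",
        [ZONE_OP_RESET]  = "BLKRESETZONE",
        [ZONE_OP_OPEN]   = "BLKOPENZONE",
        [ZONE_OP_CLOSE]  = "BLKCLOSEZONE",
        [ZONE_OP_FINISH] = "BLKFINISHZONE",
    };

    if ((unsigned int)op >= ZONE_OP_COUNT) {
        return NULL;
    }
    assert(names[op] != NULL);
    return names[op];
}
```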

> 
> I'll start to finish this task based on the above description. If
> there is any problem or something I may miss in the design, please let
> me know.

Supporting all operations will be difficult to do in one go. But as explained
above, if you initially exclude zone append support, you will not need to
dynamically track the zone state and write pointer. This will simplify the code
and give you a solid working base on which to build the remaining support.

> 
> Best regards,
> Sam
> 


-- 
Damien Le Moal
Western Digital Research


