qemu-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [ANNOUNCE] libblkio v0.1.0 preview release


From: Stefan Hajnoczi
Subject: Re: [ANNOUNCE] libblkio v0.1.0 preview release
Date: Thu, 13 May 2021 10:47:22 +0100

On Thu, May 06, 2021 at 12:33:24PM +0200, Kevin Wolf wrote:
> Am 06.05.2021 um 10:46 hat Stefan Hajnoczi geschrieben:
> > On Wed, May 05, 2021 at 06:46:36PM +0200, Kevin Wolf wrote:
> > > Am 05.05.2021 um 18:19 hat Stefan Hajnoczi geschrieben:
> > > > On Tue, May 04, 2021 at 03:44:23PM +0200, Kevin Wolf wrote:
> > > > > Am 30.04.2021 um 17:49 hat Stefan Hajnoczi geschrieben:
> > > > > > On Thu, Apr 29, 2021 at 05:51:16PM +0200, Kevin Wolf wrote:
> > > > > > > Am 29.04.2021 um 16:05 hat Stefan Hajnoczi geschrieben:
> > > > > There is one more thing I'm wondering right now: Why do we have 
> > > > > separate
> > > > > states for connecting to the backend (created) and configuring it
> > > > > (initialized)? The property setters check for the right state, but 
> > > > > they
> > > > > don't really do anything that wouldn't be possible in the other state.
> > > > > A state machine exposed as two boolean rather than a tristate enum 
> > > > > feels
> > > > > a bit awkward, too, but even more so if two states could possibly be
> > > > > enough.
> > > > > 
> > > > > The reason why I'm asking is that in order to address my points, it
> > > > > would be best to have separate property accessors for each state, and
> > > > > two pairs of accessors would make property declarations more managable
> > > > > than three pairs.
> > > > 
> > > > There is no need to have separate boolean properties, it's just how I
> > > > implemented it. There could be a single state property. For example, a
> > > > string that is "uninitialized", "initialized", and "started".
> > > 
> > > Right. I think this would make the way the API works a bit more obvious,
> > > though it's really only a small detail.
> > > 
> > > However, my real question was why we even need those three distinct
> > > states. From what I could see so far in the io_uring implemtation,
> > > "uninitialized" and "started" would be enough. Everything that can be
> > > configured in "initialized", could just as well be configured in
> > > "uninitialized" and the state transition would both open the image file
> > > and apply the configuration in a single step.
> > > 
> > > Do you have intentions to add features in the future where the three
> > > separate states are actually needed because the user needs to be able to
> > > do something while the image file is already opened, but queue
> > > processing hasn't started yet?
> > 
> > Yes, three states will be needed. Think of a virtio-blk driver where a
> > VFIO PCI bus address or a vhost-user-blk UNIX domain socket path are
> > needed to connect to the device. After connection completes you then
> > have access to the block queue limits, the number of queues, etc. Once
> > those things are configured the device can be started.
> 
> You mean, the user of the library would first query some limits before
> deciding on the right values to use for some properties?
> 
> When the values come directly from the command line, this won't be the
> case anyway, but I agree there may be cases where not the user, but the
> application decides and then the separation would be helpful.
> 
> > > > > > > Alternatives in QEMU are qdev properties (which are internally QOM
> > > > > > > properties, but provide default implementations and are at least
> > > > > > > automatically read-only after realize, avoiding that whole class 
> > > > > > > of
> > > > > > > bugs) and QAPI.
> > > > > > > If this was QEMU code, I would of course go for QAPI, but a 
> > > > > > > library is
> > > > > > > something different and adding the code generator would probably 
> > > > > > > be a
> > > > > > > bit too much anyway. But the idea in the resulting code would be 
> > > > > > > dealing
> > > > > > > with native structs instead of a bunch of function calls. This 
> > > > > > > would
> > > > > > > probably be the least error prone way for the implementation, but 
> > > > > > > of
> > > > > > > course, it would make binary compatibility a bit harder when 
> > > > > > > adding new
> > > > > > > properties.
> > > > > > 
> > > > > > An alternative I considered was the typestate and builder patterns:
> > > > > > 
> > > > > >   /* Create a new io_uring driver in the uninitialized state */
> > > > > >   struct blkio_iou_uninit *blkio_new_io_uring(void);
> > > > > > 
> > > > > >   /* Uninitialized state property setters */
> > > > > >   int blkio_iou_uninit_set_path(struct blkio_iou_uninit *u,
> > > > > >                                 const char *path);
> > > > > >   int blkio_iou_uninit_set_direct(struct blkio_iou_uninit *u,
> > > > > >                                   bool o_direct);
> > > > > >   ...
> > > > > > 
> > > > > >   /* Transition to initialized state. Frees u on success. */
> > > > > >   struct blkio_iou_init *blkio_iou_init(struct blkio_iou_uninit *u);
> > > > > > 
> > > > > >   /* Initialized state property setters/getters */
> > > > > >   int blkio_iou_init_get_capacity(struct blkio_iou_init *i,
> > > > > >                                   uint64_t *capacity);
> > > > > >   ...
> > > > > > 
> > > > > >   /* Transition to started state. Frees i on success. */
> > > > > >   struct blkio_iou_started *blkio_iou_start(struct blkio_iou_init 
> > > > > > *i);
> > > > > > 
> > > > > >   ...
> > > > > > 
> > > > > >   /* Transition back to initialized state. Frees s on success. */
> > > > > >   struct blkio_iou_init *blkio_iou_stop(struct blkio_iou_started 
> > > > > > *s);
> > > > > > 
> > > > > > On the plus side:
> > > > > > 
> > > > > > - No state checks are needed because an API won't even exist if it's
> > > > > >   unavailable in a given state (uninitialized/initialized/started).
> > > > > > 
> > > > > > - State structs come with pre-initialized default values, so the 
> > > > > > caller
> > > > > >   only needs to set non-default values. For example O_DIRECT is 
> > > > > > false by
> > > > > >   default and callers happy with that don't need to set the 
> > > > > > property.
> > > > > > 
> > > > > > - ABI compatibility is easy since the state structs are opaque 
> > > > > > (their
> > > > > >   size is not defined) and new properties can be added at any time.
> > > > > > 
> > > > > > On the minus side:
> > > > > > 
> > > > > > - Completely static. Hard to introspect and requires a dedicated 
> > > > > > call
> > > > > >   site for each property (applications cannot simply assign a 
> > > > > > property
> > > > > >   string given to them on the command-line). This means every single
> > > > > >   property must be explicitly coded into every application :(.
> > > > > 
> > > > > How are you going to deal with this for QEMU integration, by the way?
> > > > > Put all the properties that we know into the QAPI schema and then some
> > > > > way of passing key/value pairs for the rest?
> > > > 
> > > > In QEMU's case let's define each property explicitly instead of passing
> > > > them through. That's due to QAPI's philosophy rather than libblkio.
> > > 
> > > Okay, so new features in libblkio would simply be unaccessible until we
> > > update the QAPI schema. Given the overlap in developers, this shouldn't
> > > be a problem in the foreseeable future, so this is fine with me.
> > 
> > Great.
> > 
> > > > > > - So many functions! This makes understanding the API harder.
> > > > > > 
> > > > > > - Very verbose. The function and type names get long and there is a 
> > > > > > lot
> > > > > >   of repetition in the API.
> > > > > 
> > > > > I think it wouldn't be too bad if all drivers exposed the same
> > > > > properties, but you're explicitly expecting driver-specific 
> > > > > properties.
> > > > > If drivers add an external APIs that just fail for other drivers, it
> > > > > would indeed make understanding the API much harder.
> > > > > 
> > > > > We could consider a mix where you would first create a configuration
> > > > > object, then use the generic property functions to set options for it
> > > > > and finally have a separate blkio_initialize() function where you turn
> > > > > that config into a struct blkio that is needed to actually do I/O (and
> > > > > also supports generic property functions for runtime option updates).
> > > > > 
> > > > > I'm not sure it provides much except making the state machine more
> > > > > prominent than just two random bool properties.
> > > > 
> > > > I prefer to keep the configuration public API as it is. We can change
> > > > the properties.rs implementation however we want though.
> > > > 
> > > > Do you think the public API should be a typestate API instead with
> > > > struct blkio_init_info, struct blkio_start_info, and struct blkio
> > > > expressing the 3 states instead?
> > > 
> > > I just mentioned it as a theoretical option for a middle ground. I'm not
> > > sure if it's a good idea, and probably not worth the effort to change
> > > it.
> > > 
> > > Maybe I would give the state transitions a separate function instead of
> > > making them look like normal properties (then they could also use enums
> > > instead of strings or two bools). But that's not a hard preference
> > > either.
> > 
> > What do you think about this:
> > 
> > The blkio instance states are:
> > 
> >   created -> attached -> started -> destroyed
> > 
> > It is not possible to go backwards anymore, which simplifies driver
> > implementations and it probably won't be needed by applications.
> > 
> > The "initialized" state is renamed to "attached" to make it clearer that
> > this means the block device has been connected/opened. Also
> > "initialized" can be confused with "created".
> > 
> > The corresponding APIs are:
> > 
> > int blkio_create(const char *driver, struct blkio **bp, char **errmsg);
> > int blkio_attach(struct blkio *bp, char **errmsg);
> > int blkio_start(struct blkio *bp, char **errmsg);
> > void blkio_destroy(struct blkio **bp);
> > 
> > There is no way to query the state here, but that probably isn't
> > necessary since an application setting up the blkio instance must
> > already be aware of the state in order to configure it in the first
> > place.
> > 
> > One advantage of this approach is that it can support network drivers
> > where the attach and start operations can take a long time while regular
> > property accesses do not block.
> 
> I like this.
> 
> For properties, I think, each property will have a first state in which
> it becomes available and then it will be available in all later states,
> too.
> 
> Currently, apart from properties that are always read-only, we only have
> properties that are rw only in their first state and become read-only in
> later states. It might be reasonable to assume that properties will
> exist that can be rw in all later states, too.
> 
> In their first state, most properties only store the value into the
> config and it's the next state transition that actually makes use of
> them. Similarly, reading usually only reads the value from the config.
> So these parts can be automatically covered. Usually you would then only
> need a custom implementation for property updates after the fact. I
> think this could simplify the driver implementations a lot. I'll play
> with this a bit more.

Hi Kevin,
I posted a patch that introduces blkio_connect() and blkio_start():
https://gitlab.com/libblkio/libblkio/-/merge_requests/4

Stefan

Attachment: signature.asc
Description: PGP signature


reply via email to

[Prev in Thread] Current Thread [Next in Thread]