[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Re: [Qemu-devel] [PATCH v3 1/9] block: add .bdrv_need_rw_file_child_duri
From: |
Kevin Wolf |
Subject: |
Re: [Qemu-devel] [PATCH v3 1/9] block: add .bdrv_need_rw_file_child_during_reopen_rw handler |
Date: |
Fri, 2 Aug 2019 17:42:56 +0200 |
User-agent: |
Mutt/1.11.3 (2019-02-01) |
Am 31.07.2019 um 14:09 hat Max Reitz geschrieben:
> On 25.07.19 11:18, Vladimir Sementsov-Ogievskiy wrote:
> > On reopen to rw parent may need rw access to child in .prepare, for
> > example qcow2 needs to write IN_USE flags into stored bitmaps
> > (currently it is done in a hacky way after commit and don't work).
> > So, let's introduce such logic.
> >
> > The drawback is that in worst case bdrv_reopen_set_read_only may finish
> > with error and in some intermediate state: some nodes reopened RW and
> > some are not. But this is a way to fix bug around reopening qcow2
> > bitmaps in the following commits.
>
> This commit message doesn’t really explain what this patch does.
>
> > Signed-off-by: Vladimir Sementsov-Ogievskiy <address@hidden>
> > ---
> > include/block/block_int.h | 2 +
> > block.c | 144 ++++++++++++++++++++++++++++++++++----
> > 2 files changed, 133 insertions(+), 13 deletions(-)
> >
> > diff --git a/include/block/block_int.h b/include/block/block_int.h
> > index 3aa1e832a8..7bd6fd68dd 100644
> > --- a/include/block/block_int.h
> > +++ b/include/block/block_int.h
> > @@ -531,6 +531,8 @@ struct BlockDriver {
> > uint64_t parent_perm, uint64_t parent_shared,
> > uint64_t *nperm, uint64_t *nshared);
> >
> > + bool (*bdrv_need_rw_file_child_during_reopen_rw)(BlockDriverState
> > *bs);
> > +
> > /**
> > * Bitmaps should be marked as 'IN_USE' in the image on reopening image
> > * as rw. This handler should realize it. It also should unset readonly
> > diff --git a/block.c b/block.c
> > index cbd8da5f3b..3c8e1c59b4 100644
> > --- a/block.c
> > +++ b/block.c
> > @@ -1715,10 +1715,12 @@ static void
> > bdrv_get_cumulative_perm(BlockDriverState *bs, uint64_t *perm,
> > uint64_t *shared_perm);
> >
> > typedef struct BlockReopenQueueEntry {
> > - bool prepared;
> > - bool perms_checked;
> > - BDRVReopenState state;
> > - QSIMPLEQ_ENTRY(BlockReopenQueueEntry) entry;
> > + bool reopened_file_child_rw;
> > + bool changed_file_child_perm_rw;
> > + bool prepared;
> > + bool perms_checked;
> > + BDRVReopenState state;
> > + QSIMPLEQ_ENTRY(BlockReopenQueueEntry) entry;
> > } BlockReopenQueueEntry;
> >
> > /*
> > @@ -3421,6 +3423,105 @@ BlockReopenQueue
> > *bdrv_reopen_queue(BlockReopenQueue *bs_queue,
> > keep_old_opts);
> > }
> >
> > +static int bdrv_reopen_set_read_only_drained(BlockDriverState *bs,
> > + bool read_only,
> > + Error **errp)
> > +{
> > + BlockReopenQueue *queue;
> > + QDict *opts = qdict_new();
> > +
> > + qdict_put_bool(opts, BDRV_OPT_READ_ONLY, read_only);
> > +
> > + queue = bdrv_reopen_queue(NULL, bs, opts, true);
> > +
> > + return bdrv_reopen_multiple(queue, errp);
> > +}
> > +
> > +/*
> > + * handle_recursive_reopens
> > + *
> > + * On fail it needs rollback_recursive_reopens to be called.
>
> It would be nice if this description actually said anything about what
> the function is supposed to do.
>
> > + */
> > +static int handle_recursive_reopens(BlockReopenQueueEntry *current,
> > + Error **errp)
> > +{
> > + int ret;
> > + BlockDriverState *bs = current->state.bs;
> > +
> > + /*
> > + * We use the fact that in reopen-queue children are always following
> > + * parents.
> > + * TODO: Switch BlockReopenQueue to be QTAILQ and use
> > + * QTAILQ_FOREACH_REVERSE.
>
> Why don’t you do that first? It would make the code more obvious at
> least to me.
>
> > + */
> > + if (QSIMPLEQ_NEXT(current, entry)) {
> > + ret = handle_recursive_reopens(QSIMPLEQ_NEXT(current, entry),
> > errp);
> > + if (ret < 0) {
> > + return ret;
> > + }
> > + }
> > +
> > + if ((current->state.flags & BDRV_O_RDWR) && bs->file && bs->drv &&
> > + bs->drv->bdrv_need_rw_file_child_during_reopen_rw &&
> > + bs->drv->bdrv_need_rw_file_child_during_reopen_rw(bs))
> > + {
> > + if (!bdrv_is_writable(bs->file->bs)) {
> > + ret = bdrv_reopen_set_read_only_drained(bs->file->bs, false,
> > errp);
>
> Hm. Sorry, I find this all a bit hard to understand. (No comments and
> all.)
>
> I understand that this is for an RO -> RW transition? Everything is
> still RO, but the parent will need an RW child before it transitions to
> RW itself.
>
>
> I’m going to be honest up front, I don’t like this very much. But I
> think it may be a reasonable solution for now.
>
> As I remember, the problem was that when reopening a qcow2 node from RO
> to RW, we need to write something in .prepare() (because it can fail),
> but naturally no .prepare() is called after any .commit(), so no matter
> the order of nodes in the ReopenQueue, the child node will never be RW
> by this point.
>
> Hm. To me that mostly means that making the whole reopen process a
> transaction was just a dream that turns out not to work.
This patch already looks somewhat complicated to me, and what you're
proposing below sounds another order of magnitude more complex.
But I think the important point is your last sentence above. Once we
acknowledge that we can't possibly maintain full transaction semantics,
I'll ask this naive question: What prevents us from just keeping the
additional write in .commit?
It is expected to work in the common case, so we're only talking about
the behaviour in error cases anyway. If something fails and we can't do
things in a transactionable way, we need to decide what the surprise
result will look like, because we can neither guarantee a rollback nor
successful completion.
If the write fails unexpectedly, and we end up with a qcow2 image that
is opened r/w, but has read-only bitmaps - wouldn't that be a reasonable
result? It seems much easier to explain than some dependency subchain
already being committed and the rest rolled back.
So, I guess my question is, what is the specifc scenario you're trying
to fix with this series (why isn't the final patch a test case that
would answer this question?), and are we sure that the cure isn't worse
than the disease?
Kevin
signature.asc
Description: PGP signature