[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Re: [Qemu-devel] [PATCH v6 35/42] block: Fix check_to_replace_node()
From: |
Max Reitz |
Subject: |
Re: [Qemu-devel] [PATCH v6 35/42] block: Fix check_to_replace_node() |
Date: |
Thu, 15 Aug 2019 19:01:48 +0200 |
User-agent: |
Mozilla/5.0 (X11; Linux x86_64; rv:60.0) Gecko/20100101 Thunderbird/60.8.0 |
On 15.08.19 17:21, Vladimir Sementsov-Ogievskiy wrote:
> 09.08.2019 19:14, Max Reitz wrote:
>> Currently, check_to_replace_node() only allows mirror to replace a node
>> in the chain of the source node, and only if it is the first non-filter
>> node below the source. Well, technically, the idea is that you can
>> exactly replace a quorum child by mirroring from quorum.
>>
>> This has (probably) two reasons:
>> (1) We do not want to create loops.
>> (2) @replaces and @device should have exactly the same content so
>> replacing them does not cause visible data to change.
>>
>> This has two issues:
>> (1) It is overly restrictive. It is completely fine for @replaces to be
>> a filter.
>> (2) It is not restrictive enough. You can create loops with this as
>> follows:
>>
>> $ qemu-img create -f qcow2 /tmp/source.qcow2 64M
>> $ qemu-system-x86_64 -qmp stdio
>> {"execute": "qmp_capabilities"}
>> {"execute": "object-add",
>> "arguments": {"qom-type": "throttle-group", "id": "tg0"}}
>> {"execute": "blockdev-add",
>> "arguments": {
>> "node-name": "source",
>> "driver": "throttle",
>> "throttle-group": "tg0",
>> "file": {
>> "node-name": "filtered",
>> "driver": "qcow2",
>> "file": {
>> "driver": "file",
>> "filename": "/tmp/source.qcow2"
>> } } } }
>> {"execute": "drive-mirror",
>> "arguments": {
>> "job-id": "mirror",
>> "device": "source",
>> "target": "/tmp/target.qcow2",
>> "format": "qcow2",
>> "node-name": "target",
>> "sync" :"none",
>> "replaces": "filtered"
>> } }
>> {"execute": "block-job-complete", "arguments": {"device": "mirror"}}
>>
>> And qemu crashes because of a stack overflow due to the loop being
>> created (target's backing file is source, so when it replaces filtered,
>> it points to itself through source).
>>
>> (blockdev-mirror can be broken similarly.)
>>
>> So let us make the checks for the two conditions above explicit, which
>> makes the whole function exactly as restrictive as it needs to be.
>>
>> Signed-off-by: Max Reitz <address@hidden>
>> ---
>> include/block/block.h | 1 +
>> block.c | 83 +++++++++++++++++++++++++++++++++++++++----
>> blockdev.c | 34 ++++++++++++++++--
>> 3 files changed, 110 insertions(+), 8 deletions(-)
>>
>> diff --git a/include/block/block.h b/include/block/block.h
>> index 6ba853fb90..8da706cd89 100644
>> --- a/include/block/block.h
>> +++ b/include/block/block.h
>> @@ -404,6 +404,7 @@ bool bdrv_is_first_non_filter(BlockDriverState
>> *candidate);
>>
>> /* check if a named node can be replaced when doing drive-mirror */
>> BlockDriverState *check_to_replace_node(BlockDriverState *parent_bs,
>> + BlockDriverState *backing_bs,
>> const char *node_name, Error
>> **errp);
>>
>> /* async block I/O */
>> diff --git a/block.c b/block.c
>> index 915b80153c..4858d3e718 100644
>> --- a/block.c
>> +++ b/block.c
>> @@ -6290,7 +6290,59 @@ bool bdrv_is_first_non_filter(BlockDriverState
>> *candidate)
>> return false;
>> }
>>
>> +static bool is_child_of(BlockDriverState *child, BlockDriverState *parent)
>> +{
>> + BdrvChild *c;
>> +
>> + if (!parent) {
>> + return false;
>> + }
>> +
>> + QLIST_FOREACH(c, &parent->children, next) {
>> + if (c->bs == child || is_child_of(child, c->bs)) {
>> + return true;
>> + }
>> + }
>> +
>> + return false;
>> +}
>> +
>> +/*
>> + * Return true if there are only filters in [@top, @base). Note that
>> + * this may include quorum (which bdrv_chain_contains() cannot
>> + * handle).
>
> More presizely: return true if exists chain of filters from top to base or if
> top == base.
>
> I keep in mind backup-top filter:
>
> [backup-top]
> | \target
backup-top can’t be a filter if it has two children with different
contents, though.
(commit-top and mirror-top aren’t filters either.)
That’s why there must be a unique chain [@top, @base).
I should probably not that it will return true if top == base, though, yes.
> |backing -------->[target]
> V /
> [source] <---------/backing
>
>> + */
>> +static bool is_filtered_child(BlockDriverState *top, BlockDriverState *base)
>> +{
>> + BdrvChild *c;
>> +
>> + if (!top) {
>> + return false;
>> + }
>> +
>> + if (top == base) {
>> + return true;
>> + }
>> +
>> + if (!top->drv->is_filter) {
>> + return false;
>> + }
>> +
>> + QLIST_FOREACH(c, &top->children, next) {
>> + if (is_filtered_child(c->bs, base)) {
>> + return true;
>> + }
>> + }
>
> interesting, how much is it better to somehow reuse DFS search written in
> should_update_child()..
> [just note, don't do it in these series please]
>
>> +
>> + return false;
>> +}
>> +
>> +/*
>> + * @parent_bs is mirror's source BDS, @backing_bs is the BDS which
>> + * will be attached to the target when mirror completes.
>> + */
>> BlockDriverState *check_to_replace_node(BlockDriverState *parent_bs,
>> + BlockDriverState *backing_bs,
>> const char *node_name, Error
>> **errp)
>> {
>> BlockDriverState *to_replace_bs = bdrv_find_node(node_name);
>> @@ -6309,13 +6361,32 @@ BlockDriverState
>> *check_to_replace_node(BlockDriverState *parent_bs,
>> goto out;
>> }
>>
>> - /* We don't want arbitrary node of the BDS chain to be replaced only
>> the top
>> - * most non filter in order to prevent data corruption.
>> - * Another benefit is that this tests exclude backing files which are
>> - * blocked by the backing blockers.
>> + /*
>> + * If to_replace_bs is (recursively) a child of backing_bs,
>> + * replacing it may create a loop. We cannot allow that.
>> */
>> - if (!bdrv_recurse_is_first_non_filter(parent_bs, to_replace_bs)) {
>> - error_setg(errp, "Only top most non filter can be replaced");
>> + if (to_replace_bs == backing_bs || is_child_of(to_replace_bs,
>> backing_bs)) {
>
> first condition is covered by second, so first may be omitted.
It is not. is_child_of() does not return true if child == parent.
>> + error_setg(errp, "Replacing this node would result in a loop");
>> + to_replace_bs = NULL;
>> + goto out;
>> + }
>> +
>> + /*
>> + * Mirror is designed in such a way that when it completes, the
>> + * source BDS is seamlessly replaced.
>
> Not source but to_replace_bs is replaced?
It has originally been designed to replace the source. If it could
replace any arbitrary BDS, all of this would be moot.
>> It is therefore not allowed
>> + * to replace a BDS where this condition would be violated, as that
>> + * would defeat the purpose of mirror and could lead to data
>> + * corruption.
>> + * Therefore, between parent_bs and to_replace_bs there may be
>> + * only filters (and the one on top must be a filter, too), so
>> + * their data always stays in sync and mirror can complete and
>> + * replace to_replace_bs without any possible corruptions.
>> + */
>> + if (!is_filtered_child(parent_bs, to_replace_bs) &&
>> + !is_filtered_child(to_replace_bs, parent_bs))
>> + {
>> + error_setg(errp, "The node to be replaced must be connected to the "
>> + "source through filter nodes only");
>
> "and the one on top must be a filter, too" not mentioned in the error..
Well, unless the source node is the node to be replaced. Hm... This
gets very hard to express. I think I’d prefer to keep this as it is,
even though it is not quite correct, unless you have a better suggestion
of what to report. :-/
>> to_replace_bs = NULL;
>> goto out;
>> }
>> diff --git a/blockdev.c b/blockdev.c
>> index 4e72f6f701..758e0b5431 100644
>> --- a/blockdev.c
>> +++ b/blockdev.c
>> @@ -3887,7 +3887,7 @@ static void blockdev_mirror_common(const char *job_id,
>> BlockDriverState *bs,
>> }
>>
>> if (has_replaces) {
>> - BlockDriverState *to_replace_bs;
>> + BlockDriverState *to_replace_bs, *backing_bs;
>> AioContext *replace_aio_context;
>> int64_t bs_size, replace_size;
>>
>> @@ -3897,7 +3897,37 @@ static void blockdev_mirror_common(const char
>> *job_id, BlockDriverState *bs,
>> return;
>> }
>>
>> - to_replace_bs = check_to_replace_node(bs, replaces, errp);
>> + if (backing_mode == MIRROR_SOURCE_BACKING_CHAIN ||
>> + backing_mode == MIRROR_OPEN_BACKING_CHAIN)
>> + {
>> + /*
>> + * While we do not quite know what OPEN_BACKING_CHAIN
>> + * (used for mode=existing) will yield, it is probably
>> + * best to restrict it exactly like SOURCE_BACKING_CHAIN,
>> + * because that is our best guess.
>> + */
>> + switch (sync) {
>> + case MIRROR_SYNC_MODE_FULL:
>> + backing_bs = NULL;
>> + break;
>> +
>> + case MIRROR_SYNC_MODE_TOP:
>> + backing_bs = bdrv_filtered_cow_bs(bdrv_skip_rw_filters(bs));
>
> why not bdrv_backing_chain_next(bs) like in mirror_start?
Good question. I suppose it should be
bdrv_filtered_cow_bs(bdrv_backing_chain_next(bs)) in mirror_start()?
Because with sync=top, we just want to remove the topmost COW node (and
filters on top), but keep filters behind it.
Max
>> + break;
>> +
>> + case MIRROR_SYNC_MODE_NONE:
>> + backing_bs = bs;
>> + break;
>> +
>> + default:
>> + abort();
>> + }
>> + } else {
>> + assert(backing_mode == MIRROR_LEAVE_BACKING_CHAIN);
>> + backing_bs = bdrv_filtered_cow_bs(bdrv_skip_rw_filters(target));
>> + }
>> +
>> + to_replace_bs = check_to_replace_node(bs, backing_bs, replaces,
>> errp);
>> if (!to_replace_bs) {
>> return;
>> }
>>
>
>
signature.asc
Description: OpenPGP digital signature
- Re: [Qemu-devel] [PATCH v6 30/42] qemu-img: Use child access functions, (continued)
- [Qemu-devel] [PATCH v6 31/42] block: Drop backing_bs(), Max Reitz, 2019/08/09
- [Qemu-devel] [PATCH v6 32/42] block: Make bdrv_get_cumulative_perm() public, Max Reitz, 2019/08/09
- [Qemu-devel] [PATCH v6 33/42] blockdev: Fix active commit choice, Max Reitz, 2019/08/09
- [Qemu-devel] [PATCH v6 38/42] iotests: Let complete_and_wait() work with commit, Max Reitz, 2019/08/09
- [Qemu-devel] [PATCH v6 34/42] block: Inline bdrv_co_block_status_from_*(), Max Reitz, 2019/08/09
- [Qemu-devel] [PATCH v6 36/42] iotests: Add tests for mirror @replaces loops, Max Reitz, 2019/08/09
- [Qemu-devel] [PATCH v6 35/42] block: Fix check_to_replace_node(), Max Reitz, 2019/08/09
[Qemu-devel] [PATCH v6 37/42] block: Leave BDS.backing_file constant, Max Reitz, 2019/08/09
[Qemu-devel] [PATCH v6 39/42] iotests: Add filter commit test cases, Max Reitz, 2019/08/09
[Qemu-devel] [PATCH v6 40/42] iotests: Add filter mirror test cases, Max Reitz, 2019/08/09