Re: [PATCH v3 3/6] block/qcow2: introduce inflight writes counters: fix

qemu-block

From:	Max Reitz
Subject:	Re: [PATCH v3 3/6] block/qcow2: introduce inflight writes counters: fix discard
Date:	Fri, 12 Mar 2021 16:10:00 +0100
User-agent:	Mozilla/5.0 (X11; Linux x86_64; rv:78.0) Gecko/20100101 Thunderbird/78.8.0

On 12.03.21 13:46, Vladimir Sementsov-Ogievskiy wrote:

12.03.2021 15:32, Vladimir Sementsov-Ogievskiy wrote:
12.03.2021 14:17, Max Reitz wrote:
On 12.03.21 10:09, Vladimir Sementsov-Ogievskiy wrote:
11.03.2021 22:58, Max Reitz wrote:
On 05.03.21 18:35, Vladimir Sementsov-Ogievskiy wrote:
There is a bug in qcow2: host cluster can be discarded (refcount
becomes 0) and reused during data write. In this case data write may
[..]
@@ -885,6 +1019,13 @@ static int QEMU_WARN_UNUSED_RESULT update_refcount(BlockDriverState *bs,
          if (refcount == 0) {
              void *table;
+            Qcow2InFlightRefcount *infl = find_infl_wr(s, cluster_index);
+
+            if (infl) {
+                infl->refcount_zero = true;
+                infl->type = type;
+                continue;
+            }
I don’t understand what this is supposed to do exactly. It seems like it wants to keep metadata structures in the cache that are still in use (because dropping them from the caches is what happens next), but users of metadata structures won’t set in-flight counters for those metadata structures, will they?
Don't follow.
We want the code in "if (refcount == 0)" to be triggered only when the full reference count of the host cluster becomes 0, including inflight-write-cnt. So, if at this point inflight-write-cnt is not 0, we postpone freeing the host cluster; it will be done later from the "slow path" in update_inflight_write_cnt().
But the code under “if (refcount == 0)” doesn’t free anything, does it? All I can see is code to remove metadata structures from the metadata caches (if the discarded cluster was an L2 table or a refblock), and finally the discard on the underlying file. I don’t see how that protocol-level discard has anything to do with our problem, though.
Hmm. Still, if we do this discard, and then our in-flight write, we'll have data instead of a hole. Not a big deal, but it seems better to postpone the discard.
On the other hand, clearing caches is OK, as it's related only to qcow2-refcount, not to inflight-write-cnt.
As far as I understand, the freeing happens immediately above the “if (refcount == 0)” block by s->set_refcount() setting the refcount to 0 (including updating s->free_cluster_index if the refcount is 0).
Hmm. And that (setting s->free_cluster_index) is what I should actually prevent until the total reference count becomes zero.
And about s->set_refcount(): it only updates the refcount itself, and doesn't free anything.
So, it is more correct like this:

diff --git a/block/qcow2-refcount.c b/block/qcow2-refcount.c
index 464d133368..1da282446d 100644
--- a/block/qcow2-refcount.c
+++ b/block/qcow2-refcount.c
@@ -1012,21 +1012,12 @@ static int QEMU_WARN_UNUSED_RESULT update_refcount(BlockDriverState *bs,
          } else {
              refcount += addend;
          }
-        if (refcount == 0 && cluster_index < s->free_cluster_index) {
-            s->free_cluster_index = cluster_index;
-        }
          s->set_refcount(refcount_block, block_index, refcount);

          if (refcount == 0) {
              void *table;
              Qcow2InFlightRefcount *infl = find_infl_wr(s, cluster_index);

-            if (infl) {
-                infl->refcount_zero = true;
-                infl->type = type;
-                continue;
-            }
-
              table = qcow2_cache_is_table_offset(s->refcount_block_cache,
                                                  offset);
              if (table != NULL) {
@@ -1040,6 +1031,16 @@ static int QEMU_WARN_UNUSED_RESULT update_refcount(BlockDriverState *bs,
                  qcow2_cache_discard(s->l2_table_cache, table);
              }

+            if (infl) {
+                infl->refcount_zero = true;
+                infl->type = type;
+                continue;
+            }
+
+            if (cluster_index < s->free_cluster_index) {
+                s->free_cluster_index = cluster_index;
+            }
+
              if (s->discard_passthrough[type]) {
                  update_refcount_discard(bs, cluster_offset, s->cluster_size);
              }

I don’t think I like using s->free_cluster_index as a protection against allocating something before it.

First, it comes back to the problem I just described in my mail from 15:58 GMT+1, which is that you’re changing the definition of what a free cluster is. With this proposal, you’re proposing yet another definition: A free cluster is anything with refcount == 0 at or after free_cluster_index.

Now looking only at the allocation functions, it may look like that kind of is the definition already. But I don’t think that was the intention when free_cluster_index was introduced, so we’d have to check every place that sets free_cluster_index, to see whether it adheres to this definition.

And I think it’s clear that there is a place that won’t adhere to this definition, and that is this very place here, in update_refcount(). Say free_cluster_index is 42. Then you free cluster 39, but there is a write to it, so free_cluster_index isn’t updated. Then you free cluster 38, and there are no writes to that cluster, so free_cluster_index is updated to 38. Suddenly, 39 is free to be allocated, too.

(The precise problem is that with this new definition, decreasing free_cluster_index suddenly has the power to free any cluster between its new and old value. With the old definition, changing free_cluster_index would never free any cluster. So when you decrease free_cluster_index, you suddenly have to be sure that all clusters between the new and old value that have refcount 0 are indeed to be considered free.)
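
To make the 42/39/38 scenario concrete, here is a minimal toy model in C. It is only a sketch and not QEMU code: ToyState, toy_init(), toy_free_cluster(), toy_alloc_cluster() and toy_scenario() are hypothetical names, the per-cluster inflight_write flag stands in for find_infl_wr(), and the allocator is assumed to treat "refcount == 0 at or after free_cluster_index" as free.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define TOY_NB_CLUSTERS 64

/* Toy model only; none of these names exist in QEMU. */
typedef struct ToyState {
    uint16_t refcount[TOY_NB_CLUSTERS];
    bool inflight_write[TOY_NB_CLUSTERS]; /* stand-in for find_infl_wr() */
    int64_t free_cluster_index;
} ToyState;

/* Clusters below first_free are in use, everything from first_free on is free. */
void toy_init(ToyState *s, int64_t first_free)
{
    for (int64_t i = 0; i < TOY_NB_CLUSTERS; i++) {
        s->refcount[i] = i < first_free ? 1 : 0;
        s->inflight_write[i] = false;
    }
    s->free_cluster_index = first_free;
}

/*
 * Mimics the proposed ordering: the refcount drops to 0 right away, but
 * free_cluster_index is left alone while a write is still in flight.
 */
void toy_free_cluster(ToyState *s, int64_t idx)
{
    s->refcount[idx] = 0;
    if (s->inflight_write[idx]) {
        return; /* lowering free_cluster_index is postponed */
    }
    if (idx < s->free_cluster_index) {
        s->free_cluster_index = idx;
    }
}

/* Assumed allocator: first cluster with refcount 0 at or after free_cluster_index. */
int64_t toy_alloc_cluster(ToyState *s)
{
    for (int64_t i = s->free_cluster_index; i < TOY_NB_CLUSTERS; i++) {
        if (s->refcount[i] == 0) {
            s->refcount[i] = 1;
            s->free_cluster_index = i + 1;
            return i;
        }
    }
    return -1; /* no free cluster left in the toy image */
}

/* Runs the scenario from the mail and returns the second allocation. */
int64_t toy_scenario(void)
{
    ToyState s;

    toy_init(&s, 42);
    s.inflight_write[39] = true;

    toy_free_cluster(&s, 39);           /* write in flight: index stays at 42 */
    assert(s.free_cluster_index == 42);

    toy_free_cluster(&s, 38);           /* no write: index drops to 38 */
    assert(s.free_cluster_index == 38);

    int64_t first = toy_alloc_cluster(&s);
    assert(first == 38);

    /* Cluster 39 is handed out although its write is still in flight. */
    return toy_alloc_cluster(&s);
}
```

In this model toy_scenario() returns 39, i.e. lowering free_cluster_index for cluster 38 implicitly exposes cluster 39 to the allocator while its write is still pending.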

Max
