Re: [PATCH v5 19/31] qcow2: Add subcluster support to calculate_l2

qemu-block

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH v5 19/31] qcow2: Add subcluster support to calculate_l2_meta(

From:	Eric Blake
Subject:	Re: [PATCH v5 19/31] qcow2: Add subcluster support to calculate_l2_meta()
Date:	Tue, 5 May 2020 16:59:18 -0500
User-agent:	Mozilla/5.0 (X11; Linux x86_64; rv:68.0) Gecko/20100101 Thunderbird/68.7.0

On 5/5/20 12:38 PM, Alberto Garcia wrote:

If an image has subclusters then there are more copy-on-write
scenarios that we need to consider. Let's say we have a write request
from the middle of subcluster #3 until the end of the cluster:

1) If we are writing to a newly allocated cluster then we need
    copy-on-write. The previous contents of subclusters #0 to #3 must
    be copied to the new cluster. We can optimize this process by
    skipping all leading unallocated or zero subclusters (the status of
    those skipped subclusters will be reflected in the new L2 bitmap).

2) If we are overwriting an existing cluster:

    2.1) If subcluster #3 is unallocated or has the all-zeroes bit set
         then we need copy-on-write (on subcluster #3 only).

    2.2) If subcluster #3 was already allocated then there is no need
         for any copy-on-write. However we still need to update the L2
         bitmap to reflect possible changes in the allocation status of
         subclusters #4 to #31. Because of this, this function checks
         if all the overwritten subclusters are already allocated and
         in this case it returns without creating a new QCowL2Meta
         structure.

After all these changes l2meta_cow_start() and l2meta_cow_end()
are not necessarily cluster-aligned anymore. We need to update the
calculation of old_start and old_end in handle_dependencies() to
guarantee that no two requests try to write on the same cluster.

Signed-off-by: Alberto Garcia <address@hidden>
---
  block/qcow2-cluster.c | 174 +++++++++++++++++++++++++++++++++++-------
  1 file changed, 146 insertions(+), 28 deletions(-)

-    /* Return if there's no COW (all clusters are normal and we keep them) */
+    /* Return if there's no COW (all subclusters are normal and we are
+     * keeping the clusters) */
      if (keep_old) {
+        unsigned first_sc = cow_start_to / s->subcluster_size;
+        unsigned last_sc = (cow_end_from - 1) / s->subcluster_size;
          int i;
-        for (i = 0; i < nb_clusters; i++) {
-            l2_entry = get_l2_entry(s, l2_slice, l2_index + i);
-            if (qcow2_get_cluster_type(bs, l2_entry) != QCOW2_CLUSTER_NORMAL) {
+        for (i = first_sc; i <= last_sc; i++) {
+            unsigned c = i / s->subclusters_per_cluster;
+            unsigned sc = i % s->subclusters_per_cluster;
+            l2_entry = get_l2_entry(s, l2_slice, l2_index + c);
+            l2_bitmap = get_l2_bitmap(s, l2_slice, l2_index + c);
+            type = qcow2_get_subcluster_type(bs, l2_entry, l2_bitmap, sc);
+            if (type == QCOW2_SUBCLUSTER_INVALID) {
+                l2_index += c; /* Point to the invalid entry */
+                goto fail;
+            }
+            if (type != QCOW2_SUBCLUSTER_NORMAL) {
                  break;
              }
          }

This loop is now 32 times slower (for extended L2 entries). Do youreally need to check for an invalid subcluster here, or can we justblindly check that all 32 subclusters are NORMAL, and leave handling ofinvalid clusters for the rest of the function after we failed theexit-early test? For that matter, for all but the first and lastcluster, checking if 32 clusters are NORMAL is a simple 64-bitcomparison rather than 32 iterations of a loop; and even for the firstand last cluster, the _RANGE macros in 14/31 work well to mask out whichbits must be set/cleared. My guess is that optimizing this loop isworthwhile, since overwriting existing data is probably more common thanallocating new data.

-        if (i == nb_clusters) {
-            return;
+        if (i == last_sc + 1) {
+            return 0;
          }
      }

If we get here, then i is now the address of the first subcluster thatwas not NORMAL, even if it is much smaller than the final subclusterlearned by nb_clusters for the overall request. [1]

/* Get the L2 entry of the first cluster */

      l2_entry = get_l2_entry(s, l2_slice, l2_index);
-    type = qcow2_get_cluster_type(bs, l2_entry);
+    l2_bitmap = get_l2_bitmap(s, l2_slice, l2_index);
+    sc_index = offset_to_sc_index(s, guest_offset);
+    type = qcow2_get_subcluster_type(bs, l2_entry, l2_bitmap, sc_index);

- if (type == QCOW2_CLUSTER_NORMAL && keep_old) {

-        cow_start_from = cow_start_to;
+    if (type == QCOW2_SUBCLUSTER_INVALID) {
+        goto fail;
+    }
+
+    if (!keep_old) {
+        switch (type) {
+        case QCOW2_SUBCLUSTER_COMPRESSED:
+            cow_start_from = 0;
+            break;
+        case QCOW2_SUBCLUSTER_NORMAL:
+        case QCOW2_SUBCLUSTER_ZERO_ALLOC:
+        case QCOW2_SUBCLUSTER_UNALLOCATED_ALLOC: {
+            int i;
+            /* Skip all leading zero and unallocated subclusters */
+            for (i = 0; i < sc_index; i++) {
+                QCow2SubclusterType t;
+                t = qcow2_get_subcluster_type(bs, l2_entry, l2_bitmap, i);
+                if (t == QCOW2_SUBCLUSTER_INVALID) {
+                    goto fail;
+                } else if (t == QCOW2_SUBCLUSTER_NORMAL) {
+                    break;
+                }
+            }
+            cow_start_from = i << s->subcluster_bits;
+            break;

Note that you are only skipping until the first normal subcluster, evenif other zero/unallocated clusters occur between the first normalcluster and the start of the action. Or visually, suppose we have:


--0-NN-0_NNNNNNNN_NNNNNNNN_NNNNNNNN

as our 32 subclusters, with sc_index of 8. You will end up skippingsubclusters 0-3, but NOT 6 and 7. Still, even though we spend timecopying the allocated contents of those two subclusters, we also copythe subcluster status, and the guest still ends up reading the same dataas before. I don't know if it is worth trying to further minimize I/Oto non-contiguous zero/unalloc subclusters in the head.

+        }
+        case QCOW2_SUBCLUSTER_ZERO_PLAIN:
+        case QCOW2_SUBCLUSTER_UNALLOCATED_PLAIN:
+            cow_start_from = sc_index << s->subcluster_bits;
+            break;
+        default:
+            g_assert_not_reached();
+        }
      } else {
-        cow_start_from = 0;
+        switch (type) {
+        case QCOW2_SUBCLUSTER_NORMAL:
+            cow_start_from = cow_start_to;
+            break;
+        case QCOW2_SUBCLUSTER_ZERO_ALLOC:
+        case QCOW2_SUBCLUSTER_UNALLOCATED_ALLOC:
+            cow_start_from = sc_index << s->subcluster_bits;
+            break;
+        default:
+            g_assert_not_reached();
+        }
      }

/* Get the L2 entry of the last cluster */

-    l2_entry = get_l2_entry(s, l2_slice, l2_index + nb_clusters - 1);
-    type = qcow2_get_cluster_type(bs, l2_entry);
+    l2_index += nb_clusters - 1;
+    l2_entry = get_l2_entry(s, l2_slice, l2_index);
+    l2_bitmap = get_l2_bitmap(s, l2_slice, l2_index);
+    sc_index = offset_to_sc_index(s, guest_offset + bytes - 1);
+    type = qcow2_get_subcluster_type(bs, l2_entry, l2_bitmap, sc_index);

[1] but here, we are skipping any intermediate clusters, and worryingonly about the state of the final cluster. Is that always going to dothe correct thing, or will there be cases where we need to update the L2entries of intermediate clusters? If we don't check specifically forINVALID in the initial subcluster, but instead break the loop as soon aswe find non-NORMAL, then it seems like the rest of the function shouldfragment the overall request to deal with just the clusters up to thepoint where we found a non-NORMAL, and leave the remaining portion ofthe original request for another round.


Thus, I'm not convinced that this patch is quite right.

--
Eric Blake, Principal Software Engineer
Red Hat, Inc.           +1-919-301-3226
Virtualization:  qemu.org | libvirt.org

[Prev in Thread]

Current Thread

[Next in Thread]

[PATCH v5 24/31] qcow2: Update L2 bitmap in qcow2_alloc_cluster_link_l2(), (continued)
- [PATCH v5 24/31] qcow2: Update L2 bitmap in qcow2_alloc_cluster_link_l2(), Alberto Garcia, 2020/05/05
- [PATCH v5 06/31] qcow2: Add get_l2_entry() and set_l2_entry(), Alberto Garcia, 2020/05/05
- [PATCH v5 21/31] qcow2: Add subcluster support to zero_in_l2_slice(), Alberto Garcia, 2020/05/05
- [PATCH v5 27/31] qcow2: Add subcluster support to qcow2_co_pwrite_zeroes(), Alberto Garcia, 2020/05/05
- [PATCH v5 30/31] qcow2: Add subcluster support to qcow2_measure(), Alberto Garcia, 2020/05/05
  - Re: [PATCH v5 30/31] qcow2: Add subcluster support to qcow2_measure(), Eric Blake, 2020/05/06
    - Re: [PATCH v5 30/31] qcow2: Add subcluster support to qcow2_measure(), Alberto Garcia, 2020/05/07
- [PATCH v5 29/31] qcow2: Assert that expand_zero_clusters_in_l1() does not support subclusters, Alberto Garcia, 2020/05/05
  - Re: [PATCH v5 29/31] qcow2: Assert that expand_zero_clusters_in_l1() does not support subclusters, Eric Blake, 2020/05/06
- [PATCH v5 19/31] qcow2: Add subcluster support to calculate_l2_meta(), Alberto Garcia, 2020/05/05
  - Re: [PATCH v5 19/31] qcow2: Add subcluster support to calculate_l2_meta(), Eric Blake <=
    - Re: [PATCH v5 19/31] qcow2: Add subcluster support to calculate_l2_meta(), Alberto Garcia, 2020/05/06
    - Re: [PATCH v5 19/31] qcow2: Add subcluster support to calculate_l2_meta(), Eric Blake, 2020/05/06
    - Re: [PATCH v5 19/31] qcow2: Add subcluster support to calculate_l2_meta(), Alberto Garcia, 2020/05/07
    - Re: [PATCH v5 19/31] qcow2: Add subcluster support to calculate_l2_meta(), Alberto Garcia, 2020/05/07
- [PATCH v5 14/31] qcow2: Add QCow2SubclusterType and qcow2_get_subcluster_type(), Alberto Garcia, 2020/05/05
  - Re: [PATCH v5 14/31] qcow2: Add QCow2SubclusterType and qcow2_get_subcluster_type(), Eric Blake, 2020/05/05
- [PATCH v5 15/31] qcow2: Add qcow2_cluster_is_allocated(), Alberto Garcia, 2020/05/05
  - Re: [PATCH v5 15/31] qcow2: Add qcow2_cluster_is_allocated(), Eric Blake, 2020/05/05
- [PATCH v5 17/31] qcow2: Replace QCOW2_CLUSTER_* with QCOW2_SUBCLUSTER_*, Alberto Garcia, 2020/05/05
- [PATCH v5 31/31] iotests: Add tests for qcow2 images with extended L2 entries, Alberto Garcia, 2020/05/05

Prev by Date: Re: [PATCH v2 6/6] iotests: Add test 291 to for qemu-img bitmap coverage
Next by Date: Re: backing chain & block status & filters
Previous by thread: [PATCH v5 19/31] qcow2: Add subcluster support to calculate_l2_meta()
Next by thread: Re: [PATCH v5 19/31] qcow2: Add subcluster support to calculate_l2_meta()
Index(es):
- Date
- Thread