

From: Eric Blake
Subject: Re: [PATCH v5 05/31] qcow2: Process QCOW2_CLUSTER_ZERO_ALLOC clusters in handle_copied()
Date: Tue, 5 May 2020 14:23:49 -0500
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:68.0) Gecko/20100101 Thunderbird/68.7.0

On 5/5/20 12:38 PM, Alberto Garcia wrote:
When writing to a qcow2 file there are two functions that take a
virtual offset and return a host offset, possibly allocating new
clusters if necessary:

    - handle_copied() looks for normal data clusters that are already
      allocated and have a reference count of 1. In those clusters we
      can simply write the data and there is no need to perform any
      copy-on-write.

    - handle_alloc() looks for clusters that do need copy-on-write,
      either because they haven't been allocated yet, because their
      reference count is != 1, or because they are ZERO_ALLOC clusters.

The ZERO_ALLOC case is a bit special because those are clusters that
are already allocated, and they could perfectly well be dealt with in
handle_copied() (as long as copy-on-write is performed when required).

In fact, there is extra code specifically for them in handle_alloc()
that tries to reuse the existing allocation if possible and frees them
otherwise.

This patch changes the handling of ZERO_ALLOC clusters so the
semantics of these two functions are now like this:

    - handle_copied() looks for clusters that are already allocated and
      which we can overwrite (NORMAL and ZERO_ALLOC clusters with a
      reference count of 1).

    - handle_alloc() looks for clusters for which we need a new
      allocation (all other cases).

One important difference after this change is that clusters found
in handle_copied() may now require copy-on-write, but this will be
necessary anyway once we add support for subclusters.
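
To make the new split concrete, here is a minimal standalone sketch of
the overwrite-vs-allocate decision described above. The enum, the
can_overwrite_in_place() helper and the values in main() are made up
for illustration only and are not the actual qcow2 code; the only part
taken from the description above is the "NORMAL or ZERO_ALLOC, with a
reference count of 1" condition:

#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

/* Simplified stand-ins for the qcow2 cluster types discussed above */
typedef enum {
    CLUSTER_UNALLOCATED,
    CLUSTER_NORMAL,
    CLUSTER_ZERO_PLAIN,
    CLUSTER_ZERO_ALLOC,
} ClusterType;

/*
 * New semantics: handle_copied() keeps NORMAL and ZERO_ALLOC clusters
 * whose reference count is 1 (ZERO_ALLOC may still need COW); every
 * other case falls through to handle_alloc() for a new allocation.
 */
static bool can_overwrite_in_place(ClusterType type, uint64_t refcount)
{
    return (type == CLUSTER_NORMAL || type == CLUSTER_ZERO_ALLOC) &&
           refcount == 1;
}

int main(void)
{
    printf("NORMAL,      refcount 1: %d\n",
           can_overwrite_in_place(CLUSTER_NORMAL, 1));      /* 1: overwrite */
    printf("ZERO_ALLOC,  refcount 1: %d\n",
           can_overwrite_in_place(CLUSTER_ZERO_ALLOC, 1));  /* 1: overwrite, COW if needed */
    printf("NORMAL,      refcount 2: %d\n",
           can_overwrite_in_place(CLUSTER_NORMAL, 2));      /* 0: new allocation */
    printf("UNALLOCATED, refcount 0: %d\n",
           can_overwrite_in_place(CLUSTER_UNALLOCATED, 0)); /* 0: new allocation */
    return 0;
}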

Signed-off-by: Alberto Garcia <address@hidden>
---
  block/qcow2-cluster.c | 256 +++++++++++++++++++++++-------------------
  1 file changed, 141 insertions(+), 115 deletions(-)

@@ -1053,15 +1058,53 @@ void qcow2_alloc_cluster_abort(BlockDriverState *bs, QCowL2Meta *m)
  static void calculate_l2_meta(BlockDriverState *bs,
                                uint64_t host_cluster_offset,
                                uint64_t guest_offset, unsigned bytes,
-                              QCowL2Meta **m, bool keep_old)
+                              uint64_t *l2_slice, QCowL2Meta **m, bool keep_old)
  {

Borderline long line, but it fits ;)

      BDRVQcow2State *s = bs->opaque;
-    unsigned cow_start_from = 0;
+    int l2_index = offset_to_l2_slice_index(s, guest_offset);
+    uint64_t l2_entry;
+    unsigned cow_start_from, cow_end_to;
      unsigned cow_start_to = offset_into_cluster(s, guest_offset);
      unsigned cow_end_from = cow_start_to + bytes;
-    unsigned cow_end_to = ROUND_UP(cow_end_from, s->cluster_size);
      unsigned nb_clusters = size_to_clusters(s, cow_end_from);
      QCowL2Meta *old_m = *m;
+    QCow2ClusterType type;
+
+    assert(nb_clusters <= s->l2_slice_size - l2_index);
+
+    /* Return if there's no COW (all clusters are normal and we keep them) */
+    if (keep_old) {
+        int i;
+        for (i = 0; i < nb_clusters; i++) {
+            l2_entry = be64_to_cpu(l2_slice[l2_index + i]);
+            if (qcow2_get_cluster_type(bs, l2_entry) != QCOW2_CLUSTER_NORMAL) {
+                break;
+            }
+        }
+        if (i == nb_clusters) {
+            return;
+        }
+    }
+
+    /* Get the L2 entry of the first cluster */
+    l2_entry = be64_to_cpu(l2_slice[l2_index]);

This is the second time we're grabbing the first entry in this function. But I don't think it's worth trying to micro-optimize.


+static int count_single_write_clusters(BlockDriverState *bs, int nb_clusters,
+                                       uint64_t *l2_slice, int l2_index,
+                                       bool new_alloc)
  {
+    BDRVQcow2State *s = bs->opaque;
+    uint64_t l2_entry = be64_to_cpu(l2_slice[l2_index]);
+    uint64_t expected_offset = l2_entry & L2E_OFFSET_MASK;
      int i;
      for (i = 0; i < nb_clusters; i++) {
-        uint64_t l2_entry = be64_to_cpu(l2_slice[l2_index + i]);
-        if (!cluster_needs_cow(bs, l2_entry)) {
+        l2_entry = be64_to_cpu(l2_slice[l2_index + i]);

And another place where we compute l2_entry for the first cluster twice, and again not worth micro-optimizing.
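
As an aside, the expected_offset variable above is what will let the
new helper stop at the first cluster whose host offset is not
contiguous with the previous one. A toy standalone version of just
that contiguity count, with simplified made-up types instead of the
real qcow2 structures (the helper name count_contiguous() and the 64k
cluster size are illustrative only), could look like this:

#include <stdint.h>
#include <stdio.h>

#define CLUSTER_SIZE     65536ULL               /* toy value: 64k clusters */
#define L2E_OFFSET_MASK  0x00fffffffffffe00ULL  /* host offset bits of an L2 entry */

/* Count how many consecutive L2 entries point to contiguous host clusters */
static int count_contiguous(const uint64_t *l2_entries, int nb_clusters)
{
    uint64_t expected_offset = l2_entries[0] & L2E_OFFSET_MASK;
    int i;

    for (i = 0; i < nb_clusters; i++) {
        uint64_t offset = l2_entries[i] & L2E_OFFSET_MASK;
        if (offset != expected_offset) {
            break;  /* not contiguous with the previous cluster */
        }
        expected_offset += CLUSTER_SIZE;
    }
    return i;
}

int main(void)
{
    /* Three contiguous 64k clusters, then a gap */
    uint64_t entries[] = { 0x10000, 0x20000, 0x30000, 0x90000 };
    printf("%d\n", count_contiguous(entries, 4));   /* prints 3 */
    return 0;
}

The real count_single_write_clusters() in the patch of course checks
more than this (at least the cluster type, and it takes the new_alloc
flag shown above); the sketch only illustrates the offset-contiguity
part.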

I didn't find anything that needs a change.

Reviewed-by: Eric Blake <address@hidden>

--
Eric Blake, Principal Software Engineer
Red Hat, Inc.           +1-919-301-3226
Virtualization:  qemu.org | libvirt.org



