
Re: [PATCH v2 4/7] block-copy: add a CoMutex to the BlockCopyTask list


From: Vladimir Sementsov-Ogievskiy
Subject: Re: [PATCH v2 4/7] block-copy: add a CoMutex to the BlockCopyTask list
Date: Tue, 25 May 2021 13:25:28 +0300
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:78.0) Gecko/20100101 Thunderbird/78.10.2

25.05.2021 13:07, Emanuele Giuseppe Esposito wrote:


On 20/05/2021 17:19, Vladimir Sementsov-Ogievskiy wrote:
18.05.2021 13:07, Emanuele Giuseppe Esposito wrote:
Because the list of tasks is only modified by coroutine
functions, add a CoMutex in order to protect them.

Use the same mutex to protect also the BlockCopyState in_flight_bytes
field, to avoid adding additional synchronization primitives.

Signed-off-by: Emanuele Giuseppe Esposito <eesposit@redhat.com>
---
  block/block-copy.c | 55 +++++++++++++++++++++++++++++-----------------
  1 file changed, 35 insertions(+), 20 deletions(-)

diff --git a/block/block-copy.c b/block/block-copy.c
index 2e610b4142..3a949fab64 100644
--- a/block/block-copy.c
+++ b/block/block-copy.c
@@ -83,7 +83,7 @@ typedef struct BlockCopyTask {
       */
      bool zeroes;
-    /* State */
+    /* State. Protected by tasks_lock */
      CoQueue wait_queue; /* coroutines blocked on this task */
      /* To reference all call states from BlockCopyState */
@@ -106,8 +106,9 @@ typedef struct BlockCopyState {
      BdrvChild *target;
      /* State */
-    int64_t in_flight_bytes;
+    int64_t in_flight_bytes; /* protected by tasks_lock */

only this field is protected? or the whole State section?

I guess you figured it out already: here there is only in_flight_bytes because
the next patches take care of the others.

Honestly, I don't like this way of introducing thread-safety. As I show below, 
we should protect not only the structures, but also the logic working with 
them. Simply protecting access to some variable doesn't make much sense: 
critical sections should be wide enough to protect the logic. So I'd prefer 
introducing all the critical sections together with the mutex in one patch, 
to have the whole picture.


[...]

  }
@@ -213,11 +219,7 @@ static coroutine_fn BlockCopyTask *block_copy_task_create(BlockCopyState *s,
      assert(QEMU_IS_ALIGNED(offset, s->cluster_size));
      bytes = QEMU_ALIGN_UP(bytes, s->cluster_size);

Looking at block_copy_task_create():

First, bdrv_dirty_bitmap_next_dirty_area() doesn't take the bitmaps lock, so 
it's not protected at all.

Next, even if we take the bitmaps lock in bdrv_dirty_bitmap_next_dirty_area() 
or around it, that alone doesn't make block_copy_task_create() thread-safe:

The simplest solution here seems to be to protect bdrv_dirty_bitmap_next_dirty_area 
and also bdrv_reset_dirty_bitmap with the tasks lock, so that once it is 
released the bitmap has been updated accordingly and, from my understanding, 
no other task can get that region.


Yes, we just need to protect larger areas with an outer lock, to protect the 
logic, not only the structures themselves.

Btw, out of curiosity, why is bdrv_dirty_bitmap_next_dirty_area not protected 
by a lock? Can't we have a case where two threads (like you just mentioned 
below) check the bitmap? Or am I missing something?

Hmm, I don't know :) Probably it's a bug, maybe not. We need to check its callers.



Imagine block_copy_task_create() is called from two threads simultaneously. Both 
calls will get the same dirty area and create equal tasks. Then, 
"assert(!find_conflicting_task_locked(s, offset, bytes));" will crash.




-    /* region is dirty, so no existent tasks possible in it */
-    assert(!find_conflicting_task(s, offset, bytes));
-
      bdrv_reset_dirty_bitmap(s->copy_bitmap, offset, bytes);
-    s->in_flight_bytes += bytes;
      task = g_new(BlockCopyTask, 1);
      *task = (BlockCopyTask) {
@@ -228,7 +230,13 @@ static coroutine_fn BlockCopyTask *block_copy_task_create(BlockCopyState *s,
          .bytes = bytes,
      };
      qemu_co_queue_init(&task->wait_queue);
-    QLIST_INSERT_HEAD(&s->tasks, task, list);
+
+    WITH_QEMU_LOCK_GUARD(&s->tasks_lock) {
+        s->in_flight_bytes += bytes;
+        /* region is dirty, so no existent tasks possible in it */
+        assert(!find_conflicting_task_locked(s, offset, bytes));
+        QLIST_INSERT_HEAD(&s->tasks, task, list);
+    }
      return task;
  }
@@ -249,25 +257,29 @@ static void coroutine_fn block_copy_task_shrink(BlockCopyTask *task,
      assert(new_bytes > 0 && new_bytes < task->bytes);
-    task->s->in_flight_bytes -= task->bytes - new_bytes;
      bdrv_set_dirty_bitmap(task->s->copy_bitmap,
                            task->offset + new_bytes, task->bytes - new_bytes);

Then look here. Imagine that immediately after bdrv_set_dirty_bitmap() a new 
task is created in a parallel thread, consuming these new dirty bits. Again, 
it will find a conflicting task (this one, as task->bytes is not modified yet) 
and crash.

Here too, the lock guard should be enlarged to also cover the dirty bitmap. 
Thank you for pointing out these cases!


Yes. And then we'll come to most of the logic being covered by the mutex, and 
we won't need atomic operations. There is nothing bad about doing simple 
operations on variables under a mutex; it's a lot simpler than bothering with 
atomics.



-    task->bytes = new_bytes;
-    qemu_co_queue_restart_all(&task->wait_queue);
+    WITH_QEMU_LOCK_GUARD(&task->s->tasks_lock) {
+        task->s->in_flight_bytes -= task->bytes - new_bytes;
+        task->bytes = new_bytes;
+        qemu_co_queue_restart_all(&task->wait_queue);
+    }
  }



--
Best regards,
Vladimir


