Re: [PATCH 4/6] coroutine-lock: reimplement CoRwlock to fix downgrade bug

qemu-devel

From:	Paolo Bonzini
Subject:	Re: [PATCH 4/6] coroutine-lock: reimplement CoRwlock to fix downgrade bug
Date:	Wed, 24 Mar 2021 17:40:23 +0100
User-agent:	Mozilla/5.0 (X11; Linux x86_64; rv:78.0) Gecko/20100101 Thunderbird/78.7.0

On 24/03/21 17:15, Stefan Hajnoczi wrote:

On Wed, Mar 17, 2021 at 07:00:11PM +0100, Paolo Bonzini wrote:

+static void qemu_co_rwlock_maybe_wake_one(CoRwlock *lock)
+{
+    CoRwTicket *tkt = QSIMPLEQ_FIRST(&lock->tickets);
+    Coroutine *co = NULL;
+
+    /*
+     * Setting lock->owners here prevents rdlock and wrlock from
+     * sneaking in between unlock and wake.
+     */
+
+    if (tkt) {
+        if (tkt->read) {
+            if (lock->owners >= 0) {
+                lock->owners++;
+                co = tkt->co;
+            }
+        } else {
+            if (lock->owners == 0) {
+                lock->owners = -1;
+                co = tkt->co;
+            }
+        }
+    }
+
+    if (co) {
+        QSIMPLEQ_REMOVE_HEAD(&lock->tickets, next);
+        qemu_co_mutex_unlock(&lock->mutex);
+        aio_co_wake(co);


I find it hard to reason about QSIMPLEQ_EMPTY(&lock->tickets) callers
that execute before co is entered. They see an empty queue even though a
coroutine is about to run. Updating owners above ensures that the code
correctly tracks the state of the rwlock, but I'm not 100% confident
about this aspect of the code.

Good point. The invariant when lock->mutex is released should be clarified; a better way to phrase the comment above "if (tkt)" is:

The invariant when lock->mutex is released is that every ticket is tracked in either lock->owners or lock->tickets. By updating lock->owners here, rdlock/wrlock/upgrade will block even if they execute between qemu_co_mutex_unlock and aio_co_wake.
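To make the invariant concrete, here is a minimal single-threaded sketch. It is a toy model, not the patch: ToyRwlock, ToyTicket, toy_enqueue, toy_maybe_wake_one and toy_can_lock_now are hypothetical names, the fixed-size array stands in for the QSIMPLEQ, and the fast-path check in toy_can_lock_now is an assumption about what a new rdlock/wrlock caller would test. It only shows that once the head ticket is moved into owners, an empty queue no longer looks like a free lock.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy, single-threaded model of the ticket queue; not QEMU code. */
#define MAX_TICKETS 8

typedef struct {
    bool read;              /* true for a reader ticket, false for a writer */
} ToyTicket;

typedef struct {
    int owners;             /* -1: one writer, 0: free, >0: number of readers */
    ToyTicket tickets[MAX_TICKETS];
    size_t head, tail;      /* tickets[head..tail) are still waiting */
} ToyRwlock;

/* Queue a waiter; returns false if the toy queue is full. */
static bool toy_enqueue(ToyRwlock *lock, bool read)
{
    if (lock->tail == MAX_TICKETS) {
        return false;
    }
    lock->tickets[lock->tail++].read = read;
    return true;
}

/*
 * Same shape as the quoted qemu_co_rwlock_maybe_wake_one(): the head
 * ticket is accounted for in owners before it leaves the queue, so it
 * is always tracked in one place or the other.
 * Returns true if a waiter was granted the lock.
 */
static bool toy_maybe_wake_one(ToyRwlock *lock)
{
    if (lock->head == lock->tail) {
        return false;                   /* nobody is waiting */
    }
    if (lock->tickets[lock->head].read) {
        if (lock->owners < 0) {
            return false;               /* a writer still owns the lock */
        }
        lock->owners++;
    } else {
        if (lock->owners != 0) {
            return false;               /* the lock is not free yet */
        }
        lock->owners = -1;
    }
    lock->head++;                       /* ticket now lives in owners */
    return true;
}

/* Assumed fast-path test for a caller arriving between unlock and wake. */
static bool toy_can_lock_now(const ToyRwlock *lock, bool read)
{
    if (lock->head != lock->tail) {
        return false;                   /* wait behind queued tickets */
    }
    return read ? lock->owners >= 0 : lock->owners == 0;
}
```

In this model, if a reader holds the lock, a writer queues, and the reader then unlocks, toy_maybe_wake_one() leaves the queue empty with owners == -1, so toy_can_lock_now() refuses both a new reader and a new writer even though the writer has not run yet.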

-    self->locks_held--;
+        lock->owners--;
+        QSIMPLEQ_INSERT_TAIL(&lock->tickets, &my_ticket, next);
+        qemu_co_rwlock_maybe_wake_one(lock);
+        qemu_coroutine_yield();
+        assert(lock->owners == -1);


lock->owners is read outside lock->mutex here. Not sure if this can
cause problems.

True. It is okay though because lock->owners cannot change until this coroutine unlocks. A worse occurrence of the issue is in rdlock:


        assert(lock->owners >= 1);
        /* Possibly wake another reader, which will wake the next in line. */
        qemu_co_mutex_lock(&lock->mutex);

where the assert should be moved after taking the lock, or possibly changed to use qatomic_read. (I prefer the former.)

locks_held is kept unchanged across qemu_coroutine_yield() even though
the read lock has been released. rdlock() and wrlock() only increment
locks_held after acquiring the rwlock.

In practice I don't think it matters, but it seems inconsistent. If
locks_held is supposed to track tickets (not just coroutines currently
holding a lock), then rdlock() and wrlock() should increment before
yielding.

locks_held (unlike owners) is not part of the lock, it's part of the Coroutine and only used for debugging (asserting that terminating coroutines are not holding any lock).
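As a rough illustration of that debugging role, the sketch below models a per-coroutine counter that is only consulted when the coroutine terminates. ToyCoroutine and the toy_* helpers are hypothetical names for this example, not QEMU's Coroutine or its API.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the per-coroutine debug counter. */
typedef struct {
    int locks_held;
} ToyCoroutine;

/* Called once a lock has actually been acquired. */
static void toy_note_acquired(ToyCoroutine *co)
{
    co->locks_held++;
}

/* Called when a lock is released. */
static void toy_note_released(ToyCoroutine *co)
{
    assert(co->locks_held > 0);
    co->locks_held--;
}

/* The only consumer of the counter: a check made at termination. */
static bool toy_may_terminate(const ToyCoroutine *co)
{
    return co->locks_held == 0;
}
```

Since the counter is only checked at termination, and a coroutine that has yielded inside upgrade is not terminating, leaving it unchanged across the yield does not affect the check in this model.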


Paolo
