[Qemu-commits] [qemu/qemu] b660a8: job: take each job's lock individuall

qemu-commits

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

[Qemu-commits] [qemu/qemu] b660a8: job: take each job's lock individuall

From:	Peter Maydell
Subject:	[Qemu-commits] [qemu/qemu] b660a8: job: take each job's lock individually in job_txn_...
Date:	Tue, 07 Apr 2020 13:00:16 -0700

  Branch: refs/heads/master
  Home:   https://github.com/qemu/qemu
  Commit: b660a84bbb0eb1a76b505648d31d5e82594fb75e
      
https://github.com/qemu/qemu/commit/b660a84bbb0eb1a76b505648d31d5e82594fb75e
  Author: Stefan Reiter <address@hidden>
  Date:   2020-04-07 (Tue, 07 Apr 2020)

  Changed paths:
    M blockdev.c
    M job-qmp.c
    M job.c
    M tests/test-blockjob.c

  Log Message:
  -----------
  job: take each job's lock individually in job_txn_apply

All callers of job_txn_apply hold a single job's lock, but different
jobs within a transaction can have different contexts, thus we need to
lock each one individually before applying the callback function.

Similar to job_completed_txn_abort this also requires releasing the
caller's context before and reacquiring it after to avoid recursive
locks which might break AIO_WAIT_WHILE in the callback. This is safe, since
existing code would already have to take this into account, lest
job_completed_txn_abort might have broken.

This also brings to light a different issue: When a callback function in
job_txn_apply moves it's job to a different AIO context, callers will
try to release the wrong lock (now that we re-acquire the lock
correctly, previously it would just continue with the old lock, leaving
the job unlocked for the rest of the return path). Fix this by not caching
the job's context.

This is only necessary for qmp_block_job_finalize, qmp_job_finalize and
job_exit, since everyone else calls through job_exit.

One test needed adapting, since it calls job_finalize directly, so it
manually needs to acquire the correct context.

Signed-off-by: Stefan Reiter <address@hidden>
Message-Id: <address@hidden>
Signed-off-by: Kevin Wolf <address@hidden>


  Commit: 08558e33257ec796594bd411261028a93414a70c
      
https://github.com/qemu/qemu/commit/08558e33257ec796594bd411261028a93414a70c
  Author: Stefan Reiter <address@hidden>
  Date:   2020-04-07 (Tue, 07 Apr 2020)

  Changed paths:
    M block/replication.c

  Log Message:
  -----------
  replication: assert we own context before job_cancel_sync

job_cancel_sync requires the job's lock to be held, all other callers
already do this (replication_stop, drive_backup_abort,
blockdev_backup_abort, job_cancel_sync_all, cancel_common).

In this case we're in a BlockDriver handler, so we already have a lock,
just assert that it is the same as the one used for the commit_job.

Signed-off-by: Stefan Reiter <address@hidden>
Message-Id: <address@hidden>
Signed-off-by: Kevin Wolf <address@hidden>


  Commit: eca0f3524a4eb57d03a56b0cbcef5527a0981ce4
      
https://github.com/qemu/qemu/commit/eca0f3524a4eb57d03a56b0cbcef5527a0981ce4
  Author: Stefan Reiter <address@hidden>
  Date:   2020-04-07 (Tue, 07 Apr 2020)

  Changed paths:
    M block/backup.c

  Log Message:
  -----------
  backup: don't acquire aio_context in backup_clean

All code-paths leading to backup_clean (via job_clean) have the job's
context already acquired. The job's context is guaranteed to be the same
as the one used by backup_top via backup_job_create.

Since the previous logic effectively acquired the lock twice, this
broke cleanup of backups for disks using IO threads, since the BDRV_POLL_WHILE
in bdrv_backup_top_drop -> bdrv_do_drained_begin would only release the lock
once, thus deadlocking with the IO thread.

This is a partial revert of 0abf2581717a19.

Signed-off-by: Stefan Reiter <address@hidden>
Reviewed-by: Max Reitz <address@hidden>
Message-Id: <address@hidden>
Signed-off-by: Kevin Wolf <address@hidden>


  Commit: 564806c529d4e0acad209b1e5b864a8886092f1f
      
https://github.com/qemu/qemu/commit/564806c529d4e0acad209b1e5b864a8886092f1f
  Author: Kevin Wolf <address@hidden>
  Date:   2020-04-07 (Tue, 07 Apr 2020)

  Changed paths:
    M block/block-backend.c

  Log Message:
  -----------
  block-backend: Reorder flush/pdiscard function definitions

Move all variants of the flush/pdiscard functions to a single place and
put the blk_co_*() version first because it is called by all other
variants (and will become static in the next patch).

Signed-off-by: Kevin Wolf <address@hidden>
Reviewed-by: Vladimir Sementsov-Ogievskiy <address@hidden>
Reviewed-by: Max Reitz <address@hidden>
Message-Id: <address@hidden>
Signed-off-by: Kevin Wolf <address@hidden>


  Commit: fbb92b6798894d3bf62fe3578d99fa62c720b242
      
https://github.com/qemu/qemu/commit/fbb92b6798894d3bf62fe3578d99fa62c720b242
  Author: Kevin Wolf <address@hidden>
  Date:   2020-04-07 (Tue, 07 Apr 2020)

  Changed paths:
    M block/block-backend.c
    M include/sysemu/block-backend.h

  Log Message:
  -----------
  block: Increase BB.in_flight for coroutine and sync interfaces

External callers of blk_co_*() and of the synchronous blk_*() functions
don't currently increase the BlockBackend.in_flight counter, but calls
from blk_aio_*() do, so there is an inconsistency whether the counter
has been increased or not.

This patch moves the actual operations to static functions that can
later know they will always be called with in_flight increased exactly
once, even for external callers using the blk_co_*() coroutine
interfaces.

If the public blk_co_*() interface is unused, remove it.

Signed-off-by: Kevin Wolf <address@hidden>
Message-Id: <address@hidden>
Reviewed-by: Max Reitz <address@hidden>
Signed-off-by: Kevin Wolf <address@hidden>


  Commit: 7f16476fab14fc32388e0ebae793f64673848efa
      
https://github.com/qemu/qemu/commit/7f16476fab14fc32388e0ebae793f64673848efa
  Author: Kevin Wolf <address@hidden>
  Date:   2020-04-07 (Tue, 07 Apr 2020)

  Changed paths:
    M block/block-backend.c

  Log Message:
  -----------
  block: Fix blk->in_flight during blk_wait_while_drained()

Waiting in blk_wait_while_drained() while blk->in_flight is increased
for the current request is wrong because it will cause the drain
operation to deadlock.

This patch makes sure that blk_wait_while_drained() is called with
blk->in_flight increased exactly once for the current request, and that
it temporarily decreases the counter while it waits.

Fixes: cf3129323f900ef5ddbccbe86e4fa801e88c566e
Signed-off-by: Kevin Wolf <address@hidden>
Reviewed-by: Vladimir Sementsov-Ogievskiy <address@hidden>
Reviewed-by: Max Reitz <address@hidden>
Message-Id: <address@hidden>
Signed-off-by: Kevin Wolf <address@hidden>


  Commit: 3f6de653b946fe849330208becf79d6af7e876cb
      
https://github.com/qemu/qemu/commit/3f6de653b946fe849330208becf79d6af7e876cb
  Author: Kevin Wolf <address@hidden>
  Date:   2020-04-07 (Tue, 07 Apr 2020)

  Changed paths:
    M block/vpc.c

  Log Message:
  -----------
  vpc: Don't round up already aligned BAT sizes

As reported on Launchpad, Azure apparently doesn't accept images for
upload that are not both aligned to 1 MB blocks and have a BAT size that
matches the image size exactly.

As far as I can tell, there is no real reason why we create a BAT that
is one entry longer than necessary for aligned image sizes, so change
that.

(Even though the condition is only mentioned as "should" in the spec and
previous products accepted larger BATs - but we'll try to maintain
compatibility with as many of Microsoft's ever-changing interpretations
of the VHD spec as possible.)

Fixes: https://bugs.launchpad.net/bugs/1870098
Reported-by: Tobias Witek
Signed-off-by: Kevin Wolf <address@hidden>
Message-Id: <address@hidden>
Reviewed-by: Max Reitz <address@hidden>
Reviewed-by: Philippe Mathieu-Daudé <address@hidden>
Signed-off-by: Kevin Wolf <address@hidden>


  Commit: 2f37b0222cf9274d014fcb1f211b14ee626561c9
      
https://github.com/qemu/qemu/commit/2f37b0222cf9274d014fcb1f211b14ee626561c9
  Author: Peter Maydell <address@hidden>
  Date:   2020-04-07 (Tue, 07 Apr 2020)

  Changed paths:
    M block/backup.c
    M block/block-backend.c
    M block/replication.c
    M block/vpc.c
    M blockdev.c
    M include/sysemu/block-backend.h
    M job-qmp.c
    M job.c
    M tests/test-blockjob.c

  Log Message:
  -----------
  Merge remote-tracking branch 'remotes/kevin/tags/for-upstream' into staging

Block layer patches:

- Fix crashes and hangs related to iothreads, bdrv_drain and block jobs:
    - Fix some AIO context locking in jobs
    - Fix blk->in_flight during blk_wait_while_drained()
- vpc: Don't round up already aligned BAT sizes

# gpg: Signature made Tue 07 Apr 2020 15:25:24 BST
# gpg:                using RSA key 7F09B272C88F2FD6
# gpg: Good signature from "Kevin Wolf <address@hidden>" [full]
# Primary key fingerprint: DC3D EB15 9A9A F95D 3D74  56FE 7F09 B272 C88F 2FD6

* remotes/kevin/tags/for-upstream:
  vpc: Don't round up already aligned BAT sizes
  block: Fix blk->in_flight during blk_wait_while_drained()
  block: Increase BB.in_flight for coroutine and sync interfaces
  block-backend: Reorder flush/pdiscard function definitions
  backup: don't acquire aio_context in backup_clean
  replication: assert we own context before job_cancel_sync
  job: take each job's lock individually in job_txn_apply

Signed-off-by: Peter Maydell <address@hidden>


Compare: https://github.com/qemu/qemu/compare/339205e7ef37...2f37b0222cf9

[Prev in Thread]

Current Thread

[Next in Thread]

[Qemu-commits] [qemu/qemu] b660a8: job: take each job's lock individually in job_txn_..., Peter Maydell <=

Prev by Date: [Qemu-commits] [qemu/qemu] 80f5c0: qcow2: Forbid discard in qcow2 v2 images with back...
Next by Date: [Qemu-commits] [qemu/qemu] 1aef27: hw/usb/xen-usb.c: Pass struct usbback_req* to usbb...
Previous by thread: [Qemu-commits] [qemu/qemu] 80f5c0: qcow2: Forbid discard in qcow2 v2 images with back...
Next by thread: [Qemu-commits] [qemu/qemu] 1aef27: hw/usb/xen-usb.c: Pass struct usbback_req* to usbb...
Index(es):
- Date
- Thread