From: Peter Maydell
Subject: [Qemu-commits] [qemu/qemu] a31ca6: qemu/queue.h: clear linked list pointers on remove
Date: Wed, 11 Mar 2020 17:15:12 +0000 (UTC)

  Branch: refs/heads/master
  Home:   https://github.com/qemu/qemu
  Commit: a31ca6801c027dbee2c589da85814b56eec563f6
      https://github.com/qemu/qemu/commit/a31ca6801c027dbee2c589da85814b56eec563f6
  Author: Stefan Hajnoczi <address@hidden>
  Date:   2020-03-09 (Mon, 09 Mar 2020)

  Changed paths:
    M include/qemu/queue.h

  Log Message:
  -----------
  qemu/queue.h: clear linked list pointers on remove

Do not leave stale linked list pointers around after removal.  It's
safer to set them to NULL so that use-after-removal results in an
immediate segfault.

The RCU queue removal macros are unchanged since nodes may still be
traversed after removal.

Suggested-by: Paolo Bonzini <address@hidden>
Signed-off-by: Stefan Hajnoczi <address@hidden>
Link: https://lore.kernel.org/r/address@hidden
Message-Id: <address@hidden>
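
The effect of clearing pointers on removal can be sketched as follows.
This is a minimal illustration only, not the macros in
include/qemu/queue.h: the struct layout and the names list_insert_head()
and list_remove() are assumptions made for the example.

```c
#include <stddef.h>

/* Hypothetical list node; not the real QLIST_ENTRY layout. */
struct node {
    struct node *next;   /* next element, or NULL at the tail */
    struct node **pprev; /* address of the pointer that points at us */
    int value;
};

/* Insert elm at the head of the list rooted at *head. */
void list_insert_head(struct node **head, struct node *elm)
{
    elm->next = *head;
    if (*head != NULL) {
        (*head)->pprev = &elm->next;
    }
    *head = elm;
    elm->pprev = head;
}

/* Unlink elm, then clear its pointers so no stale links remain. */
void list_remove(struct node *elm)
{
    if (elm->next != NULL) {
        elm->next->pprev = elm->pprev;
    }
    *elm->pprev = elm->next;
    elm->next = NULL;
    elm->pprev = NULL;
}
```

With this shape, calling list_remove() a second time on the same node
dereferences a NULL pprev and crashes at once instead of silently
corrupting the list.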


  Commit: c39cbedb54fc49ba41cfe0af36570818025d281e
      https://github.com/qemu/qemu/commit/c39cbedb54fc49ba41cfe0af36570818025d281e
  Author: Stefan Hajnoczi <address@hidden>
  Date:   2020-03-09 (Mon, 09 Mar 2020)

  Changed paths:
    M util/aio-posix.c

  Log Message:
  -----------
  aio-posix: remove confusing QLIST_SAFE_REMOVE()

QLIST_SAFE_REMOVE() is confusing here because the node must be on the
list.  We actually just wanted to clear the linked list pointers when
removing it from the list.  QLIST_REMOVE() now does this, so switch to
it.

Suggested-by: Paolo Bonzini <address@hidden>
Signed-off-by: Stefan Hajnoczi <address@hidden>
Link: https://lore.kernel.org/r/address@hidden
Message-Id: <address@hidden>


  Commit: e4346192f1c2e1683a807b46efac47ef0cf9b545
      https://github.com/qemu/qemu/commit/e4346192f1c2e1683a807b46efac47ef0cf9b545
  Author: Stefan Hajnoczi <address@hidden>
  Date:   2020-03-09 (Mon, 09 Mar 2020)

  Changed paths:
    M util/aio-posix.c

  Log Message:
  -----------
  aio-posix: completely stop polling when disabled

One iteration of polling is always performed even when polling is
disabled.  This is done because:
1. Userspace polling is cheaper than making a syscall.  We might get
   lucky.
2. We must poll once more after polling has stopped in case an event
   occurred while stopping polling.

However, there are downsides:
1. Polling becomes a bottleneck when the number of event sources is very
   high.  It's more efficient to monitor fds in that case.
2. A high-frequency polling event source can starve non-polling event
   sources because ppoll(2)/epoll(7) is never invoked.

This patch removes the forced polling iteration so that poll_ns=0 really
means no polling.

IOPS increases from 10k to 60k when the guest has 100
virtio-blk-pci,num-queues=32 devices and 1 virtio-blk-pci,num-queues=1
device because the large number of event sources being polled slows down
the event loop.

Signed-off-by: Stefan Hajnoczi <address@hidden>
Link: https://lore.kernel.org/r/address@hidden
Message-Id: <address@hidden>
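
The behavior change in this commit can be sketched roughly as below.
This is a simplified model, not the code in util/aio-posix.c: struct
loop, poll_once(), wait_on_fds() and loop_iteration() are hypothetical
names, and a single polling pass stands in for the real adaptive polling
loop.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical event loop state; not QEMU's AioContext. */
struct loop {
    int64_t poll_ns;    /* polling budget in nanoseconds, 0 = disabled */
    int poll_calls;     /* how many userspace polling passes ran */
    int wait_calls;     /* how many times we blocked on fds instead */
    bool event_pending; /* what a poll handler would find */
};

/* One userspace polling pass over the simulated event sources. */
bool poll_once(struct loop *l)
{
    l->poll_calls++;
    return l->event_pending;
}

/* Stand-in for blocking in ppoll(2) or epoll(7). */
bool wait_on_fds(struct loop *l)
{
    l->wait_calls++;
    return l->event_pending;
}

/* One loop iteration: polling is skipped completely when poll_ns is 0. */
bool loop_iteration(struct loop *l)
{
    if (l->poll_ns > 0 && poll_once(l)) {
        return true; /* polling made progress, no syscall needed */
    }
    return wait_on_fds(l);
}
```

In this model a loop with poll_ns set to 0 never touches the poll
handlers, so their number no longer affects the cost of an iteration.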


  Commit: 3aa221b382c9b36db1750ef5ed340b6566aacb8c
      https://github.com/qemu/qemu/commit/3aa221b382c9b36db1750ef5ed340b6566aacb8c
  Author: Stefan Hajnoczi <address@hidden>
  Date:   2020-03-09 (Mon, 09 Mar 2020)

  Changed paths:
    M util/aio-posix.c

  Log Message:
  -----------
  aio-posix: move RCU_READ_LOCK() into run_poll_handlers()

Now that run_poll_handlers_once() is only called by run_poll_handlers(),
we can improve the CPU time profile by moving the expensive
RCU_READ_LOCK() out of the polling loop.

This reduces run_poll_handlers() from 40% CPU to 10% CPU in perf's
sampling profiler output.

Signed-off-by: Stefan Hajnoczi <address@hidden>
Link: https://lore.kernel.org/r/address@hidden
Message-Id: <address@hidden>
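
The hoisting described in this commit follows a common pattern, sketched
here with counters instead of real RCU primitives. All names
(struct stats, fake_read_lock(), run_locked_per_pass(),
run_locked_once()) are assumptions made for the example; the real code
uses QEMU's RCU API.

```c
/* Hypothetical counters standing in for RCU read-side sections. */
struct stats {
    int lock_calls;    /* how often the read lock was taken */
    int handler_calls; /* how many polling passes ran */
};

void fake_read_lock(struct stats *s)   { s->lock_calls++; }
void fake_read_unlock(struct stats *s) { (void)s; }

/* One pass over the poll handlers. */
void poll_pass(struct stats *s) { s->handler_calls++; }

/* Before: the lock is taken and released in every pass. */
void run_locked_per_pass(struct stats *s, int passes)
{
    for (int i = 0; i < passes; i++) {
        fake_read_lock(s);
        poll_pass(s);
        fake_read_unlock(s);
    }
}

/* After: the lock is taken once around the whole polling loop. */
void run_locked_once(struct stats *s, int passes)
{
    fake_read_lock(s);
    for (int i = 0; i < passes; i++) {
        poll_pass(s);
    }
    fake_read_unlock(s);
}
```

Both variants run the same number of passes; only the number of lock
operations differs, which is where the profile improvement would come
from.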


  Commit: 1f050a4690f62a1e7dabc4f44141e9f762c3769f
      https://github.com/qemu/qemu/commit/1f050a4690f62a1e7dabc4f44141e9f762c3769f
  Author: Stefan Hajnoczi <address@hidden>
  Date:   2020-03-09 (Mon, 09 Mar 2020)

  Changed paths:
    M MAINTAINERS
    M include/block/aio.h
    M util/Makefile.objs
    M util/aio-posix.c
    A util/aio-posix.h
    A util/fdmon-epoll.c
    A util/fdmon-poll.c

  Log Message:
  -----------
  aio-posix: extract ppoll(2) and epoll(7) fd monitoring

The ppoll(2) and epoll(7) file descriptor monitoring implementations are
mixed with the core util/aio-posix.c code.  Before adding another
implementation for Linux io_uring, extract out the existing
ones so there is a clear interface and the core code is simpler.

The new interface is AioContext->fdmon_ops, a pointer to a FDMonOps
struct.  See the patch for details.

Semantic changes:
1. ppoll(2) now reflects events from pollfds[] back into AioHandlers
   while we're still on the clock for adaptive polling.  This was
   already happening for epoll(7), so if it's really an issue then we'll
   need to fix both in the future.
2. epoll(7)'s fallback to ppoll(2) while external events are disabled
   was broken when the number of fds exceeded the epoll(7) upgrade
   threshold.  I guess this code path simply wasn't tested and no one
   noticed the bug.  I didn't go out of my way to fix it but the correct
   code is simpler than preserving the bug.

I also took some liberties in removing the unnecessary
AioContext->epoll_available (just check AioContext->epollfd != -1
instead) and AioContext->epoll_enabled (it's implicit if our
AioContext->fdmon_ops callbacks are being invoked) fields.

Signed-off-by: Stefan Hajnoczi <address@hidden>
Link: https://lore.kernel.org/r/address@hidden
Message-Id: <address@hidden>
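
The general shape of such an ops-table interface can be sketched as
below. This is not the FDMonOps definition from the patch: struct
mon_ops, its single wait() member, the two stub backends and the
UPGRADE_THRESHOLD value are all assumptions chosen for illustration.

```c
struct ctx;

/* Hypothetical ops table in the spirit of FDMonOps. */
struct mon_ops {
    const char *name;
    /* Wait for fds to become ready; returns the number of ready fds. */
    int (*wait)(struct ctx *c);
};

struct ctx {
    const struct mon_ops *ops; /* plays the role of AioContext->fdmon_ops */
    int nfds;                  /* number of monitored fds */
    int ready;                 /* simulated number of ready fds */
};

/* Two stub backends; real ones would call ppoll(2) or epoll_wait(2). */
int scan_wait(struct ctx *c)  { return c->ready; }
int queue_wait(struct ctx *c) { return c->ready; }

const struct mon_ops scan_ops  = { "scan",  scan_wait };
const struct mon_ops queue_ops = { "queue", queue_wait };

#define UPGRADE_THRESHOLD 64 /* assumed value, for illustration only */

/* Pick a backend based on how many fds are monitored. */
void ctx_set_nfds(struct ctx *c, int nfds)
{
    c->nfds = nfds;
    c->ops = nfds >= UPGRADE_THRESHOLD ? &queue_ops : &scan_ops;
}

/* Core code only ever goes through the ops pointer. */
int ctx_wait(struct ctx *c)
{
    return c->ops->wait(c);
}
```

Because the core only calls through c->ops, switching backends is a
single pointer assignment, and no separate "enabled" flag is needed to
know which implementation is active.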


  Commit: b321051cf48ccc2d3d832af111d688f2282f089b
      https://github.com/qemu/qemu/commit/b321051cf48ccc2d3d832af111d688f2282f089b
  Author: Stefan Hajnoczi <address@hidden>
  Date:   2020-03-09 (Mon, 09 Mar 2020)

  Changed paths:
    M include/block/aio.h
    M util/aio-posix.c
    M util/fdmon-epoll.c
    M util/fdmon-poll.c

  Log Message:
  -----------
  aio-posix: simplify FDMonOps->update() prototype

The AioHandler *node, bool is_new arguments are more complicated to
think about than simply being given AioHandler *old_node, AioHandler
*new_node.

Furthermore, the new Linux io_uring file descriptor monitoring mechanism
added by the next patch requires access to both the old and the new
nodes.  Make this change now in preparation.

Signed-off-by: Stefan Hajnoczi <address@hidden>
Link: https://lore.kernel.org/r/address@hidden
Message-Id: <address@hidden>
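
One way to read an (old_node, new_node) pair is sketched below. The
convention that a NULL old node means "add" and a NULL new node means
"remove" is an assumption made for this example, as are struct handler
and classify_update(); see the patch for the actual contract.

```c
#include <stddef.h>

/* Hypothetical handler record; not QEMU's AioHandler. */
struct handler {
    int fd;
    int events;
};

enum change { CHANGE_NONE, CHANGE_ADD, CHANGE_REMOVE, CHANGE_MODIFY };

/* Derive the kind of update from which of the two nodes is present. */
enum change classify_update(const struct handler *old_node,
                            const struct handler *new_node)
{
    if (old_node == NULL && new_node == NULL) {
        return CHANGE_NONE;
    }
    if (old_node == NULL) {
        return CHANGE_ADD;    /* nothing was registered before */
    }
    if (new_node == NULL) {
        return CHANGE_REMOVE; /* nothing replaces the old node */
    }
    return CHANGE_MODIFY;     /* both present: replace old with new */
}
```

A backend that needs to cancel the old registration and submit a new one
can then reach both nodes in a single call.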


  Commit: 73fd282e7b6dd4e4ea1c3bbb3d302c8db51e4ccf
      https://github.com/qemu/qemu/commit/73fd282e7b6dd4e4ea1c3bbb3d302c8db51e4ccf
  Author: Stefan Hajnoczi <address@hidden>
  Date:   2020-03-09 (Mon, 09 Mar 2020)

  Changed paths:
    M configure
    M include/block/aio.h
    M util/Makefile.objs
    M util/aio-posix.c
    M util/aio-posix.h
    A util/fdmon-io_uring.c

  Log Message:
  -----------
  aio-posix: add io_uring fd monitoring implementation

The recent Linux io_uring API has several advantages over ppoll(2) and
epoll(7).  Details are given in the source code.

Add an io_uring implementation and make it the default on Linux.
Performance is the same as with epoll(7) but later patches add
optimizations that take advantage of io_uring.

It is necessary to change how aio_set_fd_handler() deals with deleting
AioHandlers since removing monitored file descriptors is asynchronous in
io_uring.  fdmon_io_uring_remove() marks the AioHandler deleted and
aio_set_fd_handler() will let it handle deletion in that case.

Signed-off-by: Stefan Hajnoczi <address@hidden>
Link: https://lore.kernel.org/r/address@hidden
Message-Id: <address@hidden>
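
The deferred-deletion handshake described above can be modeled roughly
as follows. This is a sketch under assumptions, not the code in
util/fdmon-io_uring.c: the boolean return convention, struct handler,
and every function name here are hypothetical.

```c
#include <stdbool.h>

/* Hypothetical handler record; "freed" stands in for actual freeing. */
struct handler {
    int fd;
    bool deleted; /* marked for deletion, backend still references it */
    bool freed;   /* memory would have been released */
};

/* Returns true if the backend takes over deletion of the handler. */
typedef bool (*remove_cb)(struct handler *h);

/* Synchronous backend: nothing outstanding, caller deletes right away. */
bool sync_remove(struct handler *h)
{
    (void)h;
    return false;
}

/* Asynchronous backend: mark the handler and finish deletion later. */
bool async_remove(struct handler *h)
{
    h->deleted = true;
    return true;
}

/* Caller side: delete immediately unless the backend took ownership. */
void unregister_handler(struct handler *h, remove_cb cb)
{
    if (!cb(h)) {
        h->freed = true;
    }
}

/* Backend side: run once the kernel has confirmed the removal. */
void async_complete_removal(struct handler *h)
{
    if (h->deleted) {
        h->freed = true;
    }
}
```

The point of the split is that a handler is never released while an
asynchronous removal request may still refer to it.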


  Commit: aa38e19f05c3a5ae64dff84f44e1aa31281a5b14
      https://github.com/qemu/qemu/commit/aa38e19f05c3a5ae64dff84f44e1aa31281a5b14
  Author: Stefan Hajnoczi <address@hidden>
  Date:   2020-03-09 (Mon, 09 Mar 2020)

  Changed paths:
    M include/block/aio.h
    M util/aio-posix.c
    M util/fdmon-epoll.c
    M util/fdmon-io_uring.c
    M util/fdmon-poll.c

  Log Message:
  -----------
  aio-posix: support userspace polling of fd monitoring

Unlike ppoll(2) and epoll(7), Linux io_uring completions can be polled
from userspace.  Previously userspace polling was only allowed when all
AioHandlers had an ->io_poll() callback.  This prevented starvation of
fds by userspace pollable handlers.

Add the FDMonOps->need_wait() callback that enables userspace polling
even when some AioHandlers lack ->io_poll().

For example, it's now possible to do userspace polling when a TCP/IP
socket is monitored thanks to Linux io_uring.

Signed-off-by: Stefan Hajnoczi <address@hidden>
Link: https://lore.kernel.org/r/address@hidden
Message-Id: <address@hidden>
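
A rough model of how a need_wait()-style callback could gate userspace
polling is shown below. The semantics are an assumption made for the
example (the callback returns true when the loop must stop polling and
call into the fd monitoring backend), and struct ctx and all function
names are hypothetical.

```c
#include <stdbool.h>

/* Hypothetical context; not QEMU's AioContext. */
struct ctx {
    int handlers_without_io_poll; /* handlers that only have an fd */
    int completions_ready;        /* events visible from userspace */
};

typedef bool (*need_wait_cb)(const struct ctx *c);

/* ppoll/epoll-like backend: fds can only be checked with a syscall, so
 * any handler without io_poll() forces a wait to avoid starving it. */
bool pollfd_need_wait(const struct ctx *c)
{
    return c->handlers_without_io_poll > 0;
}

/* Ring-like backend: readiness is visible from userspace, so waiting is
 * only needed once completions have actually arrived. */
bool ring_need_wait(const struct ctx *c)
{
    return c->completions_ready > 0;
}

/* Polling may continue for as long as the backend does not need a wait. */
bool may_keep_polling(const struct ctx *c, need_wait_cb need_wait)
{
    return !need_wait(c);
}
```

Under these assumptions, a context that monitors a plain socket can
still poll with the ring-like backend, while the pollfd-like backend
must fall back to waiting.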


  Commit: d37d0e365afb6825a90d8356fc6adcc1f58f40f3
      https://github.com/qemu/qemu/commit/d37d0e365afb6825a90d8356fc6adcc1f58f40f3
  Author: Stefan Hajnoczi <address@hidden>
  Date:   2020-03-09 (Mon, 09 Mar 2020)

  Changed paths:
    M include/block/aio.h
    M util/aio-posix.c
    M util/aio-posix.h
    M util/trace-events

  Log Message:
  -----------
  aio-posix: remove idle poll handlers to improve scalability

When there are many poll handlers it's likely that some of them are idle
most of the time.  Remove handlers that haven't had activity recently so
that the polling loop scales better for guests with a large number of
devices.

This feature only takes effect for the Linux io_uring fd monitoring
implementation because it is capable of combining fd monitoring with
userspace polling.  The other implementations can't do that and risk
starving fds in favor of poll handlers, so don't try this optimization
when they are in use.

IOPS improves from 10k to 105k when the guest has 100
virtio-blk-pci,num-queues=32 devices and 1 virtio-blk-pci,num-queues=1
device for rw=randread,iodepth=1,bs=4k,ioengine=libaio on NVMe.

[Clarified aio_poll_handlers locking discipline explanation in comment
after discussion with Paolo Bonzini <address@hidden>.
--Stefan]

Signed-off-by: Stefan Hajnoczi <address@hidden>
Link: https://lore.kernel.org/r/address@hidden
Message-Id: <address@hidden>
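
The pruning idea can be sketched with a plain array. This is an
illustration only: the real implementation works on the AioContext's
handler lists, and struct poll_handler, prune_idle() and the timestamp
scheme here are assumptions.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical poll handler record with a last-activity timestamp. */
struct poll_handler {
    int id;
    int64_t last_active_ns;
};

/*
 * Compact the array in place, keeping only handlers that were active
 * within the last idle_ns nanoseconds.  Returns the new element count.
 */
size_t prune_idle(struct poll_handler *h, size_t n,
                  int64_t now_ns, int64_t idle_ns)
{
    size_t kept = 0;

    for (size_t i = 0; i < n; i++) {
        if (now_ns - h[i].last_active_ns <= idle_ns) {
            h[kept++] = h[i];
        }
    }
    return kept;
}
```

A pruned handler is not lost in the design the commit describes: its fd
stays registered with the fd monitoring backend, so activity is still
noticed there while the polling loop iterates over fewer entries.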


  Commit: 6e8a73e911f066527e775e04b98f31ebd19db600
      https://github.com/qemu/qemu/commit/6e8a73e911f066527e775e04b98f31ebd19db600
  Author: Peter Maydell <address@hidden>
  Date:   2020-03-11 (Wed, 11 Mar 2020)

  Changed paths:
    M MAINTAINERS
    M configure
    M include/block/aio.h
    M include/qemu/queue.h
    M util/Makefile.objs
    M util/aio-posix.c
    A util/aio-posix.h
    A util/fdmon-epoll.c
    A util/fdmon-io_uring.c
    A util/fdmon-poll.c
    M util/trace-events

  Log Message:
  -----------
  Merge remote-tracking branch 'remotes/stefanha/tags/block-pull-request' into staging

Pull request

# gpg: Signature made Wed 11 Mar 2020 12:40:36 GMT
# gpg:                using RSA key 8695A8BFD3F97CDAAC35775A9CA4ABB381AB73C8
# gpg: Good signature from "Stefan Hajnoczi <address@hidden>" [full]
# gpg:                 aka "Stefan Hajnoczi <address@hidden>" [full]
# Primary key fingerprint: 8695 A8BF D3F9 7CDA AC35  775A 9CA4 ABB3 81AB 73C8

* remotes/stefanha/tags/block-pull-request:
  aio-posix: remove idle poll handlers to improve scalability
  aio-posix: support userspace polling of fd monitoring
  aio-posix: add io_uring fd monitoring implementation
  aio-posix: simplify FDMonOps->update() prototype
  aio-posix: extract ppoll(2) and epoll(7) fd monitoring
  aio-posix: move RCU_READ_LOCK() into run_poll_handlers()
  aio-posix: completely stop polling when disabled
  aio-posix: remove confusing QLIST_SAFE_REMOVE()
  qemu/queue.h: clear linked list pointers on remove

Signed-off-by: Peter Maydell <address@hidden>


Compare: https://github.com/qemu/qemu/compare/ba29883206d9...6e8a73e911f0


