[Qemu-commits] [qemu/qemu] 21e00f: memory: Replace skip_dump flag with "ram_device"


From: GitHub
Subject: [Qemu-commits] [qemu/qemu] 21e00f: memory: Replace skip_dump flag with "ram_device"
Date: Mon, 31 Oct 2016 12:30:04 -0700

  Branch: refs/heads/master
  Home:   https://github.com/qemu/qemu
  Commit: 21e00fa55f3fdfcbb20da7c6876c91ef3609b387
      
https://github.com/qemu/qemu/commit/21e00fa55f3fdfcbb20da7c6876c91ef3609b387
  Author: Alex Williamson <address@hidden>
  Date:   2016-10-31 (Mon, 31 Oct 2016)

  Changed paths:
    M hw/vfio/common.c
    M hw/vfio/spapr.c
    M include/exec/memory.h
    M memory.c
    M memory_mapping.c

  Log Message:
  -----------
  memory: Replace skip_dump flag with "ram_device"

Setting skip_dump on a MemoryRegion allows us to modify one specific
code path, but the restriction we're trying to address encompasses
more than that.  If we have a RAM MemoryRegion backed by a physical
device, it not only restricts our ability to dump that region, but
also affects how we should manipulate it.  Here we recognize that
MemoryRegions do not change to sometimes allow dumps and other times
not, so we replace setting the skip_dump flag with a new initializer
so that we know exactly the type of region to which we're applying
this behavior.

Signed-off-by: Alex Williamson <address@hidden>
Acked-by: Paolo Bonzini <address@hidden>


  Commit: 4a2e242bbb306ef5c16ce9e7bb2da3bd8a4eb098
      
https://github.com/qemu/qemu/commit/4a2e242bbb306ef5c16ce9e7bb2da3bd8a4eb098
  Author: Alex Williamson <address@hidden>
  Date:   2016-10-31 (Mon, 31 Oct 2016)

  Changed paths:
    M include/exec/memory.h
    M memory.c
    M trace-events

  Log Message:
  -----------
  memory: Don't use memcpy for ram_device regions

With a vfio assigned device we lay down a base MemoryRegion registered
as an IO region, giving us read & write accessors.  If the region
supports mmap, we lay down a higher priority sub-region MemoryRegion
on top of the base layer initialized as a RAM device pointer to the
mmap.  Finally, if we have any quirks for the device (i.e. address
ranges that need additional virtualization support), we put another IO
sub-region on top of the mmap MemoryRegion.  When this is flattened,
we now potentially have sub-page mmap MemoryRegions exposed which
cannot be directly mapped through KVM.
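
The layering described above can be sketched as a lookup over
prioritized, overlapping regions.  This is only an illustration: the
struct, enum and function names are hypothetical stand-ins, not QEMU's
MemoryRegion API, and QEMU flattens the whole tree rather than
resolving one address at a time.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for a sub-region (not QEMU's type). */
enum region_kind { REGION_IO, REGION_RAM_DEVICE };

struct region {
    const char *name;
    enum region_kind kind;
    uint64_t offset;    /* start of the region within the BAR */
    uint64_t size;      /* length in bytes */
    int priority;       /* higher priority wins where regions overlap */
    bool enabled;
};

/* Return the enabled region of highest priority covering addr, or NULL. */
const struct region *resolve_region(const struct region *r, size_t n,
                                    uint64_t addr)
{
    const struct region *best = NULL;

    assert(r != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        if (!r[i].enabled || addr < r[i].offset ||
            addr - r[i].offset >= r[i].size) {
            continue;   /* disabled, or addr is outside this region */
        }
        if (best == NULL || r[i].priority > best->priority) {
            best = &r[i];
        }
    }
    return best;
}
```

With an IO base layer at priority 0, an mmap layer at priority 1 and
an 8-byte quirk at priority 2, addresses on either side of the quirk
resolve to the mmap layer, which is how sub-page mmap pieces appear.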

This is as expected, but a subtle detail of this is that we end up
with two different access mechanisms through QEMU.  If we disable the
mmap MemoryRegion, we make use of the IO MemoryRegion and service
accesses using pread and pwrite to the vfio device file descriptor.
If the mmap MemoryRegion is enabled and results in one of these
sub-page gaps, QEMU handles the access as RAM, using memcpy to the
mmap.  Using either pread/pwrite or the mmap directly should be
correct, but using memcpy causes us problems.  I expect that not only
does memcpy not necessarily honor the original width and alignment in
performing a copy, but it potentially also uses processor instructions
not intended for MMIO spaces.  It turns out that this has been a
problem for Realtek NIC assignment, which has such a quirk that
creates a sub-page mmap MemoryRegion access.

To resolve this, we disable memory_access_is_direct() for ram_device
regions since QEMU assumes that it can use memcpy for those regions.
Instead we access through MemoryRegionOps, which replaces the memcpy
with simple de-references of standard sizes to the host memory.
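
As a rough sketch of what "simple de-references of standard sizes"
can look like, the helpers below perform exactly one load or store of
the requested width instead of calling memcpy.  The function names are
hypothetical, and the sketch assumes naturally aligned offsets, since
portable C cannot express an unaligned dereference; it is narrower
than what the patch itself permits.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helpers: read or write `size` bytes at base + offset with
 * a single access of that width.  Assumes offset is aligned to size.
 */
uint64_t ram_device_read(void *base, uint64_t offset, unsigned size)
{
    volatile uint8_t *p = (volatile uint8_t *)base + offset;

    assert(size == 1 || size == 2 || size == 4 || size == 8);
    switch (size) {
    case 1:  return *p;
    case 2:  return *(volatile uint16_t *)p;
    case 4:  return *(volatile uint32_t *)p;
    default: return *(volatile uint64_t *)p;
    }
}

void ram_device_write(void *base, uint64_t offset, unsigned size,
                      uint64_t value)
{
    volatile uint8_t *p = (volatile uint8_t *)base + offset;

    assert(size == 1 || size == 2 || size == 4 || size == 8);
    switch (size) {
    case 1:  *p = (uint8_t)value; break;
    case 2:  *(volatile uint16_t *)p = (uint16_t)value; break;
    case 4:  *(volatile uint32_t *)p = (uint32_t)value; break;
    default: *(volatile uint64_t *)p = value; break;
    }
}
```

The volatile qualifier keeps the compiler from merging or splitting
the access, which is the property memcpy does not guarantee.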

With this patch we attempt to provide unrestricted access to the RAM
device, allowing byte through qword access as well as unaligned
access.  The assumption here is that accesses initiated by the VM are
driven by a device specific driver, which knows the device
capabilities.  If unaligned accesses are not supported by the device,
we don't want them to work in a VM by performing multiple aligned
accesses to compose the unaligned access.  A down-side of this
philosophy is that the xp command from the monitor attempts to use
the largest available access width, unaware of the underlying
device.  Using memcpy had this same restriction, but at least now an
operator can dump individual registers, even if blocks of device
memory may result in access widths beyond the capabilities of a
given device (RTL NICs only support up to dword).

Reported-by: Thorsten Kohfeldt <address@hidden>
Signed-off-by: Alex Williamson <address@hidden>
Acked-by: Paolo Bonzini <address@hidden>


  Commit: 24acf72b9a291cebfd05f2ecdf3a982ac01e6291
      
https://github.com/qemu/qemu/commit/24acf72b9a291cebfd05f2ecdf3a982ac01e6291
  Author: Alex Williamson <address@hidden>
  Date:   2016-10-31 (Mon, 31 Oct 2016)

  Changed paths:
    M hw/vfio/common.c

  Log Message:
  -----------
  vfio: Handle zero-length sparse mmap ranges

As reported in the link below, a user has a PCI device with a 4KB BAR
which contains the MSI-X table.  This seems to hit a corner case in
the kernel where the region reports being mmap capable, but the sparse
mmap information reports a zero sized range.  It's not entirely clear
that the kernel is incorrect in doing this, but regardless, we need
to handle it.  To do this, fill our mmap array only with non-zero
sized sparse mmap entries and add an error return from the function
so we can tell the difference between nr_mmaps being zero based on
sparse mmap info vs lack of sparse mmap info.
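
A minimal sketch of that filtering follows.  The struct, the function
name, the use of a NULL input to model "no sparse mmap info" and the
choice of -ENODEV as the error value are all assumptions made for
illustration, not the code in hw/vfio/common.c.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical description of one sparse mmap range. */
struct sparse_area {
    uint64_t offset;
    uint64_t size;
};

/*
 * Copy only the non-zero-sized ranges from areas[] into out[], which must
 * have room for nr_areas entries, and store the count in *nr_out.
 * Returns -ENODEV when there is no sparse info at all (areas == NULL),
 * and 0 when sparse info was present, even if every range was empty.
 */
int collect_sparse_mmaps(const struct sparse_area *areas, size_t nr_areas,
                         struct sparse_area *out, size_t *nr_out)
{
    assert(out != NULL && nr_out != NULL);
    *nr_out = 0;
    if (areas == NULL) {
        return -ENODEV;
    }
    for (size_t i = 0; i < nr_areas; i++) {
        if (areas[i].size != 0) {
            out[(*nr_out)++] = areas[i];
        }
    }
    return 0;
}
```

A caller can then treat a return of 0 with a zero count as "sparse
info says nothing is mmap-able" and fall back to mapping the whole
region only on the error return.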

NB, this doesn't actually change the behavior of the device, it only
removes the scary "Failed to mmap ... Performance may be slow" error
message.  We cannot currently create an mmap over the MSI-X table.

Link: http://lists.nongnu.org/archive/html/qemu-discuss/2016-10/msg00009.html
Signed-off-by: Alex Williamson <address@hidden>


  Commit: a52a4c471703e995ceb06f6157d70747823e8a0d
      
https://github.com/qemu/qemu/commit/a52a4c471703e995ceb06f6157d70747823e8a0d
  Author: Ido Yariv <address@hidden>
  Date:   2016-10-31 (Mon, 31 Oct 2016)

  Changed paths:
    M hw/vfio/pci.c

  Log Message:
  -----------
  vfio/pci: fix out-of-sync BAR information on reset

When a PCI device is reset, pci_do_device_reset resets all BAR addresses
in the relevant PCIDevice's config buffer.

The VFIO configuration space stays untouched, so the guest OS may choose
to skip restoring the BAR addresses as they would seem intact. The PCI
device may be left non-operational.
One example of such a scenario is when the guest exits S3.

Fix this by resetting the BAR addresses in the VFIO configuration space
as well.
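
To illustrate the idea, the sketch below clears the address bits of
the six BARs in a copy of a type 0 PCI configuration header.  The
function and macro names are hypothetical, and preserving the low
type/flag bits is an assumption of this sketch; the actual patch
operates on the VFIO configuration space through QEMU's own helpers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define CFG_BAR0       0x10  /* offset of the first BAR in a type 0 header */
#define CFG_NUM_BARS   6
#define BAR_SPACE_IO   0x1u  /* bit 0 set: I/O space BAR */
#define BAR_TYPE_MASK  0x6u  /* bits 2:1: memory BAR type */
#define BAR_TYPE_64BIT 0x4u  /* bits 2:1 == 10b: 64-bit memory BAR */

static uint32_t get_le32(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

static void put_le32(uint8_t *p, uint32_t v)
{
    p[0] = (uint8_t)v;
    p[1] = (uint8_t)(v >> 8);
    p[2] = (uint8_t)(v >> 16);
    p[3] = (uint8_t)(v >> 24);
}

/* Zero the address bits of every BAR, keeping only the low flag bits. */
void clear_bar_addresses(uint8_t *config, size_t len)
{
    assert(config != NULL && len >= (size_t)(CFG_BAR0 + 4 * CFG_NUM_BARS));
    for (int i = 0; i < CFG_NUM_BARS; i++) {
        uint8_t *bar = config + CFG_BAR0 + 4 * i;
        uint32_t val = get_le32(bar);
        int is_io = (val & BAR_SPACE_IO) != 0;

        /* I/O BARs carry 2 flag bits, memory BARs carry 4. */
        put_le32(bar, is_io ? (val & 0x3u) : (val & 0xfu));
        if (!is_io && (val & BAR_TYPE_MASK) == BAR_TYPE_64BIT &&
            i + 1 < CFG_NUM_BARS) {
            put_le32(bar + 4, 0);   /* upper half holds address bits only */
            i++;
        }
    }
}
```

After this runs, a guest reading the BARs sees no stale addresses and
therefore has to reprogram them, for example on resume from S3.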

Signed-off-by: Ido Yariv <address@hidden>
Signed-off-by: Alex Williamson <address@hidden>


  Commit: 95251725e335af2b885e2ab33dd29c86f8084663
      
https://github.com/qemu/qemu/commit/95251725e335af2b885e2ab33dd29c86f8084663
  Author: Yongji Xie <address@hidden>
  Date:   2016-10-31 (Mon, 31 Oct 2016)

  Changed paths:
    M hw/vfio/common.c
    M hw/vfio/pci.c

  Log Message:
  -----------
  vfio: Add support for mmapping sub-page MMIO BARs

Kernel commit 05f0c03fbac1 ("vfio-pci: Allow to mmap
sub-page MMIO BARs if the mmio page is exclusive") now allows VFIO
to mmap sub-page BARs. This is the corresponding QEMU patch.
With both patches applied, sub-page BARs can be passed through
to the guest, which can improve I/O performance for some devices.

In this patch, we expand the MemoryRegions of these sub-page
MMIO BARs to PAGE_SIZE in vfio_pci_write_config(), so that
the BARs can be passed to the KVM ioctl KVM_SET_USER_MEMORY_REGION
with a valid size. The expanded size is reverted when the guest
changes the base address of a sub-page BAR so that it is no longer
page aligned. We also set the priority of these BARs' memory
regions to zero in case they overlap with BARs that share the
same page as the sub-page BARs in the guest.
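
The sizing rule can be sketched as a pure function of the BAR address
the guest programmed and the BAR's real size.  The function name and
the fixed 4 KiB page size are assumptions for illustration; the patch
itself resizes MemoryRegions from vfio_pci_write_config().

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_PAGE_SIZE 4096u  /* assumed host page size */

/*
 * Return the size to expose for a BAR: a sub-page BAR is expanded to a
 * full page only while its guest base address is page aligned; otherwise
 * (or for BARs of a page or more) the real size is used.
 */
uint64_t subpage_bar_mapped_size(uint64_t bar_addr, uint64_t bar_size)
{
    assert(bar_size > 0);
    if (bar_size >= SKETCH_PAGE_SIZE) {
        return bar_size;            /* not a sub-page BAR */
    }
    if (bar_addr & (SKETCH_PAGE_SIZE - 1)) {
        return bar_size;            /* unaligned base: keep the real size */
    }
    return SKETCH_PAGE_SIZE;        /* aligned base: expand to a page */
}
```

For example, a 256-byte BAR at 0xfe000000 would be exposed as 4096
bytes, while the same BAR moved to 0xfe000100 falls back to 256 bytes.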

Signed-off-by: Yongji Xie <address@hidden>
Signed-off-by: Alex Williamson <address@hidden>


  Commit: e80b4b8fb6babce7dcc91ea9ddeecbc351fd4646
      
https://github.com/qemu/qemu/commit/e80b4b8fb6babce7dcc91ea9ddeecbc351fd4646
  Author: Peter Maydell <address@hidden>
  Date:   2016-10-31 (Mon, 31 Oct 2016)

  Changed paths:
    M hw/vfio/common.c
    M hw/vfio/pci.c
    M hw/vfio/spapr.c
    M include/exec/memory.h
    M memory.c
    M memory_mapping.c
    M trace-events

  Log Message:
  -----------
  Merge remote-tracking branch 'remotes/awilliam/tags/vfio-updates-20161031.0' into staging

VFIO updates 2016-10-31

 - Replace skip_dump with ram_device to denote device memory and mark
   as non-direct to avoid memcpy to MMIO - fixes RTL (Alex Williamson)
 - Skip zero-length sparse mmaps - avoids unnecessary warning
   (Alex Williamson)
 - Clear BARs on reset so guest doesn't assume programming on return
   from S3 (Ido Yariv)
 - Enable sub-page MMIO mmaps - performance improvement for devices
   with smaller BARs, iff both host and guest map them to full,
   aligned pages (Yongji Xie)

# gpg: Signature made Mon 31 Oct 2016 17:26:47 GMT
# gpg:                using RSA key 0x239B9B6E3BB08B22
# gpg: Good signature from "Alex Williamson <address@hidden>"
# gpg:                 aka "Alex Williamson <address@hidden>"
# gpg:                 aka "Alex Williamson <address@hidden>"
# gpg:                 aka "Alex Williamson <address@hidden>"
# Primary key fingerprint: 42F6 C04E 540B D1A9 9E7B  8A90 239B 9B6E 3BB0 8B22

* remotes/awilliam/tags/vfio-updates-20161031.0:
  vfio: Add support for mmapping sub-page MMIO BARs
  vfio/pci: fix out-of-sync BAR information on reset
  vfio: Handle zero-length sparse mmap ranges
  memory: Don't use memcpy for ram_device regions
  memory: Replace skip_dump flag with "ram_device"

Signed-off-by: Peter Maydell <address@hidden>


Compare: https://github.com/qemu/qemu/compare/8ff7fd8a29e6...e80b4b8fb6ba
