From: Brian Yglesias
Subject: Re: [Qemu-discuss] Failing to get PCI pass-through/multi-seat working with any multidisk configuration
Date: Tue, 13 Sep 2016 15:12:07 -0700 (PDT)
I've narrowed the problem down to the use of multiple disks in general.
Neither guest (nor the host) crashes as long as all virtual disks are on the
same physical media.
Adding a second disk to one of the VMs, irrespective of the
partitioning/RAID/volume management, will always trigger a crash in both
VMs.
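For concreteness, an extra disk attachment of this shape is all it takes;
the pool and volume names here are hypothetical, but the syntax matches the
full launch commands quoted below:

  # Second virtio disk on separate physical media (names hypothetical)
  -drive file=/dev/zvol/HDD-pool/vm-110-disk-2,if=none,id=drive-virtio1,cache=writeback,format=raw,aio=threads,detect-zeroes=on \
  -device virtio-blk-pci,drive=drive-virtio1,id=virtio1,bus=pci.0,addr=0xb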
As a reminder, the conditions that trigger this are PCI pass-through of a
GPU to at least one VM, and it occurs at least on certain (maybe all)
LGA1366 motherboards. (I have not tried dual-socket chipsets; is there a
chance that might help? I'd really rather not liquidate all this hardware
and buy more, again.)
I posted a bug report, but all the information I have about the matter is
also in this thread:
https://bugs.launchpad.net/qemu/+bug/1619991
Any help is much appreciated, as I don't know what else to do.
-Brian
-----Original Message-----
From: Brian Yglesias [mailto:address@hidden]
Sent: Tuesday, August 16, 2016 11:06 PM
To: qemu-discuss <address@hidden>
Subject: Re: Failing to get PCI pass-through/multi-seat working with any
multidisk configuration
I forgot to mention that I can assign 2 GPUs to 1 VM. The problem only
occurs with two concurrent VMs with 1 GPU each.
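That is, a single VM with all four functions of both cards is stable;
sketched here with the same device syntax as the full commands below (the
exact port/addr layout is illustrative):

  # Works: both GPUs (04:00.x and 05:00.x) assigned to the same VM
  -device vfio-pci,host=04:00.0,id=hostpci0,bus=ich9-pcie-port-1,addr=0x0 \
  -device vfio-pci,host=04:00.1,id=hostpci1,bus=ich9-pcie-port-2,addr=0x0 \
  -device vfio-pci,host=05:00.0,id=hostpci2,bus=ich9-pcie-port-3,addr=0x0 \
  -device vfio-pci,host=05:00.1,id=hostpci3,bus=ich9-pcie-port-4,addr=0x0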
----- Original Message -----
From: "Brian Yglesias" <address@hidden>
To: "qemu-discuss" <address@hidden>
Sent: Monday, August 15, 2016 5:24:36 AM
Subject: Failing to get PCI pass-through/multi-seat working with any
multidisk configuration
Hello everyone.
It seems the only way I can get multi-seat to work is by having the OS and
the VMs on a single disk, and after weeks of futility I'm starting to wonder
if I can even replicate that.
I have two VMs which work surprisingly well with VFIO/IOMMU, unless I run
them concurrently. If I do, then the display driver will crash on one VM
followed shortly by the other. I've replicated this problem with multiple
kernels from 4.2.1 to 4.7.X, and on two X58/LGA1366 MBs, so I suspect it
affects most or all of them, at least when used with Debian / Proxmox.
There is nothing in the system logs to indicate why.
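(Roughly, the host-side checks come up empty, e.g.:

  # journalctl -b --no-pager | tail -n 200      # nothing around the crash time
  # dmesg | grep -i -e vfio -e dmar -e error    # likewise nothing relevant
)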
Here are the specs on the system I'm currently working on.
Distro: Debian 8 / Proxmox 4.2
MB: Asus Rampage III
CPU: Xeon X5670
RAM: 24 GB
DISK1: OS - XFS/LVM
DISK2-4: VMs - ZFS RAIDZ-1
I've also seen the same on a GA-EX58 motherboard, set up identically.
I've tried ZFS, and MDADM with and without LVM; I've tried MDADM RAID levels
5, 1, and even 0.
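(A typical MDADM attempt looked roughly like this; the device names and
sizes are placeholders, not my exact commands:

  # mdadm --create /dev/md0 --level=5 --raid-devices=3 /dev/sdb /dev/sdc /dev/sdd
  # pvcreate /dev/md0                          # the LVM-on-top variant
  # vgcreate vmdata /dev/md0
  # lvcreate -L 100G -n vm-110-disk-1 vmdata
)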
I thought for sure that in the worst case scenario I would be able to assign
a VM per disk. Not so.
Oddly, it's actually gotten worse: before, I would need to start something
3D on both VMs to reliably crash them both (usually within seconds of each
other). Now all I need to do is start the second one, and the display driver
will crash on the first. (The fact that both VMs always crash has to be
indicative of something, but I'm not sure what.)
I'm pretty much back at the drawing board. I'm actually starting to doubt
that my 'single disk test' really worked. Maybe I just didn't run it long
enough? So I will try that again. Unfortunately, I only have spindle disks
large enough to hold everything on hand right now, so it won't be an exact
replica.
Beyond that, I really don't know. I currently have the system set up in
almost the most basic way I can to have something acceptable:
- OS on a single 120 GB SSD
- VM root pool on three 240 GB SSDs, RAID-Z1
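(The pool was created along these lines; the device names and the zvol size
are placeholders:

  # zpool create SSD-pool raidz1 /dev/sdb /dev/sdc /dev/sdd
  # zfs create -V 100G SSD-pool/vm-110-disk-1    # zvol backing a VM disk
)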
Soft rebooting a VM will always cause that VM's display to get garbled on
POST. I don't even have to get into Windows; if that happens, I know the VM
is beyond salvation, and the second one is going down too.
I'm beginning to think this is somehow tied to my X58 chipset motherboards
(it happens identically on both a Gigabyte and an Asus board with that
chipset), or to the qemu/kvm that comes with Proxmox. A third possibility
may be some server-oriented tuning cooked into Proxmox. (Maybe I'll do
single disk this time with regular Debian, and see if there is some change.)
Proxmox has a bug which sets hv_vendor_id to 'Proxmox' rather than
'Nvidia43Fix', which causes a Code 43 error in the Nvidia driver (Device
Manager says it "has reported a problem and has been stopped", or some
such). As a result, I launch the VMs from the console based on the tweaked
output of 'qm showcmd <vmid>'.
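The workflow is roughly this (qm showcmd is stock Proxmox; the exact default
vendor-id string in the sed pattern is an assumption on my part):

  # qm showcmd 110 > /root/src/brian.1
  # sed -i 's/hv_vendor_id=Proxmox/hv_vendor_id=Nvidia43FIX/' /root/src/brian.1
  # bash /root/src/brian.1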
VM1:
# sed -e 's/#.*$//' -e '/^$/d' /root/src/brian.1
/usr/bin/systemd-run \
  --scope \
  --slice qemu \
  --unit 110 \
  -p KillMode=none \
  -p CPUShares=250000 \
  /usr/bin/kvm \
  -id 110 \
  -chardev socket,id=qmp,path=/var/run/qemu-server/110.qmp,server,nowait \
  -mon chardev=qmp,mode=control \
  -pidfile /var/run/qemu-server/110.pid \
  -daemonize \
  -smbios type=1,uuid=6a9ea4a2-48bd-415e-95fb-adf8c9db44f7 \
  -drive if=pflash,format=raw,readonly,file=/usr/share/kvm/OVMF-pure-efi.fd \
  -drive if=pflash,format=raw,file=/root/sbin/110-OVMF_VARS-pure-efi.fd \
  -name Brian-PC \
  -smp 12,sockets=1,cores=12,maxcpus=12 \
  -nodefaults \
  -boot menu=on,strict=on,reboot-timeout=1000 \
  -vga none \
  -nographic \
  -no-hpet \
  -cpu host,hv_vendor_id=Nvidia43FIX,hv_spinlocks=0x1fff,hv_vapic,hv_time,hv_relaxed,+kvm_pv_unhalt,+kvm_pv_eoi,kvm=off \
  -m 8192 \
  -object memory-backend-ram,size=8192M,id=ram-node0 \
  -numa node,nodeid=0,cpus=0-11,memdev=ram-node0 \
  -k en-us \
  -readconfig /usr/share/qemu-server/pve-q35.cfg \
  -device usb-tablet,id=tablet,bus=ehci.0,port=1 \
  -device vfio-pci,host=04:00.0,id=hostpci0,bus=ich9-pcie-port-1,addr=0x0 \
  -device vfio-pci,host=04:00.1,id=hostpci1,bus=ich9-pcie-port-2,addr=0x0 \
  -device usb-host,hostbus=1,hostport=6.1 \
  -device usb-host,hostbus=1,hostport=6.2.1 \
  -device usb-host,hostbus=1,hostport=6.2.2 \
  -device usb-host,hostbus=1,hostport=6.2.3 \
  -device usb-host,hostbus=1,hostport=6.2 \
  -device usb-host,hostbus=1,hostport=6.3 \
  -device usb-host,hostbus=1,hostport=6.4 \
  -device usb-host,hostbus=1,hostport=6.5 \
  -device virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x3 \
  -drive file=/dev/zvol/SSD-pool/vm-110-disk-1,if=none,id=drive-virtio0,cache=writeback,format=raw,aio=threads,detect-zeroes=on \
  -device virtio-blk-pci,drive=drive-virtio0,id=virtio0,bus=pci.0,addr=0xa,bootindex=100 \
  -netdev type=tap,id=net0,ifname=tap110i0,script=/var/lib/qemu-server/pve-bridge,downscript=/var/lib/qemu-server/pve-bridgedown,vhost=on \
  -device virtio-net-pci,mac=32:61:36:63:37:64,netdev=net0,bus=pci.0,addr=0x12,id=net0 \
  -rtc driftfix=slew,base=localtime \
  -machine type=q35 \
  -global kvm-pit.lost_tick_policy=discard
VM2:
# sed -e 's/#.*$//' -e '/^$/d' /root/src/madzia.2
/usr/bin/systemd-run \
  --scope \
  --slice qemu \
  --unit 111 \
  -p KillMode=none \
  -p CPUShares=250000 \
  /usr/bin/kvm \
  -id 111 \
  -chardev socket,id=qmp,path=/var/run/qemu-server/111.qmp,server,nowait \
  -mon chardev=qmp,mode=control \
  -pidfile /var/run/qemu-server/111.pid \
  -daemonize \
  -smbios type=1,uuid=55d862f4-d9b9-40ab-9b0a-e1eadf874750 \
  -drive if=pflash,format=raw,readonly,file=/usr/share/kvm/OVMF-pure-efi.fd \
  -drive if=pflash,format=raw,file=/root/sbin/111-OVMF_VARS-pure-efi.fd \
  -name Madzia-PC \
  -smp 12,sockets=1,cores=12,maxcpus=12 \
  -nodefaults \
  -boot menu=on,strict=on,reboot-timeout=1000 \
  -vga none \
  -nographic \
  -no-hpet \
  -cpu host,hv_vendor_id=Nvidia43FIX,hv_spinlocks=0x1fff,hv_vapic,hv_time,hv_relaxed,+kvm_pv_unhalt,+kvm_pv_eoi,kvm=off \
  -m 8192 \
  -object memory-backend-ram,size=8192M,id=ram-node0 \
  -numa node,nodeid=0,cpus=0-11,memdev=ram-node0 \
  -k en-us \
  -readconfig /usr/share/qemu-server/pve-q35.cfg \
  -device usb-tablet,id=tablet,bus=ehci.0,port=1 \
  -device vfio-pci,host=05:00.0,id=hostpci0,bus=ich9-pcie-port-1,addr=0x0 \
  -device vfio-pci,host=05:00.1,id=hostpci1,bus=ich9-pcie-port-2,addr=0x0 \
  -device usb-host,hostbus=2,hostport=2.1 \
  -device usb-host,hostbus=2,hostport=2.2 \
  -device usb-host,hostbus=2,hostport=2.3 \
  -device usb-host,hostbus=2,hostport=2.4 \
  -device usb-host,hostbus=2,hostport=2.5 \
  -device virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x3 \
  -iscsi initiator-name=iqn.1993-08.org.debian:01:1530d013b944 \
  -drive file=/dev/zvol/SSD-pool/vm-111-disk-1,if=none,id=drive-virtio0,cache=writeback,format=raw,aio=threads,detect-zeroes=on \
  -device virtio-blk-pci,drive=drive-virtio0,id=virtio0,bus=pci.0,addr=0xa,bootindex=100 \
  -netdev type=tap,id=net0,ifname=tap111i0,script=/var/lib/qemu-server/pve-bridge,downscript=/var/lib/qemu-server/pve-bridgedown,vhost=on \
  -device virtio-net-pci,mac=4E:F0:DD:90:DB:2D,netdev=net0,bus=pci.0,addr=0x12,id=net0 \
  -rtc driftfix=slew,base=localtime \
  -machine type=q35 \
  -global kvm-pit.lost_tick_policy=discard
However, I've tried many invocations of KVM without success.
Here is how I load my modules:
# cat /etc/modprobe.d/iommu_unsafe_interrupts.conf
options vfio_iommu_type1 allow_unsafe_interrupts=1
# cat /etc/modprobe.d/vfio_pci.conf
options vfio_pci disable_vga=1
#install vfio_pci /root/sbin/vfio-pci-override-vga.sh
options vfio-pci ids=10de:13c2,10de:0fbb,10de:11c0,10de:0e0b
# cat /etc/modprobe.d/zfs.conf
options zfs zfs_arc_max=4299967296
# cat /etc/modprobe.d/kvm.conf
options kvm ignore_msrs=1
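(To confirm the IDs above actually end up bound to vfio-pci, checking the
four functions of the two cards works; each should report "Kernel driver in
use: vfio-pci":

  # lspci -nnk -s 04:00.0
  # lspci -nnk -s 04:00.1
  # lspci -nnk -s 05:00.0
  # lspci -nnk -s 05:00.1
)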
... I believe grub is set up correctly ...
# sed -e 's/#.*$//' -e '/^$/d' /etc/default/grub
GRUB_DEFAULT=0
GRUB_TIMEOUT=5
GRUB_DISTRIBUTOR="Proxmox Virtual Environment"
GRUB_CMDLINE_LINUX_DEFAULT="intel_iommu=on
vfio_iommu_type1.allow_unsafe_interrupts=1 quiet"
GRUB_CMDLINE_LINUX=""
GRUB_DISABLE_OS_PROBER=true
GRUB_DISABLE_RECOVERY="true"
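(After editing that file I run update-grub; verifying on the next boot is
straightforward:

  # update-grub
  # cat /proc/cmdline                    # should contain intel_iommu=on
  # dmesg | grep -i -e DMAR -e IOMMU     # confirms the IOMMU came up
)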
... I believe I have all the correct modules loaded on boot ...
# sed -e 's/#.*$//' -e '/^$/d' /etc/modules
coretemp
it87
vfio
vfio_iommu_type1
vfio_pci
vfio_virqfd
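(lsmod confirms they are in after boot:

  # lsmod | grep vfio    # expect vfio, vfio_iommu_type1, vfio_pci, vfio_virqfd
)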
... Here's the Q35 config file ...
# sed -e 's/#.*$//' -e '/^$/d' /usr/share/qemu-server/pve-q35.cfg
[device "ehci"]
driver = "ich9-usb-ehci1"
multifunction = "on"
bus = "pcie.0"
addr = "1d.7"
[device "uhci-1"]
driver = "ich9-usb-uhci1"
multifunction = "on"
bus = "pcie.0"
addr = "1d.0"
masterbus = "ehci.0"
firstport = "0"
[device "uhci-2"]
driver = "ich9-usb-uhci2"
multifunction = "on"
bus = "pcie.0"
addr = "1d.1"
masterbus = "ehci.0"
firstport = "2"
[device "uhci-3"]
driver = "ich9-usb-uhci3"
multifunction = "on"
bus = "pcie.0"
addr = "1d.2"
masterbus = "ehci.0"
firstport = "4"
[device "ehci-2"]
driver = "ich9-usb-ehci2"
multifunction = "on"
bus = "pcie.0"
addr = "1a.7"
[device "uhci-4"]
driver = "ich9-usb-uhci4"
multifunction = "on"
bus = "pcie.0"
addr = "1a.0"
masterbus = "ehci-2.0"
firstport = "0"
[device "uhci-5"]
driver = "ich9-usb-uhci5"
multifunction = "on"
bus = "pcie.0"
addr = "1a.1"
masterbus = "ehci-2.0"
firstport = "2"
[device "uhci-6"]
driver = "ich9-usb-uhci6"
multifunction = "on"
bus = "pcie.0"
addr = "1a.2"
masterbus = "ehci-2.0"
firstport = "4"
[device "audio0"]
driver = "ich9-intel-hda"
bus = "pcie.0"
addr = "1b.0"
[device "ich9-pcie-port-1"]
driver = "ioh3420"
multifunction = "on"
bus = "pcie.0"
addr = "1c.0"
port = "1"
chassis = "1"
[device "ich9-pcie-port-2"]
driver = "ioh3420"
multifunction = "on"
bus = "pcie.0"
addr = "1c.1"
port = "2"
chassis = "2"
[device "ich9-pcie-port-3"]
driver = "ioh3420"
multifunction = "on"
bus = "pcie.0"
addr = "1c.2"
port = "3"
chassis = "3"
[device "ich9-pcie-port-4"]
driver = "ioh3420"
multifunction = "on"
bus = "pcie.0"
addr = "1c.3"
port = "4"
chassis = "4"
[device "pcidmi"]
driver = "i82801b11-bridge"
bus = "pcie.0"
addr = "1e.0"
[device "pci.0"]
driver = "pci-bridge"
bus = "pcidmi"
addr = "1.0"
chassis_nr = "1"
[device "pci.1"]
driver = "pci-bridge"
bus = "pcidmi"
addr = "2.0"
chassis_nr = "2"
[device "pci.2"]
driver = "pci-bridge"
bus = "pcidmi"
addr = "3.0"
chassis_nr = "3"
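For what it's worth, the IOMMU grouping of the two cards can be listed
straight from sysfs with a small loop like this (nothing Proxmox-specific):

for d in /sys/kernel/iommu_groups/*/devices/*; do
    g=${d#/sys/kernel/iommu_groups/}       # strip the sysfs prefix
    echo "group ${g%%/*}: $(basename "$d")"
done | sort -V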
... and plenty of CPU ...
# cat /proc/cpuinfo | grep -A 4 "processor.*: 11"
processor : 11
vendor_id : GenuineIntel
cpu family : 6
model : 44
model name : Intel(R) Xeon(R) CPU X5670 @ 2.93GHz
If anyone has any suggestions, I would greatly appreciate it.