qemu-ppc
Re: [Qemu-ppc] qemu-system-ppc64 hanging occasionally in disk writes


From: Alexander Graf
Subject: Re: [Qemu-ppc] qemu-system-ppc64 hanging occasionally in disk writes
Date: Thu, 14 Jun 2012 23:34:37 +0200


On 14.06.2012, at 19:13, "Richard W.M. Jones" <address@hidden> wrote:

> On Thu, Jun 14, 2012 at 05:58:04PM +0200, Alexander Graf wrote:
>> [CC'ing qemu-ppc]
>> 
>> On 06/14/2012 05:52 PM, Richard W.M. Jones wrote:
>>> I found last week that qemu-system-ppc64 (from git) hangs occasionally
>>> under load, and I have a reproducer for it now.  Unfortunately the
>>> reproducer really takes a long time to run -- usually I can get a hang
>>> in under 12 hours.
>>> 
>>> Here is the reproducer case:
>>> 
>>>  https://lists.fedoraproject.org/pipermail/ppc/2012-June/001698.html
>>> 
>>> Notes:
>>> 
>>> (1) Verified by one other person (other than me).  Happens on both
>>>    ppc64 and x86-64 host.
>>> 
>>> (2) Happens with both Fedora guest kernel 3.3.4-5.fc17.ppc64 and kernel
>>>    3.5.0 that I compiled myself.  The test case above contains 3.3.4-5.
>>> 
>>> (3) Seems to be a problem in qemu, not the guest.  The reason I think
>>>    this is because I tried to capture a backtrace of the hang using
>>>    remote gdb, but gdb just hung when trying to connect to qemu
>>>    (gdb connects fine before the bug happens).
>>> 
>>> (4) Judging by guest messages, appears to be happening when writing
>>>    to the disk.
>> 
>> Can you please try to see if you can reproduce this using vscsi /
>> vio instead of virtio? I couldn't quite see why vio would be any
>> more stable than virtio though ...
> 
> I just tried virtio-scsi, but only the first disk shows up.  I added
> two disks.  See below for detailed logs.  This works fine on x86-64.
> Should I file a separate bug for this?
> 
>> Also, could you please try and see if it works reliably using KVM?
>> Maybe we're just encountering some TCG breakage here.
> 
> I will try this, but as discussed on IRC last week there's some
> problem with the Fedora host kernel where /dev/kvm doesn't show up,
> even though the kernel is supposedly compiled with KVM PR enabled.  So
> I need to fix that first.
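
Before retesting with KVM, it may help to confirm what state the device node is in. The following is a minimal sketch, not a diagnosis of the Fedora kernel problem: `kvm_status` is a hypothetical helper name, and it only inspects the node (default `/dev/kvm`) without checking kernel configuration or modules.

```shell
#!/bin/sh
# kvm_status [PATH]: report the state of the KVM device node.
# PATH defaults to /dev/kvm; the helper name is illustrative only.
kvm_status() {
    dev=${1:-/dev/kvm}
    if [ ! -e "$dev" ]; then
        echo "missing"          # node does not exist at all
    elif [ ! -c "$dev" ]; then
        echo "not-a-device"     # exists, but is not a character device
    elif [ -r "$dev" ] && [ -w "$dev" ]; then
        echo "usable"           # current user can open it read/write
    else
        echo "no-permission"    # node exists, but access is denied
    fi
}

kvm_status
```

Note that the command line in this report passes `-machine accel=kvm:tcg`, which asks for KVM with TCG as the fallback, so a "missing" result here would be worth knowing before interpreting any KVM test run.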
> 
> Rich.
> 
> virtio scsi on ppc64
> --------------------
> 
> qemu command line:
> 
> /home/rjones/d/qemu/ppc64-softmmu/qemu-system-ppc64 \
>    -global virtio-blk-pci.scsi=off \
>    -nodefconfig \
>    -nodefaults \
>    -nographic \
>    -device virtio-scsi-pci,id=scsi \
>    -drive file=test1.img,cache=off,format=raw,id=hd0,if=none \
>    -device scsi-hd,drive=hd0 \

Don't you have to specify bus= too?
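
Something along these lines, as a rough sketch only. It assumes the controller created with `id=scsi` exposes a bus named `scsi.0`, and the helper name `scsi_disk_args` and the image file names are placeholders rather than anything from your setup. It only prints the arguments so they can be reviewed, and whether this makes the second disk appear is untested:

```shell
#!/bin/sh
# scsi_disk_args IMG1 IMG2: print the device/drive arguments, one per line,
# with both scsi-hd devices attached explicitly to the virtio-scsi bus.
# "scsi.0" assumes the bus is named after the controller id ("scsi") plus ".0".
scsi_disk_args() {
    printf '%s\n' \
        "-device virtio-scsi-pci,id=scsi" \
        "-drive file=$1,cache=off,format=raw,id=hd0,if=none" \
        "-device scsi-hd,drive=hd0,bus=scsi.0" \
        "-drive file=$2,snapshot=on,id=appliance,if=none,cache=unsafe" \
        "-device scsi-hd,drive=appliance,bus=scsi.0"
}

# Placeholder image names; substitute the real paths before use.
scsi_disk_args test1.img root.img
```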

Alex

>    -drive file=/home/rjones/d/libguestfs/.guestfs-1000/root.26645,snapshot=on,id=appliance,if=none,cache=unsafe \
>    -device scsi-hd,drive=appliance \
>    -M pseries \
>    -enable-kvm \
>    -machine accel=kvm:tcg \
>    -m 500 \
>    -no-reboot \
>    -device virtio-serial \
>    -serial stdio \
>    -chardev socket,path=/home/rjones/d/libguestfs/libguestfscoRCTO/guestfsd.sock,id=channel0 \
>    -device virtserialport,chardev=channel0,name=org.libguestfs.channel.0 \
>    -kernel /home/rjones/d/libguestfs/.guestfs-1000/kernel.26645 \
>    -initrd /home/rjones/d/libguestfs/.guestfs-1000/initrd.26645 \
>    -append 'panic=1 console=ttyS0 udevtimeout=600 no_timer_check acpi=off printk.time=1 cgroup_disable=memory root=/dev/sdb selinux=0 guestfs_verbose=1 TERM=screen '
> 
> guest kernel output:
> 
>  Welcome to Open Firmware
> 
>  Copyright (c) 2004, 2011 IBM Corporation All rights reserved.
>  This program and the accompanying materials are made available
>  under the terms of the BSD License available at
>  http://www.opensource.org/licenses/bsd-license.php
> 
> Booting from memory...
> OF stdout device is: /vdevice/address@hidden
> Preparing to boot Linux version 3.3.4-5.fc17.ppc64 (address@hidden) (gcc 
> version 4.7.0 20120504 (Red Hat 4.7.0-4) (GCC) ) #1 SMP Mon May 14 10:18:37 
> MST 2012
> Detected machine type: 0000000000000101
> Max number of cores passed to firmware: 1024 (NR_CPUS = 1024)
> Calling ibm,client-architecture-support... not implemented
> couldn't open /packages/elf-loader
> command line: panic=1 console=ttyS0 udevtimeout=600 no_timer_check acpi=off 
> printk.time=1 cgroup_disable=memory root=/dev/sdb selinux=0 guestfs_verbose=1 
> TERM=screen 
> memory layout at init:
>  memory_limit : 0000000000000000 (16 MB aligned)
>  alloc_bottom : 0000000001a50000
>  alloc_top    : 000000001f400000
>  alloc_top_hi : 000000001f400000
>  rmo_top      : 000000001f400000
>  ram_top      : 000000001f400000
> instantiating rtas at 0x000000001cff0000... done
> Querying for OPAL presence... not there.
> boot cpu hw idx 0
> copying OF device tree...
> Building dt strings...
> Building dt structure...
> Device tree strings 0x0000000001c60000 -> 0x0000000001c605e0
> Device tree struct  0x0000000001c70000 -> 0x0000000001c80000
> Calling quiesce...
> returning from prom_init
> [    0.000000] Phyp-dump not supported on this hardware
> [    0.000000] Using pSeries machine description
> [    0.000000] Using 1TB segments
> [    0.000000] Found initrd at 0xc000000001a50000:0xc000000001b7c400
> [    0.000000] bootconsole [udbg0] enabled
> [    0.000000] CPU maps initialized for 1 thread per core
> [    0.000000] Starting Linux PPC64 #1 SMP Mon May 14 10:18:37 MST 2012
> [    0.000000] -----------------------------------------------------
> [    0.000000] ppc64_pft_size                = 0x18
> [    0.000000] physicalMemorySize            = 0x1f400000
> [    0.000000] htab_hash_mask                = 0x1ffff
> [    0.000000] -----------------------------------------------------
> [    0.000000] Initializing cgroup subsys cpuset
> [    0.000000] Initializing cgroup subsys cpu
> [    0.000000] Linux version 3.3.4-5.fc17.ppc64 (address@hidden) (gcc version 
> 4.7.0 20120504 (Red Hat 4.7.0-4) (GCC) ) #1 SMP Mon May 14 10:18:37 MST 2012
> 
> CF000012
> Setup Arch[    0.000000] [boot]0012 Setup Arch
> [    0.000000] PCI host bridge /address@hidden,0  ranges:
> [    0.000000]   IO 0x0000010080000000..0x000001008000ffff -> 
> 0x0000000000000000
> [    0.000000]  MEM 0x00000100a0000000..0x00000100bfffffff -> 
> 0x0000000080000000 
> [    0.000000] Zone PFN ranges:
> [    0.000000]   DMA      0x00000000 -> 0x00001f40
> [    0.000000]   Normal   empty
> [    0.000000] Movable zone start PFN for each node
> [    0.000000] Early memory PFN ranges
> [    0.000000]     0: 0x00000000 -> 0x00001f40
> 
> CF000015
> Setup Done[    0.000000] [boot]0015 Setup Done
> [    0.000000] PERCPU: Embedded 2 pages/cpu @c000000001d00000 s84608 r0 
> d46464 u1048576
> [    0.000000] Built 1 zonelists in Node order, mobility grouping on.  Total 
> pages: 7993
> [    0.000000] Policy zone: DMA
> [    0.000000] Kernel command line: panic=1 console=ttyS0 udevtimeout=600 
> no_timer_check acpi=off printk.time=1 cgroup_disable=memory root=/dev/sdb 
> selinux=0 guestfs_verbose=1 TERM=screen 
> [    0.000000] Disabling memory control group subsystem
> [    0.000000] PID hash table entries: 2048 (order: -2, 16384 bytes)
> [    0.000000] freeing bootmem node 0
> [    0.000000] Memory: 486336k/512000k available (17920k kernel code, 25664k 
> reserved, 1856k data, 2952k bss, 6656k init)
> [    0.000000] SLUB: Genslabs=19, HWalign=128, Order=0-3, MinObjects=0, 
> CPUs=1, Nodes=256
> [    0.000000] Hierarchical RCU implementation.
> [    0.000000] NR_IRQS:512 nr_irqs:512 16
> [    0.000000] clocksource: timebase mult[1f40000] shift[24] registered
> [    0.000000] Console: colour dummy device 80x25
> [    0.000000] Phyp-dump not supported on this hardware
> [    0.000000] Using pSeries machine description
> [    0.000000] Using 1TB segments
> [    0.000000] Found initrd at 0xc000000001a50000:0xc000000001b7c400
> [    0.000000] bootconsole [udbg0] enabled
> [    0.000000] CPU maps initialized for 1 thread per core
> [    0.000000] Starting Linux PPC64 #1 SMP Mon May 14 10:18:37 MST 2012
> [    0.000000] -----------------------------------------------------
> [    0.000000] ppc64_pft_size                = 0x18
> [    0.000000] physicalMemorySize            = 0x1f400000
> [    0.000000] htab_hash_mask                = 0x1ffff
> [    0.000000] -----------------------------------------------------
> [    0.000000] Initializing cgroup subsys cpuset
> [    0.000000] Initializing cgroup subsys cpu
> [    0.000000] Linux version 3.3.4-5.fc17.ppc64 (address@hidden) (gcc version 
> 4.7.0 20120504 (Red Hat 4.7.0-4) (GCC) ) #1 SMP Mon May 14 10:18:37 MST 2012
> [    0.000000] [boot]0012 Setup Arch
> [    0.000000] PCI host bridge /address@hidden,0  ranges:
> [    0.000000]   IO 0x0000010080000000..0x000001008000ffff -> 
> 0x0000000000000000
> [    0.000000]  MEM 0x00000100a0000000..0x00000100bfffffff -> 
> 0x0000000080000000 
> [    0.000000] Zone PFN ranges:
> [    0.000000]   DMA      0x00000000 -> 0x00001f40
> [    0.000000]   Normal   empty
> [    0.000000] Movable zone start PFN for each node
> [    0.000000] Early memory PFN ranges
> [    0.000000]     0: 0x00000000 -> 0x00001f40
> [    0.000000] [boot]0015 Setup Done
> [    0.000000] PERCPU: Embedded 2 pages/cpu @c000000001d00000 s84608 r0 
> d46464 u1048576
> [    0.000000] Built 1 zonelists in Node order, mobility grouping on.  Total 
> pages: 7993
> [    0.000000] Policy zone: DMA
> [    0.000000] Kernel command line: panic=1 console=ttyS0 udevtimeout=600 
> no_timer_check acpi=off printk.time=1 cgroup_disable=memory root=/dev/sdb 
> selinux=0 guestfs_verbose=1 TERM=screen 
> [    0.000000] Disabling memory control group subsystem
> [    0.000000] PID hash table entries: 2048 (order: -2, 16384 bytes)
> [    0.000000] freeing bootmem node 0
> [    0.000000] Memory: 486336k/512000k available (17920k kernel code, 25664k 
> reserved, 1856k data, 2952k bss, 6656k init)
> [    0.000000] SLUB: Genslabs=19, HWalign=128, Order=0-3, MinObjects=0, 
> CPUs=1, Nodes=256
> [    0.000000] Hierarchical RCU implementation.
> [    0.000000] NR_IRQS:512 nr_irqs:512 16
> [    0.000000] clocksource: timebase mult[1f40000] shift[24] registered
> [    0.000000] Console: colour dummy device 80x25
> [    0.000000] console [hvc0] enabled
> [    0.000000] console [hvc0] enabled
> [    0.041700] pid_max: default: 32768 minimum: 301
> [    0.041700] pid_max: default: 32768 minimum: 301
> [    0.048107] Security Framework initialized
> [    0.048107] Security Framework initialized
> [    0.067154] SELinux:  Disabled at boot.
> [    0.067154] SELinux:  Disabled at boot.
> [    0.084262] Dentry cache hash table entries: 65536 (order: 3, 524288 bytes)
> [    0.084262] Dentry cache hash table entries: 65536 (order: 3, 524288 bytes)
> [    0.099618] Inode-cache hash table entries: 32768 (order: 2, 262144 bytes)
> [    0.099618] Inode-cache hash table entries: 32768 (order: 2, 262144 bytes)
> [    0.107083] Mount-cache hash table entries: 4096
> [    0.107083] Mount-cache hash table entries: 4096
> [    0.155933] Initializing cgroup subsys cpuacct
> [    0.155933] Initializing cgroup subsys cpuacct
> [    0.156562] Initializing cgroup subsys memory
> [    0.156562] Initializing cgroup subsys memory
> [    0.161423] Initializing cgroup subsys devices
> [    0.161423] Initializing cgroup subsys devices
> [    0.162250] Initializing cgroup subsys freezer
> [    0.162250] Initializing cgroup subsys freezer
> [    0.162992] Initializing cgroup subsys net_cls
> [    0.162992] Initializing cgroup subsys net_cls
> [    0.163913] Initializing cgroup subsys blkio
> [    0.163913] Initializing cgroup subsys blkio
> [    0.164843] Initializing cgroup subsys perf_event
> [    0.164843] Initializing cgroup subsys perf_event
> [    0.169308] ftrace: allocating 21118 entries in 8 pages
> [    0.169308] ftrace: allocating 21118 entries in 8 pages
> [    0.439808] POWER7 performance monitor hardware support registered
> [    0.439808] POWER7 performance monitor hardware support registered
> [    0.476013] Brought up 1 CPUs
> [    0.476013] Brought up 1 CPUs
> [    0.481103] Enabling Asymmetric SMT scheduling
> [    0.481103] Enabling Asymmetric SMT scheduling
> [    0.552049] devtmpfs: initialized
> [    0.552049] devtmpfs: initialized
> [    0.673170] atomic64 test passed
> [    0.673170] atomic64 test passed
> [    0.680501] NET: Registered protocol family 16
> [    0.680501] NET: Registered protocol family 16
> [    0.686950] IBM eBus Device Driver
> [    0.686950] IBM eBus Device Driver
> [    0.713306] nvram: No room to create ibm,rtas-log partition, deleting any 
> obsolete OS partitions...
> [    0.713306] nvram: No room to create ibm,rtas-log partition, deleting any 
> obsolete OS partitions...
> [    0.714363] nvram: Failed to find or create ibm,rtas-log partition, err -28
> [    0.714363] nvram: Failed to find or create ibm,rtas-log partition, err -28
> [    0.715042] nvram: No room to create lnx,oops-log partition, deleting any 
> obsolete OS partitions...
> [    0.715042] nvram: No room to create lnx,oops-log partition, deleting any 
> obsolete OS partitions...
> [    0.715559] nvram: Failed to find or create lnx,oops-log partition, err -28
> [    0.715559] nvram: Failed to find or create lnx,oops-log partition, err -28
> 
> Linux ppc64
> #1 SMP Mon May 1[    0.720031] CPU Hotplug not supported by firmware - 
> disabling.
> [    0.720031] CPU Hotplug not supported by firmware - disabling.
> [    0.740887] PCI: Probing PCI hardware
> [    0.740887] PCI: Probing PCI hardware
> [    0.749913] PCI host bridge to bus 0000:00
> [    0.749913] PCI host bridge to bus 0000:00
> [    0.751921] pci_bus 0000:00: root bus resource [io  0x10000-0x1ffff]
> [    0.751921] pci_bus 0000:00: root bus resource [io  0x10000-0x1ffff]
> [    0.752932] pci_bus 0000:00: root bus resource [mem 
> 0x100a0000000-0x100bfffffff]
> [    0.752932] pci_bus 0000:00: root bus resource [mem 
> 0x100a0000000-0x100bfffffff]
> [    0.765676] pci_dma_dev_setup_pSeriesLP: no DMA window found for pci 
> dev=0000:00:00.0 dn=/address@hidden,0/address@hidden
> [    0.765676] pci_dma_dev_setup_pSeriesLP: no DMA window found for pci 
> dev=0000:00:00.0 dn=/address@hidden,0/address@hidden
> [    0.773227] pci_dma_dev_setup_pSeriesLP: no DMA window found for pci 
> dev=0000:00:01.0 dn=/address@hidden,0/address@hidden
> [    0.773227] pci_dma_dev_setup_pSeriesLP: no DMA window found for pci 
> dev=0000:00:01.0 dn=/address@hidden,0/address@hidden
> [    0.787177] opal: Node not found
> [    0.787177] opal: Node not found
> [    0.831635] bio: create slab <bio-0> at 0
> [    0.831635] bio: create slab <bio-0> at 0
> [    0.854552] vgaarb: loaded
> [    0.854552] vgaarb: loaded
> [    0.861796] SCSI subsystem initialized
> [    0.861796] SCSI subsystem initialized
> [    0.873008] usbcore: registered new interface driver usbfs
> [    0.873008] usbcore: registered new interface driver usbfs
> [    0.874925] usbcore: registered new interface driver hub
> [    0.874925] usbcore: registered new interface driver hub
> [    0.877584] usbcore: registered new device driver usb
> [    0.877584] usbcore: registered new device driver usb
> [    0.915016] NetLabel: Initializing
> [    0.915016] NetLabel: Initializing
> [    0.915419] NetLabel:  domain hash size = 128
> [    0.915419] NetLabel:  domain hash size = 128
> [    0.915688] NetLabel:  protocols = UNLABELED CIPSOv4
> [    0.915688] NetLabel:  protocols = UNLABELED CIPSOv4
> [    0.921383] NetLabel:  unlabeled traffic allowed by default
> [    0.921383] NetLabel:  unlabeled traffic allowed by default
> [    0.923702] Switching to clocksource timebase
> [    0.923702] Switching to clocksource timebase
> [    1.354987] NET: Registered protocol family 2
> [    1.354987] NET: Registered protocol family 2
> [    1.366159] IP route cache hash table entries: 8192 (order: 0, 65536 bytes)
> [    1.366159] IP route cache hash table entries: 8192 (order: 0, 65536 bytes)
> [    1.385317] TCP established hash table entries: 16384 (order: 2, 262144 
> bytes)
> [


