[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
[Qemu-devel] Broken aarch64 by qcow2: skip writing zero buffers to empty
From: |
Lukáš Doktor |
Subject: |
[Qemu-devel] Broken aarch64 by qcow2: skip writing zero buffers to empty COW areas |
Date: |
Wed, 21 Aug 2019 13:08:34 +0200 |
User-agent: |
Mozilla/5.0 (X11; Linux x86_64; rv:60.0) Gecko/20100101 Thunderbird/60.8.0 |
Hello guys,
I bisected a regression on aarch64 all the way to this commit: "qcow2: skip
writing zero buffers to empty COW areas"
c8bb23cbdbe32f5c326365e0a82e1b0e68cdcd8a. Would you please have a look at it?
My reproducer is running kickstart installation of RHEL-8 from DVD on aarch64
gicv3 machine, which never finishes since this commit, where anaconda complains
about package installation, occasionally there are also XFS metadata corruption
messages on serial console:
2019-08-18 19:36:11: [22;1HInstalling check.aarch64 (359/1180)[1;24r[23;1H[
439.335155] XFS (dm-3): Metadata corruption detected at
xfs_inode_buf_verify+0x174/0x190 [xfs], xfs_inode block 0xb1d960
xfs_inode_buf_verify
2019-08-18 19:36:11: [ 439.345295] XFS (dm-3): Unmount and run xfs_repair
2019-08-18 19:36:11: [ 439.349167] XFS (dm-3): First 128 bytes of corrupted
metadata buffer:
2019-08-18 19:36:11: [1;23r[23;80H
2019-08-18 19:36:11: [22;1HInstalling libidn.aarch64 (360/1180)[1;24r[23;1H[
439.354063] 00000000: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
................
2019-08-18 19:36:11: [ 439.365513] 00000010: 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 ................
2019-08-18 19:36:11: [ 439.371929] 00000020: 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 ................
2019-08-18 19:36:11: [ 439.378175] 00000030: 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 ................
2019-08-18 19:36:11: [ 439.384242] 00000040: 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 ................
2019-08-18 19:36:11: [ 439.390603] 00000050: 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 ................
2019-08-18 19:36:11: [ 439.396784] 00000060: 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 ................
2019-08-18 19:36:11: [ 439.402838] 00000070: 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 ................
2019-08-18 19:36:11: [ 439.409314] XFS (dm-3): metadata I/O error in
"xfs_trans_read_buf_map" at daddr 0xb1d960 len 32 error 117
2019-08-18 19:36:11: [ 439.416866] XFS (dm-3): xfs_imap_to_bp:
xfs_trans_read_buf() returned error -117.
2019-08-18 19:36:12: [ 440.341560] systemd-journald[1550]: Journal effective
settings seal=yes compress=yes compress_threshold_bytes=512B
2019-08-18 19:36:12:
================================================================================[23;80H[1;23r[23;80H
2019-08-18 19:36:12:
2019-08-18 19:36:12:
[22;1H================================================================================[23;1HError[1;24r[23;6H[
440.371152] systemd-journald[1550]: Journal effective settings seal=yes
compress=yes compress_threshold_bytes=512B
2019-08-18 19:36:12: [ 440.381579] systemd-journald[1550]: Journal effective
settings seal=yes compress=yes compress_threshold_bytes=512B
2019-08-18 19:36:12: [1;23r[23;80H
2019-08-18 19:36:12:
2019-08-18 19:36:12:
2019-08-18 19:36:12:
2019-08-18 19:36:12:
2019-08-18 19:36:12:
2019-08-18 19:36:12:
2019-08-18 19:36:12: [18;1H The following error occurred while installing.
This is a fatal error and
2019-08-18 19:36:12: installation will be aborted.
2019-08-18 19:36:12:
2019-08-18 19:36:12: DNF error: Error unpacking rpm package
libidn-1.34-5.el8.aarch64[23;1HPress ENTER to exit: [1;24r[23;22H[
440.401675] systemd-journald[1550]: Journal effective settings seal=yes
compress=yes compress_threshold_bytes=512B
It reproduces 100%, usually at different stage (different RPM package). With
Avocado-vt you can reproduce it by putting RHEL-8 DVD image to RHEL.7.devel iso
location and executing something like this:
avocado --show all run --vt-guest-os RHEL.7.devel --vt-extra-params
store_vm_register=no take_regular_screendumps=no monitor_type=human
--vt-qemu-bin /home/jenkins/aarch64/qemu-master-build/bin/qemu-system-aarch64
-- unattended_install.cdrom.extra_cdrom_ks.default_install.aio_threads
Note: I'm still using RHEL.7.devel profile although the provided DVD is RHEL8,
it doesn't really matter
Note: The "--vt-extra-params" are workarounds of current issue where frequent
use of QMP monitor results in bad things on aarch64
But it should reproduce even without Avocado-vt. The used qemu cmd was:
MALLOC_PERTURB_=1
/home/jenkins/aarch64/qemu-master-build/bin/qemu-system-aarch64 \
-S \
-name 'avocado-vt-vm1' \
-sandbox off \
-drive
file=/usr/share/AAVMF/AAVMF_CODE.fd,if=pflash,format=raw,unit=0,readonly=on \
-drive
file=/var/lib/libvirt/images/avocado/avocado-vt/images/rhel7devel-aarch64_AAVMF_VARS.fd,if=pflash,format=raw,unit=1
\
-machine virt,gic-version=host \
-nodefaults \
-device virtio-gpu-pci,bus=pcie.0,addr=0x1 \
-chardev
socket,id=hmp_id_qmpmonitor1,path=/var/tmp/avocado_kurmsdn9/monitor-qmpmonitor1-20190821-065917-I0T8uXz4,server,nowait
\
-mon chardev=hmp_id_qmpmonitor1,mode=readline \
-chardev
socket,id=hmp_id_catch_monitor,path=/var/tmp/avocado_kurmsdn9/monitor-catch_monitor-20190821-065917-I0T8uXz4,server,nowait
\
-mon chardev=hmp_id_catch_monitor,mode=readline \
-serial
unix:'/var/tmp/avocado_kurmsdn9/serial-serial0-20190821-065917-I0T8uXz4',server,nowait
\
-device
pcie-root-port,id=pcie.0-root-port-2,slot=2,chassis=2,addr=0x2,bus=pcie.0 \
-device qemu-xhci,id=usb1,bus=pcie.0-root-port-2,addr=0x0 \
-device
pcie-root-port,id=pcie.0-root-port-3,slot=3,chassis=3,addr=0x3,bus=pcie.0 \
-device virtio-scsi-pci,id=virtio_scsi_pci0,bus=pcie.0-root-port-3,addr=0x0
\
-drive
id=drive_image1,if=none,format=qcow2,file=/var/lib/libvirt/images/avocado/avocado-vt/images/rhel7devel-aarch64.qcow2
\
-device scsi-hd,id=image1,drive=drive_image1 \
-device
pcie-root-port,id=pcie.0-root-port-4,slot=4,chassis=4,addr=0x4,bus=pcie.0 \
-device
virtio-net-pci,mac=9a:96:87:89:70:c0,rombar=0,id=idfUqVrx,netdev=idNvVkD7,bus=pcie.0-root-port-4,addr=0x0
\
-netdev tap,id=idNvVkD7,fd=20 \
-m 4096 \
-smp 2,maxcpus=2,cores=1,threads=1,sockets=2 \
-cpu 'host' \
-drive
id=drive_cd1,if=none,media=cdrom,file=/home/isos/RHEL-8.1.0-20190820.n.0-aarch64-dvd1.iso
\
-device scsi-cd,id=cd1,drive=drive_cd1 \
-drive
id=drive_unattended,if=none,media=cdrom,file=/var/lib/libvirt/images/avocado/avocado-vt/images/rhel7-64/ks.iso
\
-device scsi-cd,id=unattended,drive=drive_unattended \
-device usb-tablet,id=usb-tablet1,bus=usb1.0,port=1 \
-kernel
'/var/lib/libvirt/images/avocado/avocado-vt/images/rhel7-aarch64/vmlinuz' \
-append 'ks=cdrom inst.sshd nicdelay=60 earlyprintk=pl011,0x9000000
console=ttyAMA0 debug ignore_loglevel rootwait' \
-initrd
'/var/lib/libvirt/images/avocado/avocado-vt/images/rhel7-aarch64/initrd.img' \
-vnc :0 \
-rtc base=utc,clock=host \
-enable-kvm \
-device
pcie-root-port,id=pcie_extra_root_port_0,slot=5,chassis=5,addr=0x5,bus=pcie.0 \
-device
pcie-root-port,id=pcie_extra_root_port_1,slot=6,chassis=6,addr=0x6,bus=pcie.0
But you should find all the required information in the attached file. It's
worth mentioning that I haven't seen this issue on other architectures (s390x,
ppc64, ppc64le) and reproducibility is 100%.
Regards,
Lukáš
PS: During bisecting the metadata corruption was usually not there, so
attaching excerpt of serial console obtained with the commit
c8bb23cbdbe32f5c326365e0a82e1b0e68cdcd8a:
2019-08-21 07:04:57: ^[[22;1HInstalling kernel-devel.aarch64
(311/1190)^[[1;24r^[[23;1H
2019-08-21 07:05:23: [ 346.780872] systemd-journald[1558]: Journal effective
settings seal=yes compress=yes compress_threshold_bytes=512B
2019-08-21 07:05:23: [ 346.788877] systemd-journald[1558]: Journal effective
settings seal=yes compress=yes compress_threshold_bytes=512B
2019-08-21 07:05:23: [ 346.796951] systemd-journald[1558]: Journal effective
settings seal=yes compress=yes compress_threshold_bytes=512B
2019-08-21 07:05:23: [ 346.805247] systemd-journald[1558]: Journal effective
settings seal=yes compress=yes compress_threshold_bytes=512B
2019-08-21 07:05:49: [ 372.250148] systemd-journald[1558]: Journal effective
settings seal=yes compress=yes compress_threshold_bytes=512B
2019-08-21 07:05:49: [ 372.258148] systemd-journald[1558]: Journal effective
settings seal=yes compress=yes compress_threshold_bytes=512B
2019-08-21 07:05:49: [ 372.270263] systemd-journald[1558]: Journal effective
settings seal=yes compress=yes compress_threshold_bytes=512B
2019-08-21 07:05:49: [ 372.287159] systemd-journald[1558]: Journal effective
settings seal=yes compress=yes compress_threshold_bytes=512B
2019-08-21 07:05:49: [ 372.295778] systemd-journald[1558]: Journal effective
settings seal=yes compress=yes compress_threshold_bytes=512B
2019-08-21 07:05:49: [ 372.304047] systemd-journald[1558]: Journal effective
settings seal=yes compress=yes compress_threshold_bytes=512B
2019-08-21 07:05:49:
================================================================================^[[23;80H[
372.318162] systemd-journald[1558]: Journal effective settings seal=yes
compress=yes compress_threshold_bytes=512B
2019-08-21 07:05:49: [ 372.332918] systemd-journald[1558]: Journal effective
settings seal=yes compress=yes compress_threshold_bytes=512B
2019-08-21 07:05:49: ^[[1;23r^[[23;80H
2019-08-21 07:05:49:
2019-08-21 07:05:49:
2019-08-21 07:05:49:
2019-08-21 07:05:49:
2019-08-21 07:05:49:
2019-08-21 07:05:49:
2019-08-21 07:05:49:
2019-08-21 07:05:49:
2019-08-21 07:05:49:
^[[15;1H================================================================================^[[16;1HError^[[18;1H
The following error occurred while installing. This is a fatal error and
2019-08-21 07:05:49: installation will be aborted.
2019-08-21 07:05:49:
2019-08-21 07:05:49: DNF error: Error in POSTIN scriptlet in rpm package
kernel-devel^[[1;24r^[[23;1H[ 372.341178] systemd-journald[1558]: Journal
effective settings seal=yes compress=yes compress_threshold_bytes=512B
2019-08-21 07:05:49: [ 372.371747] systemd-journald[1558]: Journal effective
settings seal=yes compress=yes compress_threshold_bytes=512B
2019-08-21 07:05:49: Press ENTER to exit:
2019-08-21 07:05:53: [ 377.143065] systemd-coredump[2890]: Failed to get COMM,
falling back to the command line: No such process
2019-08-21 07:05:53: [ 377.150559] systemd-coredump[2890]: Failed to get EXE,
ignoring: No such process
2019-08-21 07:05:53: [ 377.200294] systemd-coredump[2892]: Failed to get COMM,
falling back to the command line: No such process
2019-08-21 07:05:53: [ 377.207611] systemd-coredump[2892]: Failed to get EXE,
ignoring: No such process
2019-08-21 07:05:54: [ 377.259821] systemd-journald[1558]: Journal effective
settings seal=yes compress=yes compress_threshold_bytes=512B
2019-08-21 07:05:54: [ 377.267769] systemd-journald[1558]: Journal effective
settings seal=yes compress=yes compress_threshold_bytes=512B
2019-08-21 07:05:54: [ 377.278325] systemd-journald[1558]: Journal effective
settings seal=yes compress=yes compress_threshold_bytes=512B
2019-08-21 07:05:54: [ 377.286222] systemd-journald[1558]: Journal effective
settings seal=yes compress=yes compress_threshold_bytes=512B
2019-08-21 07:05:54: [ 377.297440] systemd-journald[1558]: Journal effective
settings seal=yes compress=yes compress_threshold_bytes=512B
2019-08-21 07:05:54: [ 377.307166] systemd-journald[1558]: Journal effective
settings seal=yes compress=yes compress_threshold_bytes=512B
2019-08-21 07:05:54: [ 377.318449] systemd-journald[1558]: Journal effective
settings seal=yes compress=yes compress_threshold_bytes=512B
2019-08-21 07:05:54: [ 377.326360] systemd-journald[1558]: Journal effective
settings seal=yes compress=yes compress_threshold_bytes=512B
2019-08-21 07:05:54: [ 377.337852] systemd-journald[1558]: Journal effective
settings seal=yes compress=yes compress_threshold_bytes=512B
2019-08-21 07:05:54: [ 377.346173] systemd-journald[1558]: Journal effective
settings seal=yes compress=yes compress_threshold_bytes=512B
2019-08-21 07:05:56: [ 379.428425] systemd-coredump[2909]: Failed to get COMM,
falling back to the command line: No such process
2019-08-21 07:05:56: [ 379.435753] systemd-coredump[2909]: Failed to get EXE,
ignoring: No such process
2019-08-21 07:05:57: [ 380.880584] systemd-coredump[2919]: Failed to get COMM,
falling back to the command line: No such process
2019-08-21 07:05:57: [ 380.887926] systemd-coredump[2919]: Failed to get EXE,
ignoring: No such process
2019-08-21 07:05:58: [ 381.526076] systemd-coredump[2922]: Failed to get COMM,
falling back to the command line: No such process
2019-08-21 07:05:58: [ 381.533709] systemd-coredump[2922]: Failed to get EXE,
ignoring: No such process
094-functional_git_RHEL-8_dvd io-github-autotest-qemu.unattended_install.cdrom.extra_cdrom_ks.default_install.aio_threads_-e630.zip
Description: Zip archive
signature.asc
Description: OpenPGP digital signature
[Prev in Thread] |
Current Thread |
[Next in Thread] |
- [Qemu-devel] Broken aarch64 by qcow2: skip writing zero buffers to empty COW areas,
Lukáš Doktor <=