From: | Jacob Godin |
Subject: | Re: [Qemu-discuss] Disk Corruption |
Date: | Wed, 1 Jun 2016 15:13:11 -0300 |
Please clarify a few things for the other people on this list (I don't have a solution for your issue, but would like it to be solved just to improve the reliability of my own qcow2 disks):
On 01/06/2016 17:47, Jacob Godin wrote:
Hi all,
Been running into an issue with qcow2 disk corruption, hoping we can get pointed in the right direction. We're currently using latest qemu from Trusty.
Is this Ubuntu?
What is the numeric Ubuntu version?
What are the actual qemu package versions you use ("latest" isn't exactly precise)?
The issue started after powering a VM off and on again. On first boot, the guest (CentOS 6) started reporting I/O issues almost immediately and then crashed. Following that, the VM was unable to read the disk (kept looping through BIOS boot process).
How did you "power off" the VM?
Did you use some qemu management tool (which one and which version)?
Did you kill the qemu process?
Did you do a "clean" shutdown of the Guest OS and wait for the Guest OS to tell the qemu process to exit on its own?
(Note: A clean guest shutdown should not be required for the qcow2 metadata to survive; it only determines whether the disk image inside is an image of a clean or an unclean disk. It may still matter for how the bug was triggered.)
When you "attempted to apply the snapshot", which tool (and version) did you use?
The disk has a single snapshot, which we were able to get working by following this process:
* Attempt to apply snap. Supposedly fails.
* Run qemu-img check + repair
* Use qemu-img convert to convert qcow2 to qcow2
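For readers trying to reproduce the three steps above, here is a minimal sketch that only builds the corresponding qemu-img command lines (it does not run them). The file names and the helper name `recovery_commands` are hypothetical, and the repair mode `-r all` is an assumption: the original message only says "check + repair". Anything like this is best tried on a copy of the image.

```python
import shlex


def recovery_commands(image, snapshot, output):
    """Return the qemu-img invocations for the three steps, in order.

    image, snapshot and output are placeholders supplied by the caller.
    """
    return [
        # Step 1: apply (revert to) the internal snapshot.
        ["qemu-img", "snapshot", "-a", snapshot, image],
        # Step 2: check and repair; "-r all" is an assumed repair mode.
        ["qemu-img", "check", "-r", "all", image],
        # Step 3: convert qcow2 to a fresh qcow2 file.
        ["qemu-img", "convert", "-O", "qcow2", image, output],
    ]


# Print the commands for review instead of executing them.
for argv in recovery_commands("disk.copy.qcow2", "67", "disk.converted.qcow2"):
    print(" ".join(shlex.quote(arg) for arg in argv))
```

Printing rather than executing keeps the sketch safe to run on a machine that holds the only copy of a damaged image.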
Once complete, we were able to boot from the disk, however it was in the state it had when the snapshot was taken. We have attempted to do a check+repair and then convert without applying the snapshot, but are running into the following errors:
* qemu-img check + repair:
Warning: cluster offset=0x2d3120706a0000 is after the end of
the image file, can't properly check refcounts.
ERROR offset=2d312070696e00: Cluster is not properly aligned;
L2 entry corrupted.
Warning: cluster offset=0x2d310a43500000 is after the end of
the image file, can't properly check refcounts.
Warning: cluster offset=0x2d310a43510000 is after the end of
the image file, can't properly check refcounts.
ERROR offset=2d310a43505500: Cluster is not properly aligned;
L2 entry corrupted.
Warning: cluster offset=0x20496e74650000 is after the end of
the image file, can't properly check refcounts.
Warning: cluster offset=0x20496e74660000 is after the end of
the image file, can't properly check refcounts.
ERROR offset=20496e74656c00: Cluster is not properly aligned;
L2 entry corrupted.
Warning: cluster offset=0x2f6d6d6f6e0000 is after the end of
the image file, can't properly check refcounts.
Warning: cluster offset=0xd2070726f0000 is after the end of
the image file, can't properly check refcounts.
Warning: cluster offset=0xd207072700000 is after the end of
the image file, can't properly check refcounts.
Warning: cluster offset=0x336f7220730000 is after the end of
the image file, can't properly check refcounts.
* qemu-img convert:
qemu-img: error while reading block status of sector 147456:
Input/output error
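As a rough cross-check of the messages above, the following sketch extracts the ERROR offsets from a few of the quoted lines (with the mail line wrapping joined) and tests them against the cluster_size of 65536 that qemu-img reports for this image. The helper names are hypothetical, and treating "disk size: 153G" as 153 GiB is an approximation; this only restates what the check output already says and is not a diagnosis.

```python
import re

CLUSTER_SIZE = 65536       # cluster_size reported by qemu-img for this image
DISK_SIZE = 153 * 2**30    # "disk size: 153G", assumed to mean GiB

# A few lines copied from the check output, with wrapped lines joined.
LOG = """\
Warning: cluster offset=0x2d3120706a0000 is after the end of the image file, can't properly check refcounts.
ERROR offset=2d312070696e00: Cluster is not properly aligned; L2 entry corrupted.
ERROR offset=2d310a43505500: Cluster is not properly aligned; L2 entry corrupted.
ERROR offset=20496e74656c00: Cluster is not properly aligned; L2 entry corrupted.
"""


def error_offsets(log):
    """Return the hexadecimal offsets from the ERROR lines as integers."""
    return [int(m, 16) for m in re.findall(r"^ERROR offset=([0-9a-f]+):", log, re.M)]


def describe(offset):
    """Report how far an offset is from a cluster boundary and whether it
    lies beyond the approximate size of the image file."""
    return {
        "offset": offset,
        "misalignment": offset % CLUSTER_SIZE,
        "beyond_file": offset > DISK_SIZE,
    }


for off in error_offsets(LOG):
    info = describe(off)
    print(hex(info["offset"]), hex(info["misalignment"]), info["beyond_file"])
```

A non-zero remainder corresponds to "not properly aligned", and an offset larger than the file corresponds to "after the end of the image file".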
Here's the qemu-img info output for that disk:
image: disk.pre-convert
file format: qcow2
virtual size: 180G (193273528320 bytes)
disk size: 153G
cluster_size: 65536
backing file: /var/lib/nova/instances/_base/xxx
Snapshot list:
ID TAG VM SIZE DATE VM CLOCK
67 xxx 0 2016-04-14 05:22:34 00:00:00.000
Note that the virtual size has been increased from 80G. It previously looked like this:
image: disk.pre-convert
file format: qcow2
virtual size: 80G (85899345920 bytes)
disk size: 153G
cluster_size: 65536
backing file: /var/lib/nova/instances/_base/c45e2e81d34824861271a098bccd5585128e2c05
Snapshot list:
ID TAG VM SIZE DATE VM CLOCK
67 e50825fbd43e455283ef847b12eaea4c 0 2016-04-14 05:22:34 00:00:00.000
We've tried using qcow2.py from src to clear the snapshot headers, however it didn't help.
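For anyone who wants to look at the snapshot fields without modifying the image, below is a minimal read-only sketch that decodes the first 72 bytes of a qcow2 header. It is not qcow2.py; the function names are hypothetical, and the field layout is taken from the qcow2 format specification as I understand it, so please verify it against the spec shipped with your QEMU version before relying on it.

```python
import struct

# Big-endian layout of the first 72 bytes of a qcow2 header.
HEADER = struct.Struct(">4sIQIIQIIQQIIQ")
FIELDS = (
    "magic", "version", "backing_file_offset", "backing_file_size",
    "cluster_bits", "size", "crypt_method", "l1_size", "l1_table_offset",
    "refcount_table_offset", "refcount_table_clusters",
    "nb_snapshots", "snapshots_offset",
)
QCOW2_MAGIC = b"QFI\xfb"


def parse_header(data):
    """Decode the header fields from the leading bytes of an image."""
    if len(data) < HEADER.size:
        raise ValueError("need at least %d bytes" % HEADER.size)
    values = dict(zip(FIELDS, HEADER.unpack_from(data)))
    if values["magic"] != QCOW2_MAGIC:
        raise ValueError("not a qcow2 image")
    return values


def read_header(path):
    """Open the image read-only and decode its header."""
    with open(path, "rb") as f:
        return parse_header(f.read(HEADER.size))
```

Calling `read_header` on a copy of the image and printing `nb_snapshots` and `snapshots_offset` shows what the header currently claims about the snapshot table, which can be compared before and after any attempt to clear it.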
Enjoy
Jakob
--
Jakob Bohm, CIO, Partner, WiseMo A/S. https://www.wisemo.com
Transformervej 29, 2860 Søborg, Denmark. Direct +45 31 13 16 10
This public discussion message is non-binding and may contain errors.
WiseMo - Remote Service Management for PCs, Phones and Embedded