Re: [Qemu-discuss] qemu-nbd or qcow2 or something else ?
From: Max Reitz
Subject: Re: [Qemu-discuss] qemu-nbd or qcow2 or something else ?
Date: Fri, 8 Dec 2017 15:44:47 +0100
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:52.0) Gecko/20100101 Thunderbird/52.4.0

On 2017-12-08 15:41, Programmingkid wrote:
>
>> On Dec 8, 2017, at 9:34 AM, Max Reitz <address@hidden> wrote:
>>
>> On 2017-12-05 19:29, Programmingkid wrote:
>>>>
>>>> Message: 1
>>>> Date: Mon, 4 Dec 2017 19:04:36 +0100
>>>> From: Max Reitz <address@hidden>
>>>> To: Pascal <address@hidden>, address@hidden,
>>>> address@hidden
>>>> Subject: Re: [Qemu-discuss] qemu-nbd or qcow2 or something else ?
>>>> Message-ID: <address@hidden>
>>>> Content-Type: text/plain; charset="utf-8"
>>>>
>>>> On 2017-12-01 18:56, Pascal wrote:
>>>>> hello,
>>>>>
>>>>> While doing some tests on the NTFS file system, I ran into some strange
>>>>> behavior with my qcow2 disk images.
>>>>> The images are on a partition mounted as tmpfs, but the result is the same
>>>>> when they are stored on an ext4 partition.
>>>>> I don't know where the problem comes from: qemu-nbd, the qcow2 format, or
>>>>> something else?
>>>>> Do not hesitate to ask if you want more information.
>>>>
>>>> I see the issue here as well (and with raw, too).
>>>>
>>>> tl;dr: Seems like a kernel issue to me (CC-ing the NBD list because
>>>> that's the best I can do).
>>>>
>>>> When tracing the accesses, it appears that at least the NTFS header is
>>>> not read from the source disk when copying the data over. I would guess
>>>> this is due to caching, because Linux has read that sector before the
>>>> mkfs.ntfs (so it was zero then).
>>>>
>>>> And the issue disappears if I insert a "blockdev --flushbufs /dev/nbd0"
>>>> after the mkfs.ntfs -- but not if I flush nbd0p1, interestingly.
>>>>
>>>> I would guess the kernel has different caches for the whole device and
>>>> each partition? Well, that's nice. Not sure if that is a bug or
>>>> whether that is just how it is...
>>>>
>>>> (But nbd just uses the normal blockdev partitioning, so I guess it's by
>>>> design? (And the same issue appears with kpartx, too))
>>>>
>>>> You can also see this on the source volume alone:
>>>>
>>>> $ qemu-img create [...]
>>>> # qemu-nbd -c /dev/nbd0 [...]
>>>> # fdisk /dev/nbd0
>>>> # mkfs.ntfs /dev/nbd0p1
>>>>
>>>> # hexdump -C /dev/nbd0
>>>> [...]
>>>> 00000200 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
>>>> *
>>>> 00102000 ff ff 00 07 00 00 00 00 00 00 00 00 00 00 00 00 |................|
>>>> [...]
>>>>
>>>> # blockdev --flushbufs /dev/nbd0
>>>>
>>>> # hexdump -C /dev/nbd0
>>>> [...]
>>>> 00100000 eb 52 90 4e 54 46 53 20 20 20 20 00 02 08 00 00 |.R.NTFS    .....|
>>>> 00100010 00 00 00 00 00 f8 00 00 00 00 00 00 00 00 00 00 |................|
>>>> [...]
>>>>
>>>>
>>>> However, there is still an open question: I can't reproduce this with
>>>> loop or real devices. I only see this with NBD. Why? After having dug
>>>> for too long into the kernel sources, my best guess right now is that
>>>> the kernel NBD driver might be missing some necessary flushes. Whenever
>>>> one NBD device is accessed (through a partition or not), it is necessary
>>>> to flush all device nodes that are associated with it -- but loop
>>>> doesn't seem to be doing this, and I would expect the general partition
>>>> framework to handle this already. Therefore, my best guess is a bad guess.
>>>>
>>>> But note that I can reproduce the issue with nbd-server and nbd-client
>>>> just fine:
>>>>
>>>> # dd if=/dev/zero of=/tmp/foo.img bs=1M count=2048
>>>> 2048+0 records in
>>>> 2048+0 records out
>>>> 2147483648 bytes (2.1 GB, 2.0 GiB) copied, 0.517711 s, 4.1 GB/s
>>>>
>>>> # nbd-server 10809 /tmp/foo.img
>>>> ** (process:17331): WARNING **: Specifying an export on the command line
>>>> no longer uses the oldstyle protocol.
>>>>
>>>> # nbd-client localhost /dev/nbd0
>>>> Warning: the oldstyle protocol is no longer supported.
>>>> This method now uses the newstyle protocol with a default export
>>>> Negotiation: ..size = 2048MB
>>>> bs=1024, sz=2147483648 bytes
>>>>
>>>> # echo -e 'n\n\n\n\n\nt\n7\nw' | fdisk /dev/nbd0
>>>> [...]
>>>>
>>>> # mkfs.ntfs /dev/nbd0p1
>>>> [...]
>>>>
>>>> # hexdump -C /dev/nbd0 | less
>>>> [...]
>>>> 00000200 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
>>>> *
>>>> 00102000 ff ff 00 07 00 00 00 00 00 00 00 00 00 00 00 00 |................|
>>>> [...]
>>>>
>>>> # blockdev --flushbufs /dev/nbd0
>>>>
>>>> # hexdump -C /dev/nbd0 | less
>>>> [...]
>>>> 00100000 eb 52 90 4e 54 46 53 20 20 20 20 00 04 04 00 00 |.R.NTFS    .....|
>>>> 00100010 00 00 00 00 00 f8 00 00 00 00 00 00 00 00 00 00 |................|
>>>> [...]
>>>>
>>>> Max
>>>
>>> I was just wondering if you have tried the qcow format yet. I was having
>>> issues with my qcow2 image file becoming corrupted to the point I couldn't
>>> boot Windows from it. When I switched over to qcow, the problem went away.
>>
>> As I've written, I have tried raw and saw the same.
>>
>> Also, qcow is deprecated. In fact, it's deprecated so much that I'm in
>> favor of removing it completely. Needless to say, if there's an issue
>> in qcow2 you should report it.
>>
>> Max
>>
>
> When I was trying to install and use Windows NT 4.0, the installation would
> boot once, then fail to boot again. After investigating this issue, it turned
> out that switching from qcow2 to qcow fixed the problem. I really suggest
> keeping the qcow format. It works well.

It has horrible performance compared to qcow2 and nobody will fix this,
because we already did and the result is called qcow2.

Max
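
The stale-read symptom quoted above (zeros where the NTFS boot sector should be until "blockdev --flushbufs /dev/nbd0" is run) can be checked without reading hexdump output by eye. The following is only a minimal sketch, not something posted in the thread: the function names oem_id_at and looks_like_ntfs are made up for illustration, and the partition start offset is assumed to be 0x100000 (1048576 bytes), as in the dumps above.

```shell
#!/bin/sh
# Sketch only: helper names are hypothetical, not part of any existing tool.

# Print the 4 bytes found 3 bytes into the sector that starts at a byte offset.
# An NTFS boot sector carries "NTFS" there (4e 54 46 53 in the dumps above).
#   $1: image file or block device
#   $2: byte offset of the partition start (decimal)
oem_id_at() {
    # NUL bytes are stripped so an all-zero sector yields an empty string.
    dd if="$1" bs=1 skip=$(( $2 + 3 )) count=4 2>/dev/null | tr -d '\0'
}

# Succeed only if the data at the given offset looks like an NTFS boot sector.
looks_like_ntfs() {
    [ "$(oem_id_at "$1" "$2")" = "NTFS" ]
}
```

Under these assumptions, running "looks_like_ntfs /dev/nbd0 1048576" right after the mkfs.ntfs would be expected to fail while the whole-device cache is stale, and to succeed once "blockdev --flushbufs /dev/nbd0" has been run. Note that the read goes through the same page cache as hexdump does, which is exactly what makes it usable as a check for this symptom.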