[Top][All Lists]
[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Re: [Qemu-devel] [PATCH v1 0/6]: block: Add I/O status support
From: |
Luiz Capitulino |
Subject: |
Re: [Qemu-devel] [PATCH v1 0/6]: block: Add I/O status support |
Date: |
Fri, 23 Sep 2011 10:48:53 -0300 |
On Fri, 23 Sep 2011 10:55:39 +0200
Markus Armbruster <address@hidden> wrote:
> Kevin Wolf <address@hidden> writes:
>
> > Am 19.09.2011 16:09, schrieb Luiz Capitulino:
> >> On Mon, 19 Sep 2011 15:40:06 +0200
> >> Kevin Wolf <address@hidden> wrote:
> >>
> >>> Am 01.09.2011 20:37, schrieb Luiz Capitulino:
> >>>> This series adds support to the block layer to keep track of devices'
> >>>> I/O status. That information is also made available in QMP and HMP.
> >>>>
> >>>> The goal here is to allow management applications that miss the
> >>>> BLOCK_IO_ERROR event to able to query the VM to determine if any device
> >>>> has
> >>>> caused the VM to stop and which device caused it.
> >>>>
> >>>> Please, note that this series depends on the following series:
> >>>>
> >>>> o [PATCH v3 0/8]: Introduce the RunState type
> >>>> o http://lists.gnu.org/archive/html/qemu-devel/2011-09/msg00118.html
> >>>>
> >>>> And to be able to properly test it you'll also need:
> >>>>
> >>>> o [PATCH 0/3] qcow2/coroutine fixes
> >>>> o http://lists.gnu.org/archive/html/qemu-devel/2011-09/msg00074.html
> >>>>
> >>>> Here's an HMP example:
> >>>>
> >>>> (qemu) info status
> >>>> VM status: paused (io-error)
> >>>> (qemu) info block
> >>>> ide0-hd0: removable=0 io-status=ok file=disks/test2.img ro=0 drv=qcow2
> >>>> encrypted=0
> >>>> ide0-hd1: removable=0 io-status=nospace file=/dev/vg_doriath/kvmtest
> >>>> ro=0 drv=qcow2 encrypted=0
> >>>> ide1-cd0: removable=1 locked=0 io-status=ok [not inserted]
> >>>> floppy0: removable=1 locked=0 [not inserted]
> >>>> sd0: removable=1 locked=0 [not inserted]
> >>>>
> >>>> The "info status" command shows that the VM is stopped due to an I/O
> >>>> error.
> >>>> By issuing "info block" it's possible to determine that the 'ide0-hd1'
> >>>> device
> >>>> caused the error, which turns out to be due to no space.
> >>>
> >>> Looks like I didn't reply here yet?
> >>
> >> No, you didn't.
> >>
> >>> I still don't quite like that the devices are involved, but their part
> >>> is minimal and it makes the implementation much easier, so for me that's
> >>> acceptable.
> >>
> >> Suggestions on better ways of implementing this are welcome! :)
> >
> > I don't have one. :-)
> >
> > Surely it would be possible to do everything in block.c, but I see that
> > this would make things much more complicated (would need to get an AIO
> > callback added to the chain that can save the error code). It's not
> > worth the trouble.
>
> The block layer necessarily detects errors in a huge number of places.
>
> It also has a rich and complex interface to device models, with error
> reporting in many places.
>
> virtio-blk, ide and scsi-disk each have a single error handler to handle
> block layer errors.
>
> For better or worse, these are the only block layer error "chokepoints"
> we have now, and therefore the only easy places to set BDS I/O status.
> But they're still wrong.
>
> The block layer shouldn't require device models to tell it what it
> already knows.
>
> Now, I don't want to block your series just because it adds this wart.
> But the wart should be documented in the code.
Is something like:
/* XXX: this should be implemented in the block layer */
good enough?
> Also document that any device model implementing werror=stop|enospc or
> rerror=stop must update I/O status. Extra points if you find a way to
> enforce it instead.
Why? It's probably a nice thing because we can know which device caused
the VM to stop, but I don't see it as a must have (it's probably a
no brainer to update the I/O status though).
>
> Apropos enforcing. Currently, -drive accepts any werror and rerror
> action with if={ide,virtio,scsi,none}. We rely on device models not
> implementing an action to check and fail during initialization.
> scsi-generic and the floppy devices do. All the other qdevified devices
> don't, and that's broken.
I think I can fix that, but this series doesn't depend on that, right?
(in the meaning that we could merge this before getting that problem fixed).
>
> Are werror and rerror really host state?
>
- [Qemu-devel] [PATCH 4/6] scsi: Support I/O status, (continued)
- [Qemu-devel] [PATCH 4/6] scsi: Support I/O status, Luiz Capitulino, 2011/09/01
- [Qemu-devel] [PATCH 5/6] QMP: query-status: Add 'io-status' key, Luiz Capitulino, 2011/09/01
- [Qemu-devel] [PATCH 6/6] HMP: Print 'io-status' information, Luiz Capitulino, 2011/09/01
- Re: [Qemu-devel] [PATCH v1 0/6]: block: Add I/O status support, Luiz Capitulino, 2011/09/09
- Re: [Qemu-devel] [PATCH v1 0/6]: block: Add I/O status support, Kevin Wolf, 2011/09/19