Re: [PATCH v3 0/9] block-backend: Introduce I/O hang

qemu-devel

From:	cenjiahui
Subject:	Re: [PATCH v3 0/9] block-backend: Introduce I/O hang
Date:	Thu, 29 Oct 2020 17:42:42 +0800
User-agent:	Mozilla/5.0 (Windows NT 10.0; WOW64; rv:68.0) Gecko/20100101 Thunderbird/68.2.2

On 2020/10/27 0:53, Stefan Hajnoczi wrote:
> On Thu, Oct 22, 2020 at 09:02:54PM +0800, Jiahui Cen wrote:
>> A VM in a cloud environment may use a virtual disk as its backend storage,
>> and there are usually filesystems on the virtual block device. When the
>> backend storage is temporarily down, any I/O issued to the virtual block
>> device will cause an error. For example, an error in an ext4 filesystem
>> would make the filesystem read-only. However, cloud backend storage can
>> often be recovered soon. For example, an IP-SAN may be down due to a network
>> failure and will be back online soon after the network recovers. The error
>> in the filesystem may not be recoverable without a device reattach or a
>> system restart. So an I/O rehandle is needed to implement a self-healing
>> mechanism.
>>
>> This patch series proposes a feature called I/O hang. It can rehandle AIOs
>> that fail with EIO without sending the error back to the guest. From the
>> guest's point of view, it is just as if an I/O is hanging and has not
>> returned. With this feature enabled, the guest can resume running smoothly
>> once I/O has recovered.
> 
> Hi,
> This feature seems like an extension of the existing -drive
> rerror=/werror= parameters:
> 
>   werror=action,rerror=action
>       Specify which action to take on write and read errors. Valid
>       actions are: "ignore" (ignore the error and try to continue),
>       "stop" (pause QEMU), "report" (report the error to the guest),
>       "enospc" (pause QEMU only if the host disk is full; report the
>       error to the guest otherwise).  The default setting is
>       werror=enospc and rerror=report.
> 
> That mechanism already has a list of requests to retry and live
> migration integration. Using the werror=/rerror= mechanism would avoid
> code duplication between these features. You could add a
> werror/rerror=retry error action for this feature.
> 
> Does that sound good?
> 
> Stefan
> 

Hi Stefan,

Thanks for your reply. Extending the rerror=/werror= mechanism is a feasible
way to implement the retry feature.

However, AFAIK, the rerror=/werror= mechanism in the block-backend layer only
provides the ACTION, and the actual error handler needs to be implemented
separately in the device layer for each device. Our I/O hang mechanism, by
contrast, handles AIO errors directly, regardless of the device type. Wouldn't
it be more generic to implement the feature in the block-backend layer? In
particular, we can set the retry timeout in a common structure, BlockBackend.
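
To make the idea concrete, here is a minimal, device-agnostic sketch of
"retry on EIO until a timeout expires". It is only an illustration: every name
in it (RetryPolicy, submit_with_retry, FlakyBackend, and so on) is
hypothetical, none of them are QEMU APIs, and it is not the code from this
patch series. Waiting is modelled by a counter instead of a real timer so that
the behaviour is deterministic.

```c
#include <errno.h>
#include <stdint.h>

/* All names below are hypothetical and for illustration only. */

/* An I/O operation: returns 0 on success or a negative errno value. */
typedef int (*io_fn)(void *opaque);

/* Per-backend retry settings, kept in one common place. */
typedef struct RetryPolicy {
    int64_t timeout_ms;   /* total time to keep retrying; 0 disables retry */
    int64_t interval_ms;  /* simulated delay between two attempts */
} RetryPolicy;

typedef struct RetryResult {
    int ret;              /* status finally reported to the caller */
    int attempts;         /* number of times the operation was issued */
    int64_t waited_ms;    /* simulated time spent waiting between attempts */
} RetryResult;

/*
 * Issue the operation and re-issue it while it fails with -EIO, as long as
 * the next wait would still fit inside the timeout. Success and any other
 * error are passed through unchanged on the first occurrence.
 */
static RetryResult submit_with_retry(const RetryPolicy *p, io_fn fn,
                                     void *opaque)
{
    RetryResult r = { .ret = 0, .attempts = 0, .waited_ms = 0 };

    for (;;) {
        r.ret = fn(opaque);
        r.attempts++;

        if (r.ret != -EIO) {
            break;                      /* success or a non-retryable error */
        }
        if (p->timeout_ms <= 0 || p->interval_ms <= 0 ||
            r.waited_ms + p->interval_ms > p->timeout_ms) {
            break;                      /* retry disabled or timeout reached */
        }
        /* Real code would arm a timer here instead of counting. */
        r.waited_ms += p->interval_ms;
    }
    return r;
}

/* Fake backend that fails with -EIO a fixed number of times, then works. */
typedef struct FlakyBackend {
    int failures_left;
} FlakyBackend;

static int flaky_io(void *opaque)
{
    FlakyBackend *b = opaque;

    if (b->failures_left > 0) {
        b->failures_left--;
        return -EIO;
    }
    return 0;
}
```

With a 5000 ms timeout and a 1000 ms interval, a backend that fails three
times is retried until it succeeds and the caller never sees the error, while
a backend that stays down gets -EIO once the timeout is exhausted. The point
of the sketch is that the loop knows nothing about the device type that
issued the request.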

Besides, is there any reason why QEMU implements the rerror=/werror= mechanism
in the device layer rather than in the block-backend layer?

Jiahui
