[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Re: [RFC PATCH v2 1/7] vfio-ccw: Return IOINST_CC_NOT_OPERATIONAL for EI
From: |
Eric Farman |
Subject: |
Re: [RFC PATCH v2 1/7] vfio-ccw: Return IOINST_CC_NOT_OPERATIONAL for EIO |
Date: |
Tue, 7 Apr 2020 06:18:18 -0400 |
User-agent: |
Mozilla/5.0 (X11; Linux x86_64; rv:68.0) Gecko/20100101 Thunderbird/68.5.0 |
On 4/7/20 2:28 AM, Cornelia Huck wrote:
> On Mon, 6 Apr 2020 14:21:17 -0400
> Eric Farman <address@hidden> wrote:
>
>> On 4/1/20 4:52 AM, Cornelia Huck wrote:
>>> On Wed, 25 Mar 2020 03:24:28 +0100
>>> Halil Pasic <address@hidden> wrote:
>>>
>>>> On Tue, 24 Mar 2020 18:04:30 +0100
>>>> Cornelia Huck <address@hidden> wrote:
>>>>
>>>>> On Thu, 6 Feb 2020 22:45:03 +0100
>>>>> Eric Farman <address@hidden> wrote:
>>>>>
>>>>>> From: Farhan Ali <address@hidden>
>>>>>>
>>>>>> EIO is returned by vfio-ccw mediated device when the backing
>>>>>> host subchannel is not operational anymore. So return cc=3
>>>>>> back to the guest, rather than returning a unit check.
>>>>>> This way the guest can take appropriate action such as
>>>>>> issue an 'stsch'.
>>>>
>>>> I believe this is not the only situation when vfio-ccw returns
>>>> EIO, or?
>>>>
>>>>>>
>>>>>> Signed-off-by: Farhan Ali <address@hidden>
>>>>>> Signed-off-by: Eric Farman <address@hidden>
>>>>>> ---
>>>>>>
>>>>>> Notes:
>>>>>> v1->v2: [EF]
>>>>>> - Add s-o-b
>>>>>> - [Seems the discussion on v1 centered on the return code
>>>>>> set in the kernel, rather than anything that needs to
>>>>>> change here, unless I've missed something.]
>>>>
>>>> Does this need to change here? If the kernel is supposed to return ENODEV
>>>> then this does not need to change.
>>>>
>>>>>
>>>>> I've stared at this and at the kernel code for some time again; and I'm
>>>>> not sure if "return -EIO == not operational" is even true. That said,
>>>>> I'm not sure a unit check is the right response, either. The only thing
>>>>> I'm sure about is that the kernel code needs some review of return
>>>>> codes and some documentation...
>>>>
>>>> I could not agree more, this is semantically uapi and needs to be
>>>> properly documented.
>>>>
>>>> With regards to "linux error codes: vs "ionist cc's" an where
>>>> the mapping is different example:
>>>>
>>>> """
>>>> /**
>>>>
>>>> * cio_cancel_halt_clear - Cancel running I/O by performing cancel, halt
>>>>
>>>> * and clear ordinally if subchannel is valid.
>>>>
>>>> * @sch: subchannel on which to perform the cancel_halt_clear operation
>>>>
>>>> * @iretry: the number of the times remained to retry the next operation
>>>>
>>>> *
>>>>
>>>> * This should be called repeatedly since halt/clear are asynchronous
>>>>
>>>> * operations. We do one try with cio_cancel, three tries with cio_halt,
>>>>
>>>> * 255 tries with cio_clear. The caller should initialize @iretry with
>>>>
>>>> * the value 255 for its first call to this, and keep using the same
>>>>
>>>> * @iretry in the subsequent calls until it gets a non -EBUSY return.
>>>>
>>>> *
>>>>
>>>> * Returns 0 if device now idle, -ENODEV for device not operational,
>>>>
>>>> * -EBUSY if an interrupt is expected (either from halt/clear or from a
>>>>
>>>> * status pending), and -EIO if out of retries.
>>>>
>>>> */
>>>>
>>>> int cio_cancel_halt_clear(struct subchannel *sch, int *iretry)
>>>>
>>>> """
>>>> Here -ENODEV is not operational.
>>>
>>> Ok, I went through the kernel code and checked which error may be
>>> returned in which case (hope I caught all of them). Here's what I
>>> currently have:
>>
>> Thanks for doing all this mapping!
>>
>>>
>>> diff --git a/Documentation/s390/vfio-ccw.rst
>>> b/Documentation/s390/vfio-ccw.rst
>>> index fca9c4f5bd9c..43f375a03cce 100644
>>> --- a/Documentation/s390/vfio-ccw.rst
>>> +++ b/Documentation/s390/vfio-ccw.rst
...snip...
>>>
>>> The other return codes look sane, and the return codes for the async
>>> region also seem sane (but have the same issue with -EIO == "device in
>>> wrong state").
>>>
>>> Looking at the QEMU handling, I think the -EIO handling is a bit
>>> questionable (without an obvious solution), while mapping -EBUSY to cc
>>> 2 is the only reasonable thing to do given that the interface does not
>>> differentiate between "busy" and "status pending". The rest seems sane.
>>>
>>
>> So maybe with all this data, and absent a better solution, it's best to
>> just drop this patch from v3?
>
> Yes, I agree. I'll post the documentation update as a separate patch.
>
Sounds good; thanks!