[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Re: [PATCH RFC] vfio-ccw: forward halt/clear errors
From: |
Eric Farman |
Subject: |
Re: [PATCH RFC] vfio-ccw: forward halt/clear errors |
Date: |
Wed, 12 May 2021 16:19:15 -0400 |
On Tue, 2021-05-11 at 17:11 +0200, Cornelia Huck wrote:
> hsch and csch basically have two parts: execute the command,
> and perform the halt/clear function. For fully emulated
> subchannels, it is pretty clear how it will work: check the
> subchannel state, and actually 'perform the halt/clear function'
> and set cc 0 if everything looks good.
>
> For passthrough subchannels, some of the checking is done
> within QEMU, but some has to be done within the kernel. QEMU's
> subchannel state may be such that we can perform the async
> function, but the kernel may still get a cc != 0 when it is
> actually executing the instruction. In that case, we need to
> set the condition actually encountered by the kernel; if we
> set cc 0 on error, we would actually need to inject an interrupt
> as well.
>
> Signed-off-by: Cornelia Huck <cohuck@redhat.com>
> ---
>
> Stumbled over this during the vfio-ccw kernel locking discussions.
>
> This is probably a corner case, and I'm not sure how I can actually
> get this path excercised, but it passes my smoke tests.
I'll see if I can hammer my way into some of this.
>
> Not sure whether this is the way to go.
I think it seems reasonable.
> The unit exceptions in the
> halt/clear error paths also seem slightly fishy.
It is peculiar. Looking at the old POPS references, the unit exception
states that the --device-- detected something unusual, not really the
subchannel (which is how vfio-ccw is behaving). But, providing some
indication that something went seriously wrong is good. Which I guess
was the point of the UE code, even though it's obviously set up to be
generated after a failure on the START.
I guess at the least, we need to clean up the FCTL based on the
function that failed, rather than only cleaning up the START function.
The UE itself may just be an extra "hey this is busted" indicator.
>
> ---
> hw/s390x/css.c | 34 ++++++++++++++++++++++++++++++----
> hw/vfio/ccw.c | 4 ++--
> 2 files changed, 32 insertions(+), 6 deletions(-)
>
> diff --git a/hw/s390x/css.c b/hw/s390x/css.c
> index bed46f5ec3a2..ce2e903ca25a 100644
> --- a/hw/s390x/css.c
> +++ b/hw/s390x/css.c
> @@ -1206,23 +1206,49 @@ static void
> sch_handle_start_func_virtual(SubchDev *sch)
>
> }
>
> -static void sch_handle_halt_func_passthrough(SubchDev *sch)
> +static IOInstEnding sch_handle_halt_func_passthrough(SubchDev *sch)
> {
> int ret;
>
> ret = s390_ccw_halt(sch);
> if (ret == -ENOSYS) {
> sch_handle_halt_func(sch);
> + return IOINST_CC_EXPECTED;
This is the fallback, so makes sense. You could fold it into the switch
table, but since that's for "stuff from the kernel" versus -ENOSYS says
"there's no way to call the kernel" I guess this is fine too.
> + }
> + /*
> + * Some conditions may have been detected prior to starting the
> halt
> + * function; map them to the correct cc.
> + */
> + switch (ret) {
> + case -EBUSY:
> + return IOINST_CC_BUSY;
> + case -ENODEV:
> + case -EACCES:
> + return IOINST_CC_NOT_OPERATIONAL;
> + default:
> + return IOINST_CC_EXPECTED;
> }
> }
>
> -static void sch_handle_clear_func_passthrough(SubchDev *sch)
> +static IOInstEnding sch_handle_clear_func_passthrough(SubchDev *sch)
> {
> int ret;
>
> ret = s390_ccw_clear(sch);
> if (ret == -ENOSYS) {
> sch_handle_clear_func(sch);
> + return IOINST_CC_EXPECTED;
> + }
> + /*
> + * Some conditions may have been detected prior to starting the
> clear
> + * function; map them to the correct cc.
> + */
> + switch (ret) {
> + case -ENODEV:
> + case -EACCES:
> + return IOINST_CC_NOT_OPERATIONAL;
> + default:
> + return IOINST_CC_EXPECTED;
> }
> }
>
> @@ -1265,9 +1291,9 @@ IOInstEnding
> do_subchannel_work_passthrough(SubchDev *sch)
> SCHIB *schib = &sch->curr_status;
>
> if (schib->scsw.ctrl & SCSW_FCTL_CLEAR_FUNC) {
> - sch_handle_clear_func_passthrough(sch);
> + return sch_handle_clear_func_passthrough(sch);
> } else if (schib->scsw.ctrl & SCSW_FCTL_HALT_FUNC) {
> - sch_handle_halt_func_passthrough(sch);
> + return sch_handle_halt_func_passthrough(sch);
> } else if (schib->scsw.ctrl & SCSW_FCTL_START_FUNC) {
> return sch_handle_start_func_passthrough(sch);
> }
> diff --git a/hw/vfio/ccw.c b/hw/vfio/ccw.c
> index e752c845e9e4..39275a917bd2 100644
> --- a/hw/vfio/ccw.c
> +++ b/hw/vfio/ccw.c
> @@ -199,7 +199,7 @@ again:
// This is for CLEAR
> case 0:
> case -ENODEV:
> case -EACCES:
> - return 0;
> + return ret;
> case -EFAULT:
> default:
> sch_gen_unit_exception(sch);
> @@ -240,7 +240,7 @@ again:
// This is for HALT
> case -EBUSY:
> case -ENODEV:
> case -EACCES:
> - return 0;
> + return ret;
Aside: How could we get EACCES for either HALT or CLEAR? I only see
that set in the normal request path, if we got a CC3 on the SSCH.
Can we scrub them, or do we need to update kernel
Documentation/s390/vfio-ccw.rst ? :)
Eric
> case -EFAULT:
> default:
> sch_gen_unit_exception(sch);