|
From: | Peter Jin |
Subject: | Re: [PATCH v2] s390x/css: revert SCSW ctrl/flag bits on error |
Date: | Mon, 31 Oct 2022 10:44:46 -0400 |
User-agent: | Mozilla/5.0 (X11; Linux x86_64; rv:102.0) Gecko/20100101 Thunderbird/102.4.0 |
To everyone else on the mailing lists,I apologize for any confusion caused by this being a "v2" of the original "v1" patch and not submitting a changelog. I am rather new to these kinds of mailing lists, though Eric and Matt here have been really helpful in getting this correct. I will remember to put changelogs for v2, v3, etc. patches before submitting them!
Anyways, changes from v1: - no code changes, only added additional explanatory text. If you think I need to post a v3 to put this inline, let me know. Peter On 10/31/22 10:08, Matthew Rosato wrote:
On 10/28/22 4:22 PM, Eric Farman wrote:On Thu, 2022-10-27 at 23:23 +0200, Peter Jin wrote:Revert the control and flag bits in the subchannel status word in case the SSCH operation fails with non-zero CC (ditto for CSCH and HSCH). According to POPS, the control and flag bits are only changed if SSCH, CSCH, and HSCH return CC 0, and no other action should be taken otherwise. In order to simulate that after the fact, the bits need to be reverted on non-zero CC.I'm okay to this point...This change is necessary due to the fact that the pwrite() in vfio- ccw which triggers the SSCH can fail at any time. Previously, there was only virtio-ccw, whose do_subchannel_work function was only able to return CC0. However, once vfio-ccw went into the mix, it has become necessary to handle errors in code paths that were previously assumed to always return success. In our case, we found that in case of pwrite() failure (which was discovered by strace injection), the subchannel could be stuck in start pending state, which could be problematic if the pwrite() call returns CC2. Experimentation shows that the guest tries to retry the SSCH call as normal for CC2, but it actually continously fails due to the fact that the subchannel is stuck in start pending state even though no start function is actually taking place....but the two paragraphs above are a bit cumbersome to digest. Maybe it's just too late in the week for me. What about something like this? """ While the do_subchannel_work logic for virtual (virtio) devices will return condition code 0, passthrough (vfio) devices may encounter errors from either the host kernel or real hardware that need to be accounted for after this point. This includes restoring the state of the Subchannel Status Word to reflect the subchannel, as these bits would not be set in the event of a non-zero condition code from the affected instructions. Experimentation has shown that a failure on a START SUBCHANNEL (SSCH) to a passthrough device would leave the subchannel with the START PENDING activity control bit set, thus blocking subsequent SSCH operations in css_do_ssch() until some form of error recovery was undertaken since no interrupt would be expected. """+1 to this re-writeSigned-off-by: Peter Jin <pjin@linux.ibm.com>We've talked previously about clearing this within the do_subchannel_work_passthrough routine in order to keep the _virtual paths untouched, but this seems like a reasonable approach to me. The commit message is probably fine either way, but as far as the code goes: Reviewed-by: Eric Farman <farman@linux.ibm.com>Reviewed-by: Matthew Rosato <mjrosato@linux.ibm.com> Thanks Peter!--- hw/s390x/css.c | 51 +++++++++++++++++++++++++++++++++++++++++++++++- -- 1 file changed, 48 insertions(+), 3 deletions(-) diff --git a/hw/s390x/css.c b/hw/s390x/css.c index 7d9523f811..95d1b3a3ce 100644 --- a/hw/s390x/css.c +++ b/hw/s390x/css.c @@ -1522,21 +1522,37 @@ IOInstEnding css_do_xsch(SubchDev *sch) IOInstEnding css_do_csch(SubchDev *sch) { SCHIB *schib = &sch->curr_status; + uint16_t old_scsw_ctrl; + IOInstEnding ccode;if (~(schib->pmcw.flags) & (PMCW_FLAGS_MASK_DNV |PMCW_FLAGS_MASK_ENA)) { return IOINST_CC_NOT_OPERATIONAL; }+ /*+ * Save the current scsw.ctrl in case CSCH fails and we need + * to revert the scsw to the status quo ante. + */ + old_scsw_ctrl = schib->scsw.ctrl; + /* Trigger the clear function. */ schib->scsw.ctrl &= ~(SCSW_CTRL_MASK_FCTL | SCSW_CTRL_MASK_ACTL); schib->scsw.ctrl |= SCSW_FCTL_CLEAR_FUNC | SCSW_ACTL_CLEAR_PEND;- return do_subchannel_work(sch);+ ccode = do_subchannel_work(sch); + + if (ccode != IOINST_CC_EXPECTED) { + schib->scsw.ctrl = old_scsw_ctrl; + } + + return ccode; }IOInstEnding css_do_hsch(SubchDev *sch){ SCHIB *schib = &sch->curr_status; + uint16_t old_scsw_ctrl; + IOInstEnding ccode;if (~(schib->pmcw.flags) & (PMCW_FLAGS_MASK_DNV |PMCW_FLAGS_MASK_ENA)) { return IOINST_CC_NOT_OPERATIONAL; @@ -1553,6 +1569,12 @@ IOInstEnding css_do_hsch(SubchDev *sch) return IOINST_CC_BUSY; }+ /*+ * Save the current scsw.ctrl in case HSCH fails and we need + * to revert the scsw to the status quo ante. + */ + old_scsw_ctrl = schib->scsw.ctrl; + /* Trigger the halt function. */ schib->scsw.ctrl |= SCSW_FCTL_HALT_FUNC; schib->scsw.ctrl &= ~SCSW_FCTL_START_FUNC; @@ -1564,7 +1586,13 @@ IOInstEnding css_do_hsch(SubchDev *sch) } schib->scsw.ctrl |= SCSW_ACTL_HALT_PEND;- return do_subchannel_work(sch);+ ccode = do_subchannel_work(sch); + + if (ccode != IOINST_CC_EXPECTED) { + schib->scsw.ctrl = old_scsw_ctrl; + } + + return ccode; }static void css_update_chnmon(SubchDev *sch)@@ -1605,6 +1633,8 @@ static void css_update_chnmon(SubchDev *sch) IOInstEnding css_do_ssch(SubchDev *sch, ORB *orb) { SCHIB *schib = &sch->curr_status; + uint16_t old_scsw_ctrl, old_scsw_flags; + IOInstEnding ccode;if (~(schib->pmcw.flags) & (PMCW_FLAGS_MASK_DNV |PMCW_FLAGS_MASK_ENA)) { return IOINST_CC_NOT_OPERATIONAL; @@ -1626,11 +1656,26 @@ IOInstEnding css_do_ssch(SubchDev *sch, ORB *orb) } sch->orb = *orb; sch->channel_prog = orb->cpa; + + /* + * Save the current scsw.ctrl and scsw.flags in case SSCH fails and we need + * to revert the scsw to the status quo ante. + */ + old_scsw_ctrl = schib->scsw.ctrl; + old_scsw_flags = schib->scsw.flags; + /* Trigger the start function. */ schib->scsw.ctrl |= (SCSW_FCTL_START_FUNC | SCSW_ACTL_START_PEND); schib->scsw.flags &= ~SCSW_FLAGS_MASK_PNO;- return do_subchannel_work(sch);+ ccode = do_subchannel_work(sch); + + if (ccode != IOINST_CC_EXPECTED) { + schib->scsw.ctrl = old_scsw_ctrl; + schib->scsw.flags = old_scsw_flags; + } + + return ccode; }static void copy_irb_to_guest(IRB *dest, const IRB *src, const PMCW*pmcw,
[Prev in Thread] | Current Thread | [Next in Thread] |