qemu-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH v2] hw/cxl: Fix msix_notify: Assertion `vector < dev->msix_en


From: Jonathan Cameron
Subject: Re: [PATCH v2] hw/cxl: Fix msix_notify: Assertion `vector < dev->msix_entries_nr`
Date: Wed, 15 Jan 2025 13:50:11 +0000

On Wed, 15 Jan 2025 11:22:14 +0000
Jonathan Cameron <Jonathan.Cameron@huawei.com> wrote:

> On Wed, 15 Jan 2025 07:48:59 +0000
> "Zhijian Li (Fujitsu)" <lizhijian@fujitsu.com> wrote:
> 
> > On 10/01/2025 23:44, Jonathan Cameron wrote:  
> > > On Fri, 13 Dec 2024 17:36:02 +0800
> > > Li Zhijian <lizhijian@fujitsu.com> wrote:
> > >     
> > >> This assertion always happens when we sanitize the CXL memory device.
> > >> $ echo 1 > /sys/bus/cxl/devices/mem0/security/sanitize
> > >>
> > >> It is incorrect to register an MSIX number beyond the device's 
> > >> capability.
> > >>
> > >> Expand the device's MSIX number and use the enum to maintain the *USED*
> > >> and MAX MSIX number
> > >>
> > >> Fixes: 43efb0bfad2b ("hw/cxl/mbox: Wire up interrupts for background 
> > >> completion")
> > >> Signed-off-by: Li Zhijian <lizhijian@fujitsu.com>
> > >> ---
> > >> V2: just increase msix number and add enum to maintainer their values #
> > >> Jonathan    
> > > 
> > > Ah. Sorry I was unclear. Two patches please
> > > 
> > > 1. Make the number bigger to fix the bug. Only this one gets a fixes tag 
> > > and
> > >     is suitable for backporting.
> > > 
> > > 2. Add an enum including all numbers currently used and use that 
> > > throughout the
> > >     type3 related code. That will prevent use accidentally introducing the
> > >     bug in future but doesn't need to be backported.
> > >       
> > 
> > Understood, it make sense.
> > 
> > 
> >   
> > > A few other comments inline.
> > > 
> > > Thanks
> > > 
> > > Jonathan
> > >     
> > >> ---
> > >>   hw/cxl/cxl-device-utils.c   |  6 ++----
> > >>   hw/mem/cxl_type3.c          | 10 +++++-----
> > >>   include/hw/cxl/cxl_device.h |  7 +++++++
> > >>   3 files changed, 14 insertions(+), 9 deletions(-)
> > >>
> > >> diff --git a/hw/cxl/cxl-device-utils.c b/hw/cxl/cxl-device-utils.c
> > >> index 035d034f6d..bc2171e3d4 100644
> > >> --- a/hw/cxl/cxl-device-utils.c
> > >> +++ b/hw/cxl/cxl-device-utils.c
> > >> @@ -354,8 +354,6 @@ static void device_reg_init_common(CXLDeviceState 
> > >> *cxl_dstate)
> > >>   
> > >>   static void mailbox_reg_init_common(CXLDeviceState *cxl_dstate)
> > >>   {
> > >> -    const uint8_t msi_n = 9;
> > >> -
> > >>       /* 2048 payload size */
> > >>       ARRAY_FIELD_DP32(cxl_dstate->mbox_reg_state32, CXL_DEV_MAILBOX_CAP,
> > >>                        PAYLOAD_SIZE, CXL_MAILBOX_PAYLOAD_SHIFT);
> > >> @@ -364,8 +362,8 @@ static void mailbox_reg_init_common(CXLDeviceState 
> > >> *cxl_dstate)
> > >>       ARRAY_FIELD_DP32(cxl_dstate->mbox_reg_state32, CXL_DEV_MAILBOX_CAP,
> > >>                        BG_INT_CAP, 1);
> > >>       ARRAY_FIELD_DP32(cxl_dstate->mbox_reg_state32, CXL_DEV_MAILBOX_CAP,
> > >> -                     MSI_N, msi_n);
> > >> -    cxl_dstate->mbox_msi_n = msi_n;
> > >> +                     MSI_N, CXL_MSIX_MBOX);    
> > > 
> > > Should be passed in from the type 3 specific call so add a parameter to 
> > > this
> > > function and pass this from cxl_device_register_init_t3.
> > > Even better pass it into there from ct3d_reset()
> > >     
> > 
> > At a glance, `ct3d_reset()` has the following prototype: `typedef void 
> > (*DeviceReset)(DeviceState *dev)`,
> > which is inherited from the QEMU device framework. Consequently, it is hard 
> > to extend `ct3d_reset()`
> > to include an additional parameter.  
> 
> I wasn't clear in that statement.   ct3d_reset() calls 
> cxl_device_register_init_t3(). Change
> the signature of cxl_device_register_init_t3() so that takes the appropriate 
> MSI index.
> that puts the source of the information in the cxl_type3.c code, not the core 
> support in hw / CXL.
> No change to ct3d_reset() signature needed.
> 
> > 
> > 
> >   
> > > Will potentially be a different number for the switch CCI passed in from
> > > the call of cxl_device_register_init_swcci() in switch-mailbox-cci.c    
> > 
> > It sounds reasonable, offering a more flexible design for the future.
> > 
> > Currently, mailbox_reg_init_common() will be called from type3 device and 
> > swcci,
> > however, I didn't see any where the swcci itself have setup the msi/msix at 
> > all.
> > 
> > Is this expected, feel free to let me know if I'm missing something.  
> Currently that is in cxl_device_register_init_swcci() via
> mailbox_reg_init_common()
> Suggestion is to like the above type 3 case move it into the specific driver
> by having cxl_device_register_init_swcci() take it as a parameter and
> pass that in from cswmbcci_reset() in switch-mailbox-cci.c

Ah.  I now see what you mean - no msix bar setup.
I guess the switch cci support has accidentally been falling back to polled 
mode.  
That needs fixing but is a separate issue.

Jonathan

> 
> 
> >   
> > > 
> > >     
> > >> +    cxl_dstate->mbox_msi_n = CXL_MSIX_MBOX;
> > >>       ARRAY_FIELD_DP32(cxl_dstate->mbox_reg_state32, CXL_DEV_MAILBOX_CAP,
> > >>                        MBOX_READY_TIME, 0); /* Not reported */
> > >>       ARRAY_FIELD_DP32(cxl_dstate->mbox_reg_state32, CXL_DEV_MAILBOX_CAP,
> > >> diff --git a/hw/mem/cxl_type3.c b/hw/mem/cxl_type3.c
> > >> index 5cf754b38f..f2f060ed9e 100644
> > >> --- a/hw/mem/cxl_type3.c
> > >> +++ b/hw/mem/cxl_type3.c
> > >> @@ -843,7 +843,6 @@ static void ct3_realize(PCIDevice *pci_dev, Error 
> > >> **errp)
> > >>       ComponentRegisters *regs = &cxl_cstate->crb;
> > >>       MemoryRegion *mr = &regs->component_registers;
> > >>       uint8_t *pci_conf = pci_dev->config;
> > >> -    unsigned short msix_num = 6;
> > >>       int i, rc;
> > >>       uint16_t count;
> > >>   
> > >> @@ -884,16 +883,17 @@ static void ct3_realize(PCIDevice *pci_dev, Error 
> > >> **errp)
> > >>                        &ct3d->cxl_dstate.device_registers);
> > >>   
> > >>       /* MSI(-X) Initialization */
> > >> -    rc = msix_init_exclusive_bar(pci_dev, msix_num, 4, NULL);
> > >> +    rc = msix_init_exclusive_bar(pci_dev, CXL_MSIX_MAX, 4, NULL);
> > >>       if (rc) {
> > >>           goto err_address_space_free;
> > >>       }
> > >> -    for (i = 0; i < msix_num; i++) {
> > >> +    for (i = 0; i < CXL_MSIX_MAX; i++) {
> > >>           msix_vector_use(pci_dev, i);
> > >>       }
> > >>   
> > >>       /* DOE Initialization */
> > >> -    pcie_doe_init(pci_dev, &ct3d->doe_cdat, 0x190, doe_cdat_prot, true, 
> > >> 0);
> > >> +    pcie_doe_init(pci_dev, &ct3d->doe_cdat, 0x190, doe_cdat_prot, true,
> > >> +                  CXL_MSIX_PCIE_DOE);
> > >>   
> > >>       cxl_cstate->cdat.build_cdat_table = ct3_build_cdat_table;
> > >>       cxl_cstate->cdat.free_cdat_table = ct3_free_cdat_table;
> > >> @@ -908,7 +908,7 @@ static void ct3_realize(PCIDevice *pci_dev, Error 
> > >> **errp)
> > >>       if (rc) {
> > >>           goto err_release_cdat;
> > >>       }
> > >> -    cxl_event_init(&ct3d->cxl_dstate, 2);
> > >> +    cxl_event_init(&ct3d->cxl_dstate, CXL_MSIX_EVENT_START);
> > >>   
> > >>       /* Set default value for patrol scrub attributes */
> > >>       ct3d->patrol_scrub_attrs.scrub_cycle_cap =
> > >> diff --git a/include/hw/cxl/cxl_device.h b/include/hw/cxl/cxl_device.h
> > >> index 561b375dc8..3f89b041ce 100644
> > >> --- a/include/hw/cxl/cxl_device.h
> > >> +++ b/include/hw/cxl/cxl_device.h
> > >> @@ -133,6 +133,13 @@ typedef enum {
> > >>       CXL_MBOX_MAX = 0x20
> > >>   } CXLRetCode;
> > >>   
> > >> +enum {    
> > > 
> > > Maybe worth naming these to be type3 specific.    
> > 
> > It sounds good to me.
> > 
> >   
> > >     
> > >> +    CXL_MSIX_PCIE_DOE = 0,    
> > > Name it to include that this is specifically the DOE for the table access 
> > > protocol.
> > > 
> > >     CXL_MSIX_PCIE_DOE_TABLE_ACCESS    
> > 
> > make sense.
> > 
> >   
> > > 
> > > 
> > > This should be private to cxl_type3.c which should be possible by passing
> > > it to a few more calls from there.
> > >     
> > 
> > If we make the entire enumeration `cxl_type3` private, it appears 
> > unnecessary to pass it through several more calls.  
> 
> Key is to make sure it is in cxl_type3.c  For that to work you need to pass 
> it as
> described above so that we can set the relevant values in various registers
> configured by the hw/cxl/* code.
> 
> > 
> > How about this
> > 
> > +/* type3 device private */
> > +enum CXL_T3_MSIX_VECTOR {
> > +    CXL_T3_MSIX_PCIE_DOE_TABLE_ACCESS = 0,  
> 
> What is on 1?  We don't need to maintain backwards compatibility (migration
> is broken in lots of other ways and not high on priority list to fix) so 
> should
> be no need to skip it. If you really want to add a reserved entry and we will
> fill it in later.
> 
> > +    CXL_T3_MSIX_EVENT_START = 2,
> > +    CXL_T3_MSIX_MBOX = CXL_T3_MSIX_EVENT_START + CXL_EVENT_TYPE_MAX,
> > +    CXL_T3_MSIX_VECTOR_NR
> > +};
> > +
> >   
> Hope that helps,
> 
> Jonathan
> 
> > 
> > Thanks
> > Zhijian
> >   
> > >> +    CXL_MSIX_EVENT_START = 2,
> > >> +    CXL_MSIX_MBOX = CXL_MSIX_EVENT_START + CXL_EVENT_TYPE_MAX,
> > >> +    CXL_MSIX_MAX
> > >> +};
> > >> +
> > >>   typedef struct CXLCCI CXLCCI;
> > >>   typedef struct cxl_device_state CXLDeviceState;
> > >>   struct cxl_cmd;    
> > >    
> 
> 
> 




reply via email to

[Prev in Thread] Current Thread [Next in Thread]