
From: Dan Williams
Subject: RE: Enabling internal errors for VH CXL devices: [was: Re: Questions about CXL RAS injection test in qemu]
Date: Wed, 6 Mar 2024 09:16:07 -0800

[ add Li Ming ]

Jonathan Cameron wrote:
[..]
> Robert / Terry, I tracked down the patch where you enabled this for RCHs and
> there was some discussion on walking out on VH as well to enable this, but
> seems it never happened. Can you remember why?  Just kicked back for a future
> occasion?
> 

Li Ming has the patch below waiting in the wings. Li Ming, this patch is
timely for this discussion, care to send out the full series? I expect it
needs to be an RFC given concerns with integrating with the pending port
switch error handling work.

-- 8< --
From: Li Ming <ming4.li@intel.com>
Subject: [PATCH RFC v3 3/6] PCI/AER: Enable RCEC to report internal error for CXL root port
Date: Thu, 1 Feb 2024 05:58:08 +0000

Per CXL r3.1 section 12.2.2, an RCEC may log the CXL.cachemem protocol
errors detected by a CXL root port as PCI_ERR_UNC_INTN or
PCI_ERR_COR_INTERNAL in its AER Capability. Unmask PCI_ERR_UNC_INTN and
PCI_ERR_COR_INTERNAL for that case.

Signed-off-by: Li Ming <ming4.li@intel.com>
---
 drivers/pci/pcie/aer.c | 25 ++++++++++++++++++-------
 1 file changed, 18 insertions(+), 7 deletions(-)

diff --git a/drivers/pci/pcie/aer.c b/drivers/pci/pcie/aer.c
index 42a3bd35a3e1..ef8fd77cb920 100644
--- a/drivers/pci/pcie/aer.c
+++ b/drivers/pci/pcie/aer.c
@@ -985,7 +985,7 @@ static bool cxl_error_is_native(struct pci_dev *dev)
 {
        struct pci_host_bridge *host = pci_find_host_bridge(dev->bus);
 
-       return (pcie_ports_native || host->native_aer);
+       return (pcie_ports_native || host->native_aer) && host->is_cxl;
 }
 
 static bool is_internal_error(struct aer_err_info *info)
@@ -1041,8 +1041,14 @@ static int handles_cxl_error_iter(struct pci_dev *dev, void *data)
 {
        bool *handles_cxl = data;
 
-       if (!*handles_cxl)
-               *handles_cxl = is_cxl_mem_dev(dev) && cxl_error_is_native(dev);
+       if (!*handles_cxl) {
+               if (pci_pcie_type(dev) == PCI_EXP_TYPE_RC_END &&
+                   is_cxl_mem_dev(dev) && cxl_error_is_native(dev))
+                       *handles_cxl = true;
+               if (pci_pcie_type(dev) == PCI_EXP_TYPE_ROOT_PORT &&
+                   cxl_error_is_native(dev))
+                       *handles_cxl = true;
+       }
 
        /* Non-zero terminates iteration */
        return *handles_cxl;
@@ -1054,13 +1060,18 @@ static bool handles_cxl_errors(struct pci_dev *rcec)
 
        if (pci_pcie_type(rcec) == PCI_EXP_TYPE_RC_EC &&
            pcie_aer_is_native(rcec))
-               pcie_walk_rcec(rcec, handles_cxl_error_iter, &handles_cxl);
+               pcie_walk_rcec_all(rcec, handles_cxl_error_iter, &handles_cxl);
 
        return handles_cxl;
 }
 
-static void cxl_rch_enable_rcec(struct pci_dev *rcec)
+static void cxl_enable_rcec(struct pci_dev *rcec)
 {
+       /*
+        * Enable RCEC's internal error report for two cases:
+        * 1. RCiEP detected CXL.cachemem protocol errors
+        * 2. CXL root port detected CXL.cachemem protocol errors.
+        */
        if (!handles_cxl_errors(rcec))
                return;
 
@@ -1069,7 +1080,7 @@ static void cxl_rch_enable_rcec(struct pci_dev *rcec)
 }
 
 #else
-static inline void cxl_rch_enable_rcec(struct pci_dev *dev) { }
+static inline void cxl_enable_rcec(struct pci_dev *dev) { }
 static inline void cxl_rch_handle_error(struct pci_dev *dev,
                                        struct aer_err_info *info) { }
 #endif
@@ -1494,7 +1505,7 @@ static int aer_probe(struct pcie_device *dev)
                return status;
        }
 
-       cxl_rch_enable_rcec(port);
+       cxl_enable_rcec(port);
        aer_enable_rootport(rpc);
        pci_info(port, "enabled with IRQ %d\n", dev->irq);
        return 0;
-- 
2.40.1
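
For anyone following along without a kernel tree to hand, the per-device
decision that handles_cxl_error_iter() makes after this patch can be modeled
in plain userspace C. This is only a rough sketch under stated assumptions:
struct model_dev and the model_*() helpers are hypothetical stand-ins for
struct pci_dev, struct pci_host_bridge and the kernel helpers, and "unmask"
is assumed to mean clearing the corresponding bits in the AER mask registers.
The numeric constants are the ones from include/uapi/linux/pci_regs.h.

```c
#include <stdbool.h>
#include <stdint.h>

/* Values as defined in include/uapi/linux/pci_regs.h */
#define PCI_EXP_TYPE_ROOT_PORT 0x4
#define PCI_EXP_TYPE_RC_END    0x9
#define PCI_ERR_UNC_INTN       0x00400000u /* Uncorrectable Internal Error */
#define PCI_ERR_COR_INTERNAL   0x00004000u /* Corrected Internal Error */

/*
 * Hypothetical stand-in for the state the patch reads from struct pci_dev
 * and its host bridge. Not a kernel structure.
 */
struct model_dev {
	int  pcie_type;        /* what pci_pcie_type() would return */
	bool is_cxl_mem;       /* what is_cxl_mem_dev() would return */
	bool host_native_aer;  /* host->native_aer */
	bool host_is_cxl;      /* host->is_cxl */
};

/* Models cxl_error_is_native() with the added host->is_cxl condition. */
static bool model_error_is_native(const struct model_dev *d, bool ports_native)
{
	return (ports_native || d->host_native_aer) && d->host_is_cxl;
}

/* Models the two cases accepted by handles_cxl_error_iter() in the patch. */
static bool model_handles_cxl(const struct model_dev *d, bool ports_native)
{
	if (!model_error_is_native(d, ports_native))
		return false;

	/* Case 1: RCiEP that is a CXL memory device */
	if (d->pcie_type == PCI_EXP_TYPE_RC_END)
		return d->is_cxl_mem;

	/* Case 2: root port below a CXL host bridge */
	return d->pcie_type == PCI_EXP_TYPE_ROOT_PORT;
}

/* Assumed meaning of "unmask": clear the given bits in a mask register value. */
static uint32_t model_unmask(uint32_t mask, uint32_t bits)
{
	return mask & ~bits;
}
```

In this model a root port qualifies only when its host bridge is both
native-AER and CXL, which is the behavioral change the host->is_cxl check
introduces; an RCiEP additionally has to be a CXL memory device.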
