From: Fan Ni
Subject: Re: [ISSUE] memdev cannot be enabled after reboot due to failed dvsec range check [QEMU setup]
Date: Wed, 15 Jan 2025 23:02:32 +0000

On Wed, Jan 15, 2025 at 01:06:24AM +0000, Zhijian Li (Fujitsu) wrote:
> Cced QEMU,
> 
> Hi Fan,
> 
> I recall we had a reboot issue [1] some months ago.
> I guess your issue was caused by some registers not being reset during reboot.
> 
> [1] https://lore.kernel.org/linux-cxl/20240409075846.85370-1-lizhijian@fujitsu.com/
> 
Hi Zhijian,
Thanks for the pointer. With the fix applied, the issue goes away.

Fan
> 
> On 15/01/2025 04:30, Fan Ni wrote:
> > Hi,
> > 
> > Recently, while testing CXL with a QEMU setup, I found that the memdev
> > cannot be enabled successfully after a reboot.
> > 
> > Here is the setup and the steps I have tried.
> > 
> > QEMU:
> > https://gitlab.com/qemu-project/qemu.git
> > branch: master
> > commit: 8032c78e556cd0baec111740a6c636863f9bd7c8
> > 
> > Kernel:
> > https://git.kernel.org/pub/scm/linux/kernel/git/cxl/cxl.git/
> > branch: next
> > 2f84d072bdcb7d6ec66cc4d0de9f37a3dc394cd2
> > 
> > Steps to reproduce the issue:
> > 1. Start the VM with a CXL pmem device attached directly to the RP.
> > 2. Load the CXL drivers (cxl_acpi, cxl_core, cxl_pci, cxl_port, cxl_mem, etc.).
> >    Everything works as expected: the memory is correctly enabled and shown
> >    by cxl list.
> > 3. Reboot the VM (run the reboot command inside the VM, no shutdown).
> > 4. Load the CXL drivers as in step 2. The CXL pmem is not correctly enabled.
> > 
> > dmesg shows some error as below:
> > -------------------------------
> > [   17.131729] cxl_core:cxl_hdm_decode_init:443: cxl_pci 0000:0d:00.0: DVSEC Range0 denied by platform
> > [   17.135267] cxl_pci 0000:0d:00.0: Range register decodes outside platform defined CXL ranges.
> > [   17.138428] cxl_core:cxl_bus_probe:2073: cxl_port endpoint2: probe: -6
> > [   17.141104] cxl_core:devm_cxl_add_port:936: cxl_mem mem0: endpoint2 added to port1
> > [   17.143703] cxl_mem mem0: endpoint2 failed probe
> > [   17.145324] cxl_core:cxl_bus_probe:2073: cxl_mem mem0: probe: -6
> > [   17.171416] cxl_core:cxl_detach_ep:1499: cxl_mem mem0: disconnect mem0 from port1
> > ------------------------------
> > Comparing steps 2 and 4 with debug info, we can see the following.
> > In step 2, on entry to cxl_hdm_decode_init():
> > 
> > (gdb) p *info
> > $2 = {mem_enabled = false, ranges = 0, port = 0xffff8881097eac00, 
> > dvsec_range = {{start = 0, end = 0}, {start = 0, end = 0}}}
> > 
> > The info struct comes from cxl_dvsec_rr_decode(). If the memory enable bit
> > is not set, that function returns early without reading the DVSEC range
> > registers, so ranges == 0.
> > This is what happened in step 2: no DVSEC ranges were passed to
> > cxl_hdm_decode_init() for checking.
> > 
> > When the HDM decoder is initialized in cxl_hdm_decode_init(), the memory
> > enable bit is set.
> > 
> > In step 4, after the reboot, the memory enable bit is still set, so the
> > DVSEC range registers are read from the device in cxl_dvsec_rr_decode().
> > So on entry to cxl_hdm_decode_init():
> > ------------------------------------
> > $2 = {mem_enabled = true, ranges = 1, port = 0xffff888103c77400, 
> > dvsec_range = {{start = 0, end = 536870911}, {start = 0, end = 0}}}
> > Breakpoint 2 at 0xffffffffc0657bbe: file drivers/cxl/core/pci.c, line 416.
> > ------------------------------------
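
The difference between the two boots can be modeled in a few lines of C. This is only a sketch of the behavior described above, not the kernel code: every name here (fake_dev, fake_info, fake_rr_decode, fake_hdm_decode_init, replay_two_boots) is hypothetical, and the model assumes the range register reads back base 0 and size 512 MiB, as the gdb output suggests.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of device state that survives a warm reboot. */
struct fake_dev {
    bool mem_enable;     /* models the DVSEC memory enable bit */
    uint64_t range_base; /* models the DVSEC Range0 base */
    uint64_t range_size; /* models the DVSEC Range0 size */
};

/* Hypothetical, trimmed-down version of the info struct printed by gdb. */
struct fake_info {
    bool mem_enabled;
    int ranges;
    uint64_t start;
    uint64_t end;
};

/* Models the early return: ranges are only read when mem_enable is set. */
static struct fake_info fake_rr_decode(const struct fake_dev *dev)
{
    struct fake_info info = { 0 };

    info.mem_enabled = dev->mem_enable;
    if (!info.mem_enabled)
        return info; /* ranges stays 0, nothing to validate later */

    info.ranges = 1;
    info.start = dev->range_base;
    info.end = dev->range_base + dev->range_size - 1;
    return info;
}

/* Models decoder init setting the memory enable bit as a side effect. */
static void fake_hdm_decode_init(struct fake_dev *dev)
{
    dev->mem_enable = true;
}

/* Replays step 2 and step 4 against the same, never-reset device. */
static void replay_two_boots(void)
{
    struct fake_dev dev = { false, 0, 512ULL << 20 };

    struct fake_info first = fake_rr_decode(&dev);  /* step 2 */
    assert(!first.mem_enabled && first.ranges == 0);

    fake_hdm_decode_init(&dev);                     /* bit gets set */

    struct fake_info second = fake_rr_decode(&dev); /* step 4, no reset */
    assert(second.mem_enabled && second.ranges == 1);
    assert(second.start == 0 && second.end == 536870911ULL);
}
```

In this model the second decode reports one range ending at 536870911 only because nothing clears mem_enable between the two calls, which is consistent with the register-reset explanation suggested in this thread.
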
> > This causes dvsec_range_allowed() to fail: the range read from the DVSEC
> > range registers starts at address zero, [0x0, 0x1fffffff] (512 MiB), which
> > does not match the HPA range stored in cxld->hpa_range.
> > 
> > ------------------------------------
> > Thread 1 hit Breakpoint 4, dvsec_range_allowed (dev=0xffff888108af9848,
> >      arg=0xffffc9000059f9b0) at drivers/cxl/core/pci.c:265
> > 265         if (!(cxld->flags & CXL_DECODER_F_RAM))
> > (gdb) b 268
> > Breakpoint 5 at 0xffffffffc0657d31: file drivers/cxl/core/pci.c, line 271.
> > (gdb) p /x cxld->hpa_range
> > $5 = {start = 0xa90000000, end = 0xb8fffffff}
> > (gdb) p /x *dev_range
> > $7 = {start = 0x0, end = 0x1fffffff}
> > (gdb)
> > ------------------------------------
> > The hpa_range is set when parsing the cfmws in __cxl_parse_cfmws.
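
The failing comparison can be reproduced outside the kernel. The sketch below is a simplified stand-in, not the code in drivers/cxl/core/pci.c: struct addr_range, range_within() and replay_reported_values() are hypothetical names, and it assumes the check amounts to "the platform window must fully contain the DVSEC range". The numbers are the ones printed by gdb above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for an inclusive [start, end] address range. */
struct addr_range {
    uint64_t start;
    uint64_t end;
};

/* True when 'inner' lies entirely inside 'outer' (both ends inclusive). */
static bool range_within(struct addr_range outer, struct addr_range inner)
{
    return outer.start <= inner.start && outer.end >= inner.end;
}

/* Replays the values from the gdb session in this thread. */
static void replay_reported_values(void)
{
    struct addr_range hpa = { 0xa90000000ULL, 0xb8fffffffULL }; /* cxld->hpa_range */
    struct addr_range dvsec = { 0x0ULL, 0x1fffffffULL };        /* *dev_range */

    /* The DVSEC range starts at 0, below the window, so it is rejected. */
    assert(!range_within(hpa, dvsec));

    /* Assumption: the same size rebased onto the window start would pass. */
    struct addr_range rebased = {
        hpa.start,
        hpa.start + (dvsec.end - dvsec.start),
    };
    assert(range_within(hpa, rebased));
}
```

The rebased case is purely illustrative: it shows that the size of the range is not the problem, only its base address.
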
> > 
> > Any thoughts?
> > 
> > Open question: do we need to update the DVSEC range registers after we parse
> > the CFMWS so that the two ranges above match?

-- 
Fan Ni (From gmail)


