[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Re: [Qemu-arm] [Qemu-devel] [PATCH v4 8/8] hw/arm/boot: Expose the PC-DI
From: |
Shameerali Kolothum Thodi |
Subject: |
Re: [Qemu-arm] [Qemu-devel] [PATCH v4 8/8] hw/arm/boot: Expose the PC-DIMM nodes in the DT |
Date: |
Wed, 8 May 2019 10:30:34 +0000 |
Hi Laszlo,
> -----Original Message-----
> From: Laszlo Ersek [mailto:address@hidden
> Sent: 03 May 2019 15:14
> To: Shameerali Kolothum Thodi <address@hidden>;
> address@hidden; address@hidden; address@hidden;
> address@hidden
> Cc: address@hidden; address@hidden;
> address@hidden; Linuxarm <address@hidden>;
> address@hidden; address@hidden; xuwei (O)
> <address@hidden>
> Subject: Re: [Qemu-devel] [PATCH v4 8/8] hw/arm/boot: Expose the PC-DIMM
> nodes in the DT
>
> Hi Shameer,
>
> On 05/03/19 15:35, Shameerali Kolothum Thodi wrote:
> >
> >
> >> -----Original Message-----
> >> From: Linuxarm [mailto:address@hidden On Behalf Of
> >> Shameerali Kolothum Thodi
> >> Sent: 10 April 2019 09:49
> >> To: Laszlo Ersek <address@hidden>; address@hidden;
> >> address@hidden; address@hidden; address@hidden
> >> Cc: address@hidden; address@hidden;
> >> address@hidden; Linuxarm <address@hidden>;
> >> address@hidden; address@hidden; xuwei (O)
> >> <address@hidden>
> >> Subject: RE: [PATCH v4 8/8] hw/arm/boot: Expose the PC-DIMM nodes in
> >> the DT
> >>
> >>
> >>> -----Original Message-----
> >>> From: Laszlo Ersek [mailto:address@hidden
> >>> Sent: 09 April 2019 16:09
> >>> To: Shameerali Kolothum Thodi
> >>> <address@hidden>;
> >>> address@hidden; address@hidden;
> address@hidden;
> >>> address@hidden
> >>> Cc: address@hidden; address@hidden;
> >>> address@hidden; address@hidden; xuwei (O)
> >>> <address@hidden>; address@hidden; Linuxarm
> >>> <address@hidden>
> >>> Subject: Re: [PATCH v4 8/8] hw/arm/boot: Expose the PC-DIMM nodes in
> >>> the DT
> >>>
> >>> On 04/09/19 12:29, Shameer Kolothum wrote:
> >>>> This patch adds memory nodes corresponding to PC-DIMM regions.
> >>>> This will enable support for cold plugged device memory for Guests
> >>>> with DT boot.
> >>>>
> >>>> Signed-off-by: Shameer Kolothum
> >> <address@hidden>
> >>>> Signed-off-by: Eric Auger <address@hidden>
> >>>> ---
> >>>> hw/arm/boot.c | 42
> ++++++++++++++++++++++++++++++++++++++++++
> >>>> 1 file changed, 42 insertions(+)
> >>>>
> >>>> diff --git a/hw/arm/boot.c b/hw/arm/boot.c index 8c840ba..150e1ed
> >>>> 100644
> >>>> --- a/hw/arm/boot.c
> >>>> +++ b/hw/arm/boot.c
> >>>> @@ -19,6 +19,7 @@
> >>>> #include "sysemu/numa.h"
> >>>> #include "hw/boards.h"
> >>>> #include "hw/loader.h"
> >>>> +#include "hw/mem/memory-device.h"
> >>>> #include "elf.h"
> >>>> #include "sysemu/device_tree.h"
> >>>> #include "qemu/config-file.h"
> >>>> @@ -538,6 +539,41 @@ static void fdt_add_psci_node(void *fdt)
> >>>> qemu_fdt_setprop_cell(fdt, "/psci", "migrate", migrate_fn); }
> >>>>
> >>>> +static int fdt_add_hotpluggable_memory_nodes(void *fdt,
> >>>> + uint32_t acells,
> >>> uint32_t scells) {
> >>>> + MemoryDeviceInfoList *info, *info_list =
> qmp_memory_device_list();
> >>>> + MemoryDeviceInfo *mi;
> >>>> + int ret = 0;
> >>>> +
> >>>> + for (info = info_list; info != NULL; info = info->next) {
> >>>> + mi = info->value;
> >>>> + switch (mi->type) {
> >>>> + case MEMORY_DEVICE_INFO_KIND_DIMM:
> >>>> + {
> >>>> + PCDIMMDeviceInfo *di = mi->u.dimm.data;
> >>>> +
> >>>> + ret = fdt_add_memory_node(fdt, acells, di->addr, scells,
> >>>> + di->size, di->node, true);
> >>>> + if (ret) {
> >>>> + fprintf(stderr,
> >>>> + "couldn't add PCDIMM
> >> /address@hidden"PRIx64"
> >>> node\n",
> >>>> + di->addr);
> >>>> + goto out;
> >>>> + }
> >>>> + break;
> >>>> + }
> >>>> + default:
> >>>> + fprintf(stderr, "%s memory nodes are not yet
> supported\n",
> >>>> + MemoryDeviceInfoKind_str(mi->type));
> >>>> + ret = -ENOENT;
> >>>> + goto out;
> >>>> + }
> >>>> + }
> >>>> +out:
> >>>> + qapi_free_MemoryDeviceInfoList(info_list);
> >>>> + return ret;
> >>>> +}
> >>>> +
> >>>> int arm_load_dtb(hwaddr addr, const struct arm_boot_info *binfo,
> >>>> hwaddr addr_limit, AddressSpace *as) { @@
> -637,6
> >>>> +673,12 @@ int arm_load_dtb(hwaddr addr, const struct
> >>> arm_boot_info *binfo,
> >>>> }
> >>>> }
> >>>>
> >>>> + rc = fdt_add_hotpluggable_memory_nodes(fdt, acells, scells);
> >>>> + if (rc < 0) {
> >>>> + fprintf(stderr, "couldn't add hotpluggable memory nodes\n");
> >>>> + goto fail;
> >>>> + }
> >>>> +
> >>>> rc = fdt_path_offset(fdt, "/chosen");
> >>>> if (rc < 0) {
> >>>> qemu_fdt_add_subnode(fdt, "/chosen");
> >>>>
> >>>
> >>>
> >>> Given patches #7 and #8, as I understand them, the firmware cannot
> >>> distinguish hotpluggable & present, from hotpluggable & absent. The
> >> firmware
> >>> can only skip both hotpluggable cases. That's fine in that the
> >>> firmware will
> >> hog
> >>> neither type -- but is that OK for the OS as well, for both ACPI
> >>> boot and DT boot?
> >>
> >> Right. This only handles the hotpluggable-and-present condition.
> >>
> >>> Consider in particular the "hotpluggable & present, ACPI boot" case.
> >> Assuming
> >>> we modify the firmware to skip "hotpluggable" altogether, the UEFI
> >>> memmap will not include the range despite it being present at boot.
> >>> Presumably, ACPI will refer to the range somehow, however. Will that not
> confuse the OS?
> >>
> >> From my testing so far, without patches #7 and #8(ie, no UEFI memmap
> >> entry), ACPI boots fine. I think ACPI only relies on aml and SRAT.
> >>
> >>> When Igor raised this earlier, I suggested that
> >>> hotpluggable-and-present should be added by the firmware, but also
> >>> allocated immediately, as EfiBootServicesData type memory. This will
> >>> prevent other drivers in the firmware from allocating AcpiNVS or
> >>> Reserved chunks from the same memory range, the UEFI memmap will
> >>> contain the range as EfiBootServicesData, and then the OS can release
> that allocation in one go early during boot.
> >>
> >> Ok. Agree that hotpluggable-and-present case it may make sense to
> >> make use of that memory rather than just hiding it altogether.
> >>
> >> On another note, Does hotpluggable-and-absent case make any valid use
> >> case for a DT boot? I am not sure how we can hot-add memory in the
> >> case of DT boot later.
> >> I have not verified the sysfs probe interface yet and there are
> >> discussions of dropping that interface altogether.
> >>
> >>> But this really has to be clarified from the Linux kernel's
> >>> expectations. Please formalize all of the following cases:
> >>
> >> Sure. I will wait for suggestions here and work on it.
> >
> > To continue the discussion on this, this is my proposal,
> >
> > [...]
>
> I didn't miss your last update, on 10 April. The reason I didn't respond then
> was
> that, the table that you create here, needs to be approved by Linux
> developers.
> In other words, the table should summarize how Linux expects DT/ACPI to look,
> for the given use cases. It's not something that I can comment on. The
> requirements come from Linux, and we should attempt (in QEMU and the fw)
> to satisfy them.
>
> If those use cases / requirements haven't been *designed* in Linux, in the
> first
> place, then the discussion belongs even more on a kernel development list. (I
> really can't say what Linux *should* expect, and even if I had input on that,
> discussing it *just* on qemu-devel would be
> futile.)
>
> I mean, considering ACPI and the UEFI memmap at least, can we take
> examples from the physical world (I guess x86 too)? What does Linux (and
> maybe Windows) expect wrt. hotpluggable memory areas, in ACPI and in the
> UEFI memmap?
>
> I find it hard to believe that these are such use cases that we have to
> *invent* now. It seems more likely that OSes already handle these use cases,
> they have expectations, and we should *collect* them.
Right. I have just sent out a mail seeking clarification on this to the wider
Linux
community (CCd all of you as well) here,
http://lists.infradead.org/pipermail/linux-arm-kernel/2019-May/651360.html
Hope, we can come to a conclusion soon.
Thanks,
Shameer