Re: qemu-system-ppc64 abort()s with pcie bridges
From: Greg Kurz
Subject: Re: qemu-system-ppc64 abort()s with pcie bridges
Date: Thu, 9 Jul 2020 11:28:01 +0200

On Wed, 8 Jul 2020 11:57:03 +0200
Greg Kurz <groug@kaod.org> wrote:
> On Wed, 8 Jul 2020 10:03:47 +0200
> Thomas Huth <thuth@redhat.com> wrote:
>
> >
> > Hi,
> >
> > qemu-system-ppc64 currently abort()s when it is started with a pcie
> > bridge device:
> >
> > $ qemu-system-ppc64 -M pseries-5.1 -device pcie-pci-bridge
> > Unexpected error in object_property_find() at qom/object.c:1240:
> > qemu-system-ppc64: -device pcie-pci-bridge: Property '.chassis_nr' not found
> > Aborted (core dumped)
> >
> > or:
> >
> > $ qemu-system-ppc64 -M pseries -device dec-21154-p2p-bridge
> > Unexpected error in object_property_find() at qom/object.c:1240:
> > qemu-system-ppc64: -device dec-21154-p2p-bridge: Property '.chassis_nr'
> > not found
> > Aborted (core dumped)
> >
> > That's kind of ugly, and it shows up as error when running
> > scripts/device-crash-test. Is there an easy way to avoid the abort() and
> > fail more gracefully here?
> >
>
> And even worse, this can tear down a running guest with hotplug :\
>
> (qemu) device_add pcie-pci-bridge
> Unexpected error in object_property_find() at
> /home/greg/Work/qemu/qemu-ppc/qom/object.c:1240:
> Property '.chassis_nr' not found
> Aborted (core dumped)
>
> This is caused by recent commit:
>
> commit 7ef1553dac8ef8dbe547b58d7420461a16be0eeb
> Author: Markus Armbruster <armbru@redhat.com>
> Date: Tue May 5 17:29:25 2020 +0200
>
> spapr_pci: Drop some dead error handling
>
> chassis_from_bus() uses object_property_get_uint() to get property
> "chassis_nr" of the bridge device. Failure would be a programming
> error. Pass &error_abort, and simplify its callers.
>
> Cc: David Gibson <david@gibson.dropbear.id.au>
> Cc: qemu-ppc@nongnu.org
> Signed-off-by: Markus Armbruster <armbru@redhat.com>
> Acked-by: David Gibson <david@gibson.dropbear.id.au>
> Reviewed-by: Greg Kurz <groug@kaod.org>
> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
> Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
> Message-Id: <20200505152926.18877-18-armbru@redhat.com>
>
> Before that, we would simply print the "chassis_nr not found" error
> and, in the case of a cold-plugged device, exit.
>
> The root cause is that the sPAPR PCI code assumes that a PCI bridge
> has a "chassis_nr" property, i.e. it is a standard PCI bridge. Other
> PCI bridge types don't have that. Not sure yet why this information
> is required, I'll check LoPAPR.
>

More on this side: each slot of a PCI bridge is associated with a DRC (a
PAPR thingy to handle hot plug/unplug). Each DRC must have a unique
identifier system-wide. We used to use the bus number to compute
the DRC id but it was broken, so we now _hijack_ "chassis_nr" as an
alternative since this commit:

commit 05929a6c5dfe1028ef66250b7bbf11939f8e77cd
Author: David Gibson <david@gibson.dropbear.id.au>
Date:   Wed Apr 10 11:49:28 2019 +1000

    spapr: Don't use bus number for building DRC ids

This means that we only support the standard pci-bridge device,
and this relies on the availability of "chassis_nr". Failure
to find this property is then not a programming error, but
an expected case where we want to fail gracefully (i.e. revert
Markus's commit mentioned above).
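
To illustrate the distinction between a programming error and an expected
failure, here is a minimal standalone sketch. Everything in it
(struct toy_bridge, toy_chassis_from_bridge(), the error text) is a
hypothetical stand-in and not QEMU's QOM or Error API; it only shows a
lookup that reports a missing property to its caller instead of aborting.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical stand-in for a bridge device; not a QEMU type. */
struct toy_bridge {
    const char *type_name;
    bool has_chassis_nr;   /* assumed: only the standard pci-bridge has it */
    uint8_t chassis_nr;
};

/*
 * Returns true and stores the chassis number on success.
 * A missing property is user-triggerable, so it is reported through
 * errbuf and a false return value; the caller can then reject the
 * device without taking the whole process down.
 */
static bool toy_chassis_from_bridge(const struct toy_bridge *br,
                                    uint8_t *chassis,
                                    char *errbuf, size_t errlen)
{
    /* NULL arguments would be a genuine programming error. */
    assert(br != NULL && chassis != NULL && errbuf != NULL);

    if (!br->has_chassis_nr) {
        snprintf(errbuf, errlen,
                 "%s: PCI bridges without 'chassis_nr' are not supported",
                 br->type_name);
        return false;
    }
    *chassis = br->chassis_nr;
    return true;
}
```

With this shape, a hotplug request for an unsupported bridge type would
fail with a message and leave the running guest alone.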

While reading code I realized that we have another problem: the
realization of the pci-bridge device does fail if "chassis_nr" is
zero, but I failed to find a uniqueness check. And we get:

$ qemu-system-ppc64 -device pci-bridge,chassis_nr=1 -device pci-bridge,chassis_nr=1
Unexpected error in object_property_try_add() at qom/object.c:1167:
qemu-system-ppc64: -device pci-bridge,chassis_nr=1: attempt to add duplicate property '40000100' to object (type 'container')
Aborted (core dumped)
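
The duplicate property name '40000100' is consistent with a DRC index
that packs a connector type, the PHB index, the chassis number and the
devfn into one 32-bit value. The bit layout and the type code below are
assumptions made for illustration (chosen so that PHB 0, chassis 1,
devfn 0 yields 0x40000100), not something quoted from the sPAPR code,
and all names are hypothetical. The sketch shows why two bridges with
the same "chassis_nr" end up with the same id:

```c
#include <assert.h>
#include <stdint.h>

/* Assumed values, picked only to reproduce the '40000100' seen above. */
#define TOY_DRC_TYPE_PCI   4u
#define TOY_DRC_TYPE_SHIFT 28
#define TOY_DRC_ID_MASK    0x0fffffffu

/* Hypothetical helper: build a DRC index for one slot behind a bridge. */
static uint32_t toy_drc_index(uint32_t phb_index, uint8_t chassis,
                              uint8_t devfn)
{
    uint32_t id = (phb_index << 16) | ((uint32_t)chassis << 8) | devfn;

    return (TOY_DRC_TYPE_PCI << TOY_DRC_TYPE_SHIFT) | (id & TOY_DRC_ID_MASK);
}
```

Under these assumptions, nothing in the id distinguishes the two bridges:
the first slot of each one maps to the same value, and registering the
second one fails.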

It is very confusing to see that we state that "chassis_nr" is unique
several times in slotid_cap_init() but it is never enforced anywhere.

    if (!chassis) {
        error_setg(errp, "Bridge chassis not specified. Each bridge is required"
                   " to be assigned a unique chassis id > 0.");
        return -EINVAL;
    }

or

    /* We make each chassis unique, this way each bridge is First in Chassis */

Michael, Marcel or anyone with PCI knowledge,
can you shed some light on the semantics of "chassis_nr"?

> In the meantime, since we're in soft freeze, I guess we should
> revert Markus's patch and add a big fat comment to explain
> what's going on and maybe change the error message to something
> more informative, eg. "PCIE-to-PCI bridges are not supported".
>
> Thoughts ?
>
> > Thomas
> >
>