qemu-arm
Re: [PATCH v5 04/11] hw/arm: Add NPCM730 and NPCM750 SoC models


From: Philippe Mathieu-Daudé
Subject: Re: [PATCH v5 04/11] hw/arm: Add NPCM730 and NPCM750 SoC models
Date: Tue, 14 Jul 2020 19:11:04 +0200
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:68.0) Gecko/20100101 Thunderbird/68.5.0

On 7/14/20 6:01 PM, Markus Armbruster wrote:
> Philippe Mathieu-Daudé <f4bug@amsat.org> writes:
> 
>> +Markus
>>
>> On 7/14/20 2:44 AM, Havard Skinnemoen wrote:
>>> On Mon, Jul 13, 2020 at 8:02 AM Cédric Le Goater <clg@kaod.org> wrote:
>>>>
>>>> On 7/9/20 2:36 AM, Havard Skinnemoen wrote:
>>>>> The Nuvoton NPCM7xx SoC family is used to implement Baseboard
>>>>> Management Controllers in servers. While the family includes four SoCs,
>>>>> this patch implements limited support for two of them: NPCM730 (targeted
>>>>> for Data Center applications) and NPCM750 (targeted for Enterprise
>>>>> applications).
>>>>>
>>>>> This patch includes little more than the bare minimum needed to boot a
>>>>> Linux kernel built with NPCM7xx support in direct-kernel mode:
>>>>>
>>>>>   - Two Cortex-A9 CPU cores with built-in peripherals.
>>>>>   - Global Configuration Registers.
>>>>>   - Clock Management.
>>>>>   - 3 Timer Modules with 5 timers each.
>>>>>   - 4 serial ports.
>>>>>
>>>>> The chips themselves have a lot more features, some of which will be
>>>>> added to the model at a later stage.
>>>>>
>>>>> Reviewed-by: Tyrone Ting <kfting@nuvoton.com>
>>>>> Reviewed-by: Joel Stanley <joel@jms.id.au>
>>>>> Signed-off-by: Havard Skinnemoen <hskinnemoen@google.com>
>>>>> ---
>> ...
>>
>>>>> +static void npcm7xx_realize(DeviceState *dev, Error **errp)
>>>>> +{
>>>>> +    NPCM7xxState *s = NPCM7XX(dev);
>>>>> +    NPCM7xxClass *nc = NPCM7XX_GET_CLASS(s);
>>>>> +    int i;
>>>>> +
>>>>> +    /* CPUs */
>>>>> +    for (i = 0; i < nc->num_cpus; i++) {
>>>>> +        object_property_set_int(OBJECT(&s->cpu[i]),
>>>>> +                                arm_cpu_mp_affinity(i, NPCM7XX_MAX_NUM_CPUS),
>>>>> +                                "mp-affinity", &error_abort);
>>>>> +        object_property_set_int(OBJECT(&s->cpu[i]), NPCM7XX_GIC_CPU_IF_ADDR,
>>>>> +                                "reset-cbar", &error_abort);
>>>>> +        object_property_set_bool(OBJECT(&s->cpu[i]), true,
>>>>> +                                 "reset-hivecs", &error_abort);
>>>>> +
>>>>> +        /* Disable security extensions. */
>>>>> +        object_property_set_bool(OBJECT(&s->cpu[i]), false, "has_el3",
>>>>> +                                 &error_abort);
>>>>> +
>>>>> +        qdev_realize(DEVICE(&s->cpu[i]), NULL, &error_abort);
>>>>
>>>> I would check the error:
>>>>
>>>>         if (!qdev_realize(DEVICE(&s->cpu[i]), NULL, errp)) {
>>>>             return;
>>>>         }
>>>>
>>>> same for the sysbus_realize() below.
>>>
>>> Hmm, I used to propagate these errors until Philippe told me not to
>>> (or at least that's how I understood it).
>>
>> That was before Markus's Error API simplifications were merged, when
>> you had to propagate after each call. Since this is a non-hot-pluggable
>> SoC, I suggested using &error_abort to simplify.
>>
>>> I'll be happy to do it
>>> either way (and the new API makes it really easy to propagate errors),
>>> but I worry that I don't fully understand when to propagate errors and
>>> when not to.
>>
>> Markus explained it on the mailing list recently (as I found the doc
>> not obvious). I can't find the thread. I suppose that once the work
>> resulting from the "Questionable aspects of QEMU Error's design"
>> discussion is merged, the documentation will be clarified.
> 
> The Error API evolved recently.  Please peruse the big comment in
> include/qapi/error.h.  If still unsure, don't hesitate to ask here.
> 
>> My rule of thumb so far is:
>> - programming error (can't happen) -> &error_abort
> 
> Correct.  Quote the big comment:
> 
>  * Call a function aborting on errors:
>  *     foo(arg, &error_abort);
>  * This is more concise and fails more nicely than
>  *     Error *err = NULL;
>  *     foo(arg, &err);
>  *     assert(!err); // don't do this
> 
>> - everything triggerable by the user or management layer (via QMP
>>   command) -> &error_fatal, as we can't risk losing the user's data;
>>   we need to shut down gracefully.
> 
> Quote the big comment:
> 
>  * Call a function treating errors as fatal:
>  *     foo(arg, &error_fatal);
>  * This is more concise than
>  *     Error *err = NULL;
>  *     foo(arg, &err);
>  *     if (err) { // don't do this
>  *         error_report_err(err);
>  *         exit(1);
>  *     }
> 
> Terminating the process is generally fine during initial startup,
> i.e. before the guest runs.
> 
> It's generally not fine once the guest runs.  Errors need to be handled
> more gracefully then.  A QMP command, for instance, should fail cleanly,
> propagating the error to the monitor core, which then sends it to the
> QMP client, and loops to process the next command.
> 
>>> It makes sense to me to propagate errors from *_realize() and
>>> error_abort on failure to set simple properties, but I'd like to know
>>> if Philippe is on board with that.
> 
> Realize methods must not use &error_fatal.  Instead, they should clean
> up and fail.
> 
> "Clean up" is the part we often neglect.  The big advantage of
> &error_fatal is that you don't have to bother :)
> 
> Questions?
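
For illustration, the three conventions discussed above (abort on a programming error, exit on a startup failure, fail cleanly at runtime) can be sketched in plain C. This is a toy model only: ToyError, toy_error_set(), toy_set_frequency() and toy_command() are hypothetical names invented for this sketch, not the types or functions from include/qapi/error.h.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>

/* Toy stand-in for an error object; NOT QEMU's Error type. */
typedef struct ToyError {
    const char *msg;
} ToyError;

/* Sentinel destinations, loosely modelled on &error_abort / &error_fatal. */
ToyError *toy_error_abort;
ToyError *toy_error_fatal;

/* Store an error in *errp, honouring the two sentinel destinations. */
void toy_error_set(ToyError **errp, const char *msg)
{
    if (errp == &toy_error_abort) {
        fprintf(stderr, "Unexpected error: %s\n", msg);
        abort();                    /* programming error: dump core */
    }
    if (errp == &toy_error_fatal) {
        fprintf(stderr, "%s\n", msg);
        exit(1);                    /* acceptable before the guest runs */
    }
    if (errp) {
        assert(*errp == NULL);      /* caller must pass an empty slot */
        ToyError *err = malloc(sizeof(*err));
        if (!err) {
            abort();
        }
        err->msg = msg;
        *errp = err;
    }
}

/* Hypothetical operation that can fail on bad user input. */
bool toy_set_frequency(int mhz, ToyError **errp)
{
    if (mhz <= 0) {
        toy_error_set(errp, "frequency must be positive");
        return false;
    }
    return true;
}

/*
 * Runtime path (think: a monitor command). It must fail cleanly, so it
 * hands the caller's errp straight through and unwinds on failure.
 */
bool toy_command(int mhz, ToyError **errp)
{
    if (!toy_set_frequency(mhz, errp)) {
        return false;
    }
    return true;
}
```

Calling toy_set_frequency(mhz, &toy_error_abort) models the "can't happen" case, passing &toy_error_fatal models a startup-time failure, and toy_command() models the path where the process has to keep running after an error.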

One on my side. So in this realize(), all &error_abort uses have
to be replaced by local_err + propagate ...:

static void npcm7xx_realize(DeviceState *dev, Error **errp)
{
    NPCM7xxState *s = NPCM7XX(dev);
    NPCM7xxClass *nc = NPCM7XX_GET_CLASS(s);
    int i;

    /* CPUs */
    for (i = 0; i < nc->num_cpus; i++) {
        object_property_set_int(OBJECT(&s->cpu[i]),
                                arm_cpu_mp_affinity(i, NPCM7XX_MAX_NUM_CPUS),
                                "mp-affinity", &error_abort);
        object_property_set_int(OBJECT(&s->cpu[i]), NPCM7XX_GIC_CPU_IF_ADDR,
                                "reset-cbar", &error_abort);
        object_property_set_bool(OBJECT(&s->cpu[i]), true,
                                 "reset-hivecs", &error_abort);

        /* Disable security extensions. */
        object_property_set_bool(OBJECT(&s->cpu[i]), false, "has_el3",
                                 &error_abort);

        qdev_realize(DEVICE(&s->cpu[i]), NULL, &error_abort);
    }
    [...]

... but the caller does:

static void quanta_gsj_init(MachineState *machine)
{
    NPCM7xxState *soc;

    soc = npcm7xx_create_soc(machine, QUANTA_GSJ_POWER_ON_STRAPS);
    npcm7xx_connect_dram(soc, machine->ram);
    qdev_realize(DEVICE(soc), NULL, &error_abort);
                                    ^^^^^^^^^^^^
    npcm7xx_load_kernel(machine, soc);
}

So we would be burdening the code with error handling that the caller
aborts on anyway...

My question: can you confirm it is worth propagating here?
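
To make the trade-off concrete, here is a minimal sketch of what "clean up and fail" costs in a realize-style function. It is a self-contained toy model under loud assumptions: ToyCpu, ToySoc, ToyError, toy_cpu_realize() and toy_soc_realize() are hypothetical names, and the error slot is simplified to a message pointer rather than QEMU's Error object.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_NUM_CPUS 2

/* Hypothetical stand-ins; none of these are QEMU types or functions. */
typedef struct ToyCpu {
    bool realized;
    bool broken;            /* test hook: force realize to fail */
} ToyCpu;

typedef struct ToySoc {
    ToyCpu cpu[TOY_NUM_CPUS];
} ToySoc;

/* Simplified error slot: a message pointer, not QEMU's Error. */
typedef const char *ToyError;

/* Realize one CPU; report failure through errp and the return value. */
bool toy_cpu_realize(ToyCpu *cpu, ToyError *errp)
{
    if (cpu->broken) {
        if (errp) {
            assert(*errp == NULL);  /* caller must pass an empty slot */
            *errp = "cpu failed to realize";
        }
        return false;
    }
    cpu->realized = true;
    return true;
}

/*
 * Propagating realize: on failure, undo the partial work already done,
 * then let the error travel up to the caller through errp.
 */
bool toy_soc_realize(ToySoc *soc, ToyError *errp)
{
    for (int i = 0; i < TOY_NUM_CPUS; i++) {
        if (!toy_cpu_realize(&soc->cpu[i], errp)) {
            while (--i >= 0) {
                soc->cpu[i].realized = false;   /* the "clean up" part */
            }
            return false;
        }
    }
    return true;
}
```

With this shape the SoC code no longer decides how severe a failure is; a board init function that cannot recover can still choose to abort, while any other caller gets a cleanly unwound object and an error to report.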


