qemu-arm
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH v7 13/13] tests/acceptance: console boot tests for quanta-gsj


From: Philippe Mathieu-Daudé
Subject: Re: [PATCH v7 13/13] tests/acceptance: console boot tests for quanta-gsj
Date: Sat, 22 Aug 2020 23:40:33 +0200
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:68.0) Gecko/20100101 Thunderbird/68.11.0

Hi Havard,

On 8/20/20 10:30 PM, Havard Skinnemoen wrote:
> On Thu, Aug 20, 2020 at 10:46 AM Philippe Mathieu-Daudé <f4bug@amsat.org> 
> wrote:
>>
>> On 8/20/20 6:24 PM, Havard Skinnemoen wrote:
>>> On Wed, Aug 19, 2020 at 10:29 PM Philippe Mathieu-Daudé <f4bug@amsat.org> 
>>> wrote:
>>>>
>>>> +Eric / Richard for compiler optimizations.
>>>>
>>>> On 8/20/20 3:53 AM, Havard Skinnemoen wrote:
>>>>> On Tue, Aug 11, 2020 at 8:26 PM Havard Skinnemoen
>>>>> <hskinnemoen@google.com> wrote:
>>>>>>
>>>>>> On Tue, Aug 11, 2020 at 1:48 AM Philippe Mathieu-Daudé <f4bug@amsat.org> 
>>>>>> wrote:
>>>>>>> INTERRUPTED: Test interrupted by SIGTERM
>>>>>>> Runner error occurred: Timeout reached
>>>>>>> (240.45 s)
>>>>>>>
>>>>>>> Is that expected?
>>>>>>
>>>>>> I'm not sure why it only happens when running direct kernel boot with
>>>>>> unoptimized qemu, but it seems a little happier if I enable a few more
>>>>>> peripherals that I have queued up (sd, ehci, ohci and rng), though not
>>>>>> enough.
>>>>>>
>>>>>> It still stalls for an awfully long time on "console: Run /init as
>>>>>> init process" though. I'm not sure what it's doing there. With -O2 it
>>>>>> only takes a couple of seconds to move on.
>>>>>
>>>>> So it turns out that the kernel gets _really_ sluggish when skipping
>>>>> the clock initialization normally done by the boot loader.
>>>>>
>>>>> I changed the reset value of CLKSEL like this:
>>>>>
>>>>> diff --git a/hw/misc/npcm7xx_clk.c b/hw/misc/npcm7xx_clk.c
>>>>> index 21ab4200d1..5e9849410f 100644
>>>>> --- a/hw/misc/npcm7xx_clk.c
>>>>> +++ b/hw/misc/npcm7xx_clk.c
>>>>> @@ -67,7 +67,7 @@ enum NPCM7xxCLKRegisters {
>>>>>   */
>>>>>  static const uint32_t cold_reset_values[NPCM7XX_CLK_NR_REGS] = {
>>>>>      [NPCM7XX_CLK_CLKEN1]        = 0xffffffff,
>>>>> -    [NPCM7XX_CLK_CLKSEL]        = 0x004aaaaa,
>>>>> +    [NPCM7XX_CLK_CLKSEL]        = 0x004aaba9,
>>>>>      [NPCM7XX_CLK_CLKDIV1]       = 0x5413f855,
>>>>>      [NPCM7XX_CLK_PLLCON0]       = 0x00222101 | PLLCON_LOKI,
>>>>>      [NPCM7XX_CLK_PLLCON1]       = 0x00202101 | PLLCON_LOKI,
>>>>>
>>>>> which switches the CPU core and UART to run from PLL2 instead of
>>>>> CLKREF (25 MHz).
>>>>>
>>>>> With this change, the test passes without optimization:
>>>>>
>>>>>  (02/19) 
>>>>> tests/acceptance/boot_linux_console.py:BootLinuxConsole.test_arm_quanta_gsj_initrd:
>>>>> PASS (39.62 s)
>>>>>
>>>>> It doesn't look like this change hurts booting from the bootrom (IIUC
>>>>> the nuvoton bootblock overwrites CLKSEL anyway), but it's not super
>>>>> clean.
>>>>>
>>>>> Perhaps I should make it conditional on kernel_filename being set? Or
>>>>> would it be better to provide a write_board_setup hook for this?
>>>>
>>>> QEMU prefers to avoid ifdef'ry at all cost. However I find this
>>>> approach acceptable (anyway up to the maintainer):
>>>>
>>>> +static void npcm7xx_clk_cold_reset_fixup(NPCM7xxCLKState *s)
>>>> +{
>>>> +#ifndef __OPTIMIZE__
>>>> +    /*
>>>> +     * When built without optimization, ...
>>>> +     * so run CPU core and UART from PLL2 instead of CLKREF.
>>>> +     */
>>>> +    s->regs[NPCM7XX_CLK_CLKSEL] |= 0x103,
>>>> +#endif
>>>> +}
>>>
>>> I think this is actually a problem regardless of optimization level.
>>> Turning optimization off amplifies the problem, but the problem is
>>> still there with optimization on.
>>
>> OK, this reminds me few more details about the problem I had with the
>> raspi3 when adding the ClockPowerResetManager block.
>> Found the branch. A bit bitter/sad it was more than 1 year ago.
>>
>> So if ARM_FEATURE_GENERIC_TIMER is available, Linux polls the CNTFRQ_EL0
>> register. At that time this register were using a fixed frequency:
>>
>> #define ARM_CPU_FREQ 1000000000 /* FIXME: 1 GHz, should be configurable */
>>
>> Xilinx' fork does it this way:
>> https://github.com/Xilinx/qemu/commit/9e939b54e2d
>>
>> Now I see Andrew Jeffery fixed that in 96eec6b2b38
>> ("target/arm: Prepare generic timer for per-platform CNTFRQ")
>> adding a 'cntfrq' property, which he then sets in the Aspeed
>> machine in commit 058d095532d ("ast2600: Configure CNTFRQ at 1125MHz").
>>
>> Maybe your SoC is simply missing this property?
> 
> Hmm, it doesn't look like Cortex-A9 has this property...
> 
> Unexpected error in object_property_find() at
> /usr/local/google/home/hskinnemoen/qemu/upstream/qom/object.c:1254:
> qemu-system-arm: Property '.cntfrq' not found
> 
> However, I did notice there are a lot of constraints on input
> frequencies to the CPU and various peripherals, and many of these are
> not met by the power-on reset values.

Oh OK.

> 
> For example, the UART needs a 24 MHz input clock, but there's no way
> to generate this from the default 25 MHz reference clock. However,
> PLL2 is set up to run at 960 MHz by default (as soon as it's locked),
> with a fixed /2 divider. The UART uses a /20 divider by default, so it
> gets a 25 MHz / 20 = 1.25 MHz clock with power-on defaults. However,
> switching the source to PLL2 results in 960 MHz / 2 / 20 = 24 MHz.

Yes, thanks to Damien's work we can now use the Clock API to model
that correctly. This API is very recent, and very few devices use
it. We need to adapt them one by one.

I started with the serial:
https://www.mail-archive.com/qemu-devel@nongnu.org/msg727975.html

> Similarly, the CPU is supposed to run at 800 MHz, but it runs at 25
> MHz with power-on defaults. PLL1 runs at 800 MHz by default, with a
> fixed /2 divider. The boot loader doubles the feedback divider so it
> turns into 1600 MHz / 2 = 800 MHz.

I noticed in my projects is that the Clock API lacks something
like a FixedDividerClock. qdev_init_clock_in() expects a callback,
but with fixed divisor we could have the divisor in a state and
use a generic callback, but the callback doesn't take pointer to
state.

Maybe that can be a new field in NamedClockList. I didn't had time
to write patches tests and ping Damien/Peter about it, but will do
during Sept.

> 
> Turns out I was wrong above, the patch snippet routes PLL1, not PLL2,
> to the CPU. But it will only result in half the recommended speed, so
> a patch to PLLCON1 is needed as well.
> 
> It seems like the cleanest solution would be to add a
> write_board_setup hook to fix up these registers so clocks are within
> normal ranges when bypassing the boot loader. In particular:
> 
>   - Switch UART to PLL2 for a 24 MHz input clock.
>   - Set up PLL1 to run at 1600 MHz.
>   - Switch the CPU core to PLL1 / 2 for a 800 MHz input clock.
> 
> Does that make sense? It should be just three simple register writes,
> which is doable with hand-edited machine code.

It makes sense but since (I guess) we could properly model that using
the Clock API, this hook is a temporary kludge...
I'm not against it however, since except this detail, this series is
almost perfect, and you already spent enough time.
IOW IMHO this kludge is acceptable for now, and modelling the missing
parts to use the Clock API can be done on top (or not...).

> 
> I'll add this as a separate patch (right before the acceptance test)
> so it's clear what's happening.
> 
> Unfortunately, I haven't been able to make the flash boot test pass
> without optimization. U-boot seems to have a tiny buffer for the
> command line, so I can only disable two or three systemd services
> before it gets truncated. I might try to create a reduced openbmc
> image instead.

I haven't think of that limitation. I rather prefer we test the
original OpenBMC image. Simply disable the 3 longest services,
and we'll adjust the test time.

Thanks for the detailed info you provided.

Regards,

Phil.

> 
> Havard
> 



reply via email to

[Prev in Thread] Current Thread [Next in Thread]