[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Re: [PATCH 12/14] aspeed: Make aspeed_board_init_flashes public
From: |
Alex Bennée |
Subject: |
Re: [PATCH 12/14] aspeed: Make aspeed_board_init_flashes public |
Date: |
Wed, 29 Jun 2022 19:24:02 +0100 |
User-agent: |
mu4e 1.7.27; emacs 28.1.50 |
Cédric Le Goater <clg@kaod.org> writes:
> On 6/29/22 16:14, Alex Bennée wrote:
>> Cédric Le Goater <clg@kaod.org> writes:
>>
>>> On 6/24/22 18:50, Cédric Le Goater wrote:
>>>> On 6/23/22 20:43, Peter Delevoryas wrote:
>>>>>
>>>>>
>>>>>> On Jun 23, 2022, at 8:09 AM, Cédric Le Goater <clg@kaod.org> wrote:
>>>>>>
>>>>>> On 6/23/22 12:26, Peter Delevoryas wrote:
>>>>>>> Signed-off-by: Peter Delevoryas <pdel@fb.com>
>>>>>>
>>>>>> Let's start simple without flash support. We should be able to
>>>>>> load FW blobs in each CPU address space using loader devices.
>>>>>
>>>>> Actually, I was unable to do this, perhaps because the fb OpenBMC
>>>>> boot sequence is a little weird. I specifically _needed_ to have
>>>>> a flash device which maps the firmware in at 0x2000_0000, because
>>>>> the fb OpenBMC U-Boot SPL jumps to that address to start executing
>>>>> from flash? I think this is also why fb OpenBMC machines can be so slow.
>>>>>
>>>>> $ ./build/qemu-system-arm -machine fby35 \
>>>>> -device loader,file=fby35.mtd,addr=0,cpu-num=0 -nographic \
>>>>> -d int -drive file=fby35.mtd,format=raw,if=mtd
>>>> Ideally we should be booting from the flash device directly using
>>>> the machine option '-M ast2600-evb,execute-in-place=true' like HW
>>>> does. Instructions are fetched using SPI transfers. But the amount
>>>> of code generated is tremendous.
>> Yeah because there is a potential race when reading from HW so we
>> throw
>> away TB's after executing them because we have no way of knowing if it
>> has changed under our feet. See 873d64ac30 (accel/tcg: re-factor non-RAM
>> execution code) which cleaned up this handling.
>>
>>>> See some profiling below for a
>>>> run which barely reaches DRAM training in U-Boot.
>>>
>>> Some more profiling on both ast2500 and ast2600 machines shows :
>>>
>>>
>>> * ast2600-evb,execute-in-place=true :
>>>
>>> Type Object Call site Wait Time (s)
>>> Count Average (us)
>>> ---------------------------------------------------------------------------------------------
>>> BQL mutex 0x564dc03922e0 accel/tcg/cputlb.c:1365 14.21443
>>> 32909927 0.43
>> This is unavoidable as a HW access needs the BQL held so we will go
>> through this cycle every executed instruction.
>> Did I miss why the flash contents are not mapped into the physical
>> address space? Isn't that how it appear to the processor?
>
>
> There are two modes :
> if (ASPEED_MACHINE(machine)->mmio_exec) {
> memory_region_init_alias(boot_rom, NULL, "aspeed.boot_rom",
> &fl->mmio, 0, size);
> memory_region_add_subregion(get_system_memory(), FIRMWARE_ADDR,
> boot_rom);
> } else {
> memory_region_init_rom(boot_rom, NULL, "aspeed.boot_rom",
> size, &error_abort);
> memory_region_add_subregion(get_system_memory(), FIRMWARE_ADDR,
> boot_rom);
> write_boot_rom(drive0, FIRMWARE_ADDR, size, &error_abort);
> }
>
> The default boot mode uses the ROM. No issue.
>
> The "execute-in-place=true" option creates an alias on the region of
> the flash contents and each instruction is then fetched from the flash
> drive with SPI transactions.
>
> With old FW images, using an older U-boot, the machine boots in a couple
> of seconds. See the profiling below for a witherspoon-bmc machine using
> U-Boot 2016.07.
>
> qemu-system-arm -M witherspoon-bmc,execute-in-place=true -drive
> file=./flash-witherspoon-bmc,format=raw,if=mtd -drive
> file=./flash-witherspoon-bmc2,format=raw,if=mtd -nographic -nodefaults
> -snapshot -serial mon:stdio -enable-sync-profile
> ...
> U-Boot 2016.07-00040-g8425e96e2e27-dirty (Jun 24 2022 - 23:21:57 +0200)
> Watchdog enabled
> DRAM: 496 MiB
> Flash: 32 MiB
> In: serial
> Out: serial
> Err: serial
> Net:
> (qemu) info sync-profile
> Type Object Call site Wait Time (s)
> Count Average (us)
>
> ---------------------------------------------------------------------------------------------
> BQL mutex 0x56189610b2e0 accel/tcg/cputlb.c:1365 0.25311
> 12346237 0.02
> condvar 0x5618970cf220 softmmu/cpus.c:423 0.05506
> 2 27530.78
> BQL mutex 0x56189610b2e0 util/rcu.c:269 0.04709
> 2 23544.26
> condvar 0x561896d0fc78 util/thread-pool.c:90 0.01340
> 83 161.47
> condvar 0x56189610b240 softmmu/cpus.c:571 0.00005
> 1 54.93
> condvar 0x56189610b280 softmmu/cpus.c:642 0.00003
> 1 32.88
> BQL mutex 0x56189610b2e0 util/main-loop.c:318 0.00003
> 34 0.76
> mutex 0x561896eade00 tcg/region.c:204 0.00002
> 995 0.02
> rec_mutex [ 2] util/async.c:682 0.00002
> 493 0.03
> mutex [ 2] chardev/char.c:118 0.00001
> 404 0.03
>
> ---------------------------------------------------------------------------------------------
>
>
> However, with recent U-boots, it takes quite a while to reach DRAM training.
> Close to a minute. See the profiling below for an ast2500-evb machine using
> U-Boot 2019.04.
>
> qemu-system-arm -M ast2500-evb,execute-in-place=true -net
> nic,macaddr=C0:FF:EE:00:00:03,netdev=net0 -drive
> file=./flash-ast2500-evb,format=raw,if=mtd -nographic -nodefaults -snapshot
> -serial mon:stdio -enable-sync-profile
> qemu-system-arm: warning: Aspeed iBT has no chardev backend
> qemu-system-arm: warning: nic ftgmac100.1 has no peer
> QEMU 7.0.50 monitor - type 'help' for more information
> U-Boot 2019.04-00080-g6ca27db3f97b-dirty (Jun 24 2022 - 23:22:03
> +0200)
> SOC : AST2500-A1
> RST : Power On
> LPC Mode : SIO:Enable : SuperIO-2e
> Eth : MAC0: RGMII, , MAC1: RGMII,
> Model: AST2500 EVB
> DRAM: 448 MiB (capacity:512 MiB, VGA:64 MiB, ECC:off)
> MMC: sdhci_slot0@100: 0, sdhci_slot1@200: 1
> Loading Environment from SPI Flash... SF: Detected mx25l25635e with page
> size 256 Bytes, erase size 64 KiB, total 32 MiB
> *** Warning - bad CRC, using default environment
> In: serial@1e784000
> Out: serial@1e784000
> Err: serial@1e784000
> Net: eth0: ethernet@1e660000
> Warning: ethernet@1e680000 (eth1) using random MAC address -
> 4a:e5:9a:4a:c7:c5
> , eth1: ethernet@1e680000
> Hit any key to stop autoboot: 2
> (qemu) info sync-profile
> Type Object Call site Wait Time (s)
> Count Average (us)
>
> ---------------------------------------------------------------------------------------------
> condvar 0x561f10c9ef88 util/thread-pool.c:90 10.01196
> 28 357570.00
> BQL mutex 0x561f102362e0 accel/tcg/cputlb.c:1365 0.29496
> 14248621 0.02
> condvar 0x561f110325a0 softmmu/cpus.c:423 0.02231
> 2 11152.57
> BQL mutex 0x561f102362e0 util/rcu.c:269 0.01447
> 4 3618.60
> condvar 0x561f10236240 softmmu/cpus.c:571 0.00010
> 1 102.19
> mutex 0x561f10e9f1c0 tcg/region.c:204 0.00007
> 3052 0.02
> mutex [ 2] chardev/char.c:118 0.00003
> 1486 0.02
> condvar 0x561f10236280 softmmu/cpus.c:642 0.00003
> 1 29.38
> BQL mutex 0x561f102362e0 accel/tcg/cputlb.c:1426 0.00002
> 973 0.02
> BQL mutex 0x561f102362e0 util/main-loop.c:318 0.00001
> 34 0.41
>
> ---------------------------------------------------------------------------------------------
> Something in the layout of the FW is making a big difference. One
> that could be relevant is that the recent versions are using a device
> tree.
>
> There might be no good solution to this issue but I fail to analyze
> it correctly. Is there a way to collect information on the usage of
> Translation Blocks ?
You could expand the data we collect in tb_tree_stats and expose it via
info jit.
>
> Thanks,
>
> C.
--
Alex Bennée
- Re: [PATCH 12/14] aspeed: Make aspeed_board_init_flashes public, Cédric Le Goater, 2022/06/23
- Re: [PATCH 12/14] aspeed: Make aspeed_board_init_flashes public, Peter Delevoryas, 2022/06/23
- Re: [PATCH 12/14] aspeed: Make aspeed_board_init_flashes public, Cédric Le Goater, 2022/06/24
- Re: [PATCH 12/14] aspeed: Make aspeed_board_init_flashes public, Cédric Le Goater, 2022/06/29
- Re: [PATCH 12/14] aspeed: Make aspeed_board_init_flashes public, Alex Bennée, 2022/06/29
- Re: [PATCH 12/14] aspeed: Make aspeed_board_init_flashes public, Cédric Le Goater, 2022/06/29
- Re: [PATCH 12/14] aspeed: Make aspeed_board_init_flashes public,
Alex Bennée <=
- Re: [PATCH 12/14] aspeed: Make aspeed_board_init_flashes public, Cédric Le Goater, 2022/06/30
- Re: [PATCH 12/14] aspeed: Make aspeed_board_init_flashes public, Alex Bennée, 2022/06/30