qemu-ppc
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: TCG change broke MorphOS boot on sam460ex


From: BALATON Zoltan
Subject: Re: TCG change broke MorphOS boot on sam460ex
Date: Thu, 4 Apr 2024 00:23:09 +0200 (CEST)

On Wed, 3 Apr 2024, Nicholas Piggin wrote:
On Tue Apr 2, 2024 at 9:32 PM AEST, BALATON Zoltan wrote:
On Thu, 21 Mar 2024, BALATON Zoltan wrote:
On 27/2/24 17:47, BALATON Zoltan wrote:
Hello,

Commit 18a536f1f8 (accel/tcg: Always require can_do_io) broke booting
MorphOS on sam460ex (this was before 8.2.0 and I thought I've verified it
before that release but apparently missed it back then). It can be
reproduced with https://www.morphos-team.net/morphos-3.18.iso and following
command:

qemu-system-ppc -M sam460ex -serial stdio -d unimp,guest_errors \
   -drive if=none,id=cd,format=raw,file=morphos-3.18.iso \
   -device ide-cd,drive=cd,bus=ide.1

Any idea on this one? While MorphOS boots on other machines and other OSes
seem to boot on this machine it may still suggest there's some problem
somewhere as this worked before. So it may worth investigating it to make
sure there's no bug that could affect other OSes too even if they boot. I
don't know how to debug this so some help would be needed.

In the bad case it crashes after running this TB:

----------------
IN:
0x00c01354:  38c00040  li       r6, 0x40
0x00c01358:  38e10204  addi     r7, r1, 0x204
0x00c0135c:  39010104  addi     r8, r1, 0x104
0x00c01360:  39410004  addi     r10, r1, 4
0x00c01364:  39200000  li       r9, 0
0x00c01368:  7cc903a6  mtctr    r6
0x00c0136c:  84c70004  lwzu     r6, 4(r7)
0x00c01370:  7cc907a4  tlbwehi  r6, r9
0x00c01374:  84c80004  lwzu     r6, 4(r8)
0x00c01378:  7cc90fa4  tlbwelo  r6, r9
0x00c0137c:  84ca0004  lwzu     r6, 4(r10)
0x00c01380:  7cc917a4  tlbwehi  r6, r9
0x00c01384:  39290001  addi     r9, r9, 1
0x00c01388:  4200ffe4  bdnz     0xc0136c
----------------
IN:
0x00c01374: unable to read memory
----------------

"unable to read memory" is the tracer, it does actually translate
the address, but it points to a wayward real address which returns
0 to TCG, which is an invalid instruction.

The good case instead doesn't exit the TB after 0x00c01370 but after
the complete loop at the bdnz. That look like this after the same
first TB:

----------------
IN:
0x00c0136c:  84c70004  lwzu     r6, 4(r7)
0x00c01370:  7cc907a4  tlbwehi  r6, r9
0x00c01374:  84c80004  lwzu     r6, 4(r8)
0x00c01378:  7cc90fa4  tlbwelo  r6, r9
0x00c0137c:  84ca0004  lwzu     r6, 4(r10)
0x00c01380:  7cc917a4  tlbwehi  r6, r9
0x00c01384:  39290001  addi     r9, r9, 1
0x00c01388:  4200ffe4  bdnz     0xc0136c
----------------
IN:
0x00c0138c:  4c00012c  isync

All the tlbwe are executed in the same TB. MMU tracing shows the
first tlbwehi creates a new valid(!) TLB for 0x00000000-0x100000000
that has a garbage RPN because the tlbwelo did not run yet.

What's happening in the bad case is that the translator breaks
and "re-fetches" instructions in the middle of that sequence, and
that's where the bogus translation causes 0 to be returned. The
good case the whole block is executed in the same fetch which
creates correct translations.

So it looks like a morphos bug, the can-do-io change just happens
to cause it to re-fetch in that place, but that could happen for
a number of reasons, so you can't rely on TLB *only* changing or
ifetch *only* re-fetching at a sync point like isync.

Thanks a lot for the analysis. Probably ir works on real machine due to cache effects so maybe it was just luck this did not break.

I would expect code like this to write an invalid entry with tlbwehi,
then tlbwelo to set the correct RPN, then make the entry valid with
the second tlbwehi. It would probably fix the bug if you just did the
first tlbwehi with r6=0 (or at least without the 0x200 bit set).

I think I had to fix a similar issue in AROS years ago when I've first tried to make sam460ex emulation work and used AROS for testing:
https://github.com/aros-development-team/AROS/commit/586a8ada8a5b861a77cab177d39e01de8c3f4cf5

I can't fix MorphOS as it's not open source but hope MorphOS people will get to know about this and do something with it. It still works better on other emulated machines such as pegasos2 and mac99 so it's not a big deal, just wanted to make sure it would not be a bug that could affect other OSes on sam460ex.

Thank you,
BALATON Zoltan

reply via email to

[Prev in Thread] Current Thread [Next in Thread]