qemu-devel

Re: notdirty_write thrashing in simple for loop


From: BALATON Zoltan
Subject: Re: notdirty_write thrashing in simple for loop
Date: Sun, 23 May 2021 19:30:56 +0200 (CEST)

On Sun, 23 May 2021, Mark Watson wrote:
Hi

On Sun, 23 May 2021 at 15:41, BALATON Zoltan <balaton@eik.bme.hu> wrote:
I think you need to be more specific about the details or, even better,
provide a way to reproduce it without your hardware if possible; otherwise
people will not get what you're seeing. From the above it's not clear to me
whether you're emulating FPGA hardware in QEMU or actually running with the
FPGA (supposedly implementing the Amiga chipset) mapped into the virtual
machine's memory, so that accesses to some addresses do something in
hardware. In that case it may be difficult to reproduce without it, and the
hardware could also be the source of problems, so it's hard to tell what
might be causing your issue.

I managed to reproduce now locally without the fpga, on my x86 system.

The issue seems to be the layout of where the Amiga puts code and the
stack. It does not use virtual memory and each program seems to get the
stack just below the code. So whenever the code increments i, it writes to
the page and qemu does a lookup in a map to potentially invalidate the
code.
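To make the layout problem above concrete, here is a minimal sketch that checks whether a code address and a stack variable fall in the same page. The 4 KiB page size and both addresses are assumptions chosen for illustration; they are not taken from the actual Amiga program or from QEMU.

```python
# Assumed 4 KiB page; the real granularity depends on the emulated target.
PAGE_SIZE = 4096


def page_of(addr, page_size=PAGE_SIZE):
    """Return the index of the page that contains addr."""
    return addr // page_size


def shares_page(code_addr, data_addr, page_size=PAGE_SIZE):
    """True if a write to data_addr lands in the page that holds code_addr."""
    return page_of(code_addr, page_size) == page_of(data_addr, page_size)


# Hypothetical addresses: the loop variable sits on a stack just below the code.
loop_code = 0x42100
stack_var = 0x420F0

if shares_page(loop_code, stack_var):
    print("i and the loop code share a page: every i++ may take the slow path")
else:
    print("i and the loop code are in different pages")
```

With these hypothetical addresses both values land in page 0x42, which is the situation described above.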

That's probably enough for people who can give advice to understand the problem. I think I get it but can't help more as I don't know TCG or QEMU internals very well.

(Is this related to pistorm or something based on that for full Amiga
emulation without Amiga hardware? Just interested, unrelated to this
thread.)

The minimig is a recreation of the Amiga hardware in an FPGA. In addition
to its own dedicated board, it has been ported to many boards: Turbo
Chameleon, Mist, MiSTer (DE10 Nano with expansion). The MiSTer uses an SoC
FPGA chip, which has dual ARM cores and an FPGA on the same
silicon, with high-performance bridges between them.
Pistorm and Buffee are fairly similar, in that they replace the 68K
CPU with an emulated CPU, but with interfaces to real hardware. As I
understand it, the former uses Musashi, while the latter's developers are
writing their own JIT.

I see. Interesting idea if you already have such an fpga SoC, then you can make good use of the ARM cores that way.

So you do nothing in the loop just test for the loop variable and this
sometimes runs slow?

Yes, in fact even without the test in the loop. Just a loop incrementing i,
where i is on the stack. As I have now found out, it seems to be an issue
when the code and the variable i are in the same page.

Now I could try to modify the software on the amiga to split stack and
code. I do wonder if some kind of caching layer could be added to qemu so
that repeated invalidates do not take so much cpu time.
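As a rough illustration of why splitting stack and code should help, the toy model below counts how many writes would hit a page that also holds translated code. It is only a sketch under stated assumptions: the page size, the addresses, and the helper name count_slow_writes are hypothetical, and this is not how QEMU itself tracks pages.

```python
# Assumed 4 KiB page; the real granularity depends on the emulated target.
PAGE_SIZE = 4096


def count_slow_writes(code_pages, write_addrs, page_size=PAGE_SIZE):
    """Count the writes that land in a page containing translated code.

    In this toy model each such write stands for one slow-path check.
    """
    return sum(1 for addr in write_addrs if addr // page_size in code_pages)


# Hypothetical layout: the loop code lives in page 0x42.
code_pages = {0x42100 // PAGE_SIZE}

# 1000 increments of i with the stack just below the code (same page) ...
same_page_writes = [0x420F0] * 1000
# ... versus 1000 increments with the stack moved to the page before it.
split_page_writes = [0x41FF0] * 1000

print("same page :", count_slow_writes(code_pages, same_page_writes))
print("split page:", count_slow_writes(code_pages, split_page_writes))
```

In this model every increment is counted when i shares a page with the code and none is counted once the stack is moved, which matches the behaviour reported in this thread.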

I don't know, but I've added the maintainers of accel/tcg/cputlb.c to Cc to get their attention. You can get this info from the MAINTAINERS file, or more easily with:

scripts/get_maintainer.pl -f accel/tcg/cputlb.c

For reference and more background info, here's a link to Mark's original message:

https://lists.nongnu.org/archive/html/qemu-devel/2021-05/msg05581.html

Regards,
BALATON Zoltan

Also verify that these excessive calls to notdirty_write only
happen when it's running slow, so that it's really the source of the
problem and not something that happens normally anyway.

I have now confirmed this; I enabled the trace event for notdirty to
check.

Many thanks for the qemu and gdb debugging tips, much appreciated. I will
read them.

Mark Watson



