Re: [PATCH 0/3] Reorg ppc64 pmu insn counting


From: Daniel Henrique Barboza
Subject: Re: [PATCH 0/3] Reorg ppc64 pmu insn counting
Date: Mon, 3 Jan 2022 15:06:18 -0300
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:91.0) Gecko/20100101 Thunderbird/91.4.0



On 1/3/22 12:07, Alex Bennée wrote:

Daniel Henrique Barboza <danielhb413@gmail.com> writes:

On 12/23/21 00:01, Richard Henderson wrote:
In contrast to Daniel's version, the code stays in power8-pmu.c,
but is better organized to not take so much overhead.
Before:
      32.97%  qemu-system-ppc  qemu-system-ppc64   [.] pmc_get_event
      20.22%  qemu-system-ppc  qemu-system-ppc64   [.] helper_insns_inc
       4.52%  qemu-system-ppc  qemu-system-ppc64   [.] hreg_compute_hflags_value
       3.30%  qemu-system-ppc  qemu-system-ppc64   [.] helper_lookup_tb_ptr
       2.68%  qemu-system-ppc  qemu-system-ppc64   [.] tcg_gen_code
       2.28%  qemu-system-ppc  qemu-system-ppc64   [.] cpu_exec
       1.84%  qemu-system-ppc  qemu-system-ppc64   [.] pmu_insn_cnt_enabled
After:
       8.42%  qemu-system-ppc  qemu-system-ppc64   [.] hreg_compute_hflags_value
       6.65%  qemu-system-ppc  qemu-system-ppc64   [.] cpu_exec
       6.63%  qemu-system-ppc  qemu-system-ppc64   [.] helper_insns_inc


Thanks for looking this up. I had no idea the original C code was that slow.

<snip>

With that in mind I decided to post a new version of my TCG rework, with
less repetition and a bit more concise, to have an alternative that can be
used upstream to fix the Avocado tests. Meanwhile I'll see if I can get your
reorg working with all EBB tests we need. All things equal - similar
performance, all EBB tests passing - I'd rather stay with your C code than
my TCG rework since yours doesn't rely on TCG Ops knowledge to maintain it.

Reading this series did make me wonder if we need a more generic service
from the TCG for helping with "internal" instrumentation needed for
things like decent PMU emulation. We haven't gone as much for it in ARM
yet but it would be nice to. It would be even nicer if such a facility
could be used by stuff like icount as well so we don't end up doing the
same thing twice.

Back in May 2021, when I first started working on this code, I tried to base
it on the ARM PMU code. In fact, the cycle and insn calculation done in the
very first version of this work was based on what ARM does in
target/arm/helper.c, cycles_get_count() and instructions_get_count(). The
cycle calculation got simplified because our PPC64 CPU has a 1 GHz clock, so
it's easier to just consider 1 ns = 1 cycle.
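
To illustrate the 1 ns = 1 cycle shortcut, here is a minimal standalone
sketch. All names in it (sketch_ns_to_cycles, SKETCH_NS_PER_SEC and so on)
are made up for illustration; this is not the code in power8-pmu.c or in
target/arm/helper.c.

```c
#include <stdint.h>

/* Hypothetical constants, for illustration only. */
#define SKETCH_NS_PER_SEC 1000000000ULL
#define SKETCH_1GHZ       1000000000ULL

/*
 * General case: cycles = ns * hz / 1e9.
 * Whole seconds and the sub-second remainder are converted separately
 * so the intermediate product stays small.
 */
static uint64_t sketch_ns_to_cycles(uint64_t ns, uint64_t hz)
{
    uint64_t secs = ns / SKETCH_NS_PER_SEC;
    uint64_t rem  = ns % SKETCH_NS_PER_SEC;

    return secs * hz + rem * hz / SKETCH_NS_PER_SEC;
}

/*
 * With a 1 GHz clock the conversion above is the identity,
 * so elapsed nanoseconds can be used directly as elapsed cycles.
 */
static uint64_t sketch_ns_to_cycles_1ghz(uint64_t ns)
{
    return ns;
}
```

For any other clock frequency the general conversion would be needed
again, which is the part the 1 GHz assumption lets us skip.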

For instruction count, aside from my 2-3 weeks of spectacular failures
trying to count instructions inside translate.c, I also looked into how TCG
plugins work and tried to do something similar to what plugin_gen_tb_end()
does at the end of the translator_loop() in accel/tcg/translator.c. For some
reason I wasn't able to replicate the same behavior that I would have if I
used the TCG plugin framework in the 'canonical' way.

I ended up doing something similar to what instructions_get_count() from ARM
does, which relies on icount. Richard then aided me in figuring out that I
could count instructions directly by tapping into the end of each TB.
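
The idea of counting at the end of each TB can be sketched outside of QEMU
roughly as below. This is a toy model under my own assumptions: SketchTB,
SketchPMU and the sketch_* functions are hypothetical and do not correspond
to real QEMU structures or helpers. The point is only that the counter is
updated once per TB with the number of instructions in it, instead of once
per instruction.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a translation block: only its insn count. */
typedef struct SketchTB {
    uint32_t num_insns;
} SketchTB;

/* Hypothetical PMU state: an enable flag and a running insn counter. */
typedef struct SketchPMU {
    bool insn_cnt_enabled;
    uint64_t insn_count;
} SketchPMU;

/* Called once when a TB finishes: add its whole insn count at once. */
static void sketch_tb_end(SketchPMU *pmu, const SketchTB *tb)
{
    if (pmu->insn_cnt_enabled) {
        pmu->insn_count += tb->num_insns;
    }
}

/* "Execute" a sequence of TBs and return the resulting counter value. */
static uint64_t sketch_run(SketchPMU *pmu, const SketchTB *tbs, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        sketch_tb_end(pmu, &tbs[i]);
    }
    return pmu->insn_count;
}
```

The enable check mirrors the fact that the counter should only move while
instruction counting is active; how that state is derived in the real code
is outside the scope of this sketch.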

So, for a generic service of sorts, I believe it would be nice to re-use the
TCG plugins API for the internal instrumentation (I tried it once and
failed; I'm not sure if I messed up or if it's not possible ATM). That would
be a good start toward getting all this logic into generic code that the
internal translate code can use.



Thanks,


Daniel

[1] https://github.com/torvalds/linux/tree/master/tools/testing/selftests/powerpc/pmu/ebb
[2] https://lists.gnu.org/archive/html/qemu-devel/2021-12/msg00073.html

r~
Richard Henderson (3):
    target/ppc: Cache per-pmc insn and cycle count settings
    target/ppc: Rewrite pmu_increment_insns
    target/ppc: Use env->pnc_cyc_cnt
   target/ppc/cpu.h         |   3 +
   target/ppc/power8-pmu.h  |  14 +--
   target/ppc/cpu_init.c    |   1 +
   target/ppc/helper_regs.c |   2 +-
   target/ppc/machine.c     |   2 +
   target/ppc/power8-pmu.c  | 230 ++++++++++++++++-----------------------
   6 files changed, 108 insertions(+), 144 deletions(-)