[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Re: [Qemu-discuss] Question regarding tracing guest execution using qemu
From: |
Peter Maydell |
Subject: |
Re: [Qemu-discuss] Question regarding tracing guest execution using qemu-system-sparc |
Date: |
Fri, 2 Jun 2017 15:55:07 +0100 |
On 2 June 2017 at 12:55, Ronald De Keulenaer
<address@hidden> wrote:
> I was wondering if any of you have any experience with using Qemu for
> tracing execution of guest instructions under full system emulation.
> Specifically, I am trying to obtain an execution trace using
> qemu-system-sparc.
This is moderately difficult because QEMU is optimised for
going fast, not for providing clear visibility into what
the guest code is doing. Tracing and debug logging tends to
be "log the things it's easy for QEMU to log", and is much easier
to interpret if you have a basic idea of what is going on under the hood.
> diff -urN a/target/sparc/translate.c b/target/sparc/translate.c
> --- a/target/sparc/translate.c 2017-04-20 16:57:01.000000000 +0200
> +++ b/target/sparc/translate.c 2017-05-31 16:49:09.621150561 +0200
> @@ -3175,6 +3175,9 @@
> TCGv_i64 cpu_src1_64, cpu_src2_64, cpu_dst_64;
> target_long simm;
>
> + printf("{__trace.ins @ (0x)%08x}\n", dc->pc);
> + fflush(stdout);
> +
> opc = GET_FIELD(insn, 0, 1);
> rd = GET_FIELD(insn, 2, 6);
>
>
>
> This initially seemed to output correct control flow for very simple
> programs, but I soon noticed that traces were incomplete (e.g. an
> instruction at address X seems to be executed far less than normal) and
> incorrect (e.g. control flow seems to skip over some function calls
> completely)
You're printing the PC in a function which is called at translate
time, which means that it is called the first time that code is
executed, but it won't be called on subsequent times that guest
code is executed (because we don't retranslate, we just get the
previous translation from the cache).
If you want to print information when code is executed it needs
to be done elsewhere, but there are some obstacles:
* by default we will link translated code blocks directly together,
which means that there is no C code in QEMU which gets executed
before the 2nd TB in a chain runs and nowhere to put printing;
you need to disable this optimization if you care about that
* execution of the start of a TB doesn't necessarily mean we
actually executed the whole of it -- we might take an exception
halfway through (or abandon the idea of executing it before we
start, in order to take an interrupt)
* any tracing you do at runtime will make QEMU go pretty slowly
> Note that I am aware of existing tracing functionality within Qemu! I have
> tried enabling the simple tracing backend by running ./configure with
> --trace-backend=simple, as per the documentation in /docs/tracing.txt, but I
> can't seem to find any events that report the PC, or anything I can use, in
> resulting trace files.
> For sure, if I can use the simple tracing backend to achieve the desired
> result, that's an equally good solution. I just want a trace that's correct.
I'd suggest playing with the -d option to enable some simple forms
of execution tracing. (See also -D to specify a file to write to).
For instance, -d exec,cpu,nochain will:
* print info about execution
* print a CPU register state dump for every TB
* disable the TB-chaining optimization that otherwise confuses things
and you get logs like
Trace 0x7fb084fcf028 [0: 00000000]
pc: 00000000 npc: 00000004
%g0-7: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
%o0-7: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
%l0-7: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
%i0-7: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
%f00: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
%f08: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
%f16: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
%f24: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
psr: 040000c0 (icc: ---- SPE: SP-) wim: 00000001
fsr: 00000000 y: 00000000
Stopped execution of TB chain before 0x7fb084fcf028 [00000000]
Trace 0x7fb084fcf028 [0: 00000000]
pc: 00000000 npc: 00000004
%g0-7: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
%o0-7: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
%l0-7: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
%i0-7: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
%f00: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
%f08: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
%f16: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
%f24: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
psr: 040000c0 (icc: ---- SPE: SP-) wim: 00000001
fsr: 00000000 y: 00000000
Trace 0x7fb084fcf060 [0: 00005dec]
pc: 00005dec npc: 00005df0
%g0-7: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
%o0-7: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
%l0-7: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
%i0-7: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
%f00: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
%f08: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
%f16: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
%f24: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
psr: 040000c0 (icc: ---- SPE: SP-) wim: 00000001
fsr: 00000000 y: 00000000
where the "Trace" lines indicate a TB about to execute, with the guest
PC in the [] part, and the "Stopped execution..." line for instance
shows where we decided not to actually run the TB (which is why the
next Trace line is the same pc==0 again).
If you don't specify 'nochain' then the exec tracing will also tell you
when each TB is linked together, which you can sort of use to after-the-fact
figure out what goes on when a chain of TBs is executed. The 'nochain'
trace is simpler to understand, though, so I recommend that.
'cpu' slows things down a lot because of the volume of output compared
to plain 'exec'.
You can sidestep the "what about exceptions in the middle of a TB"
problem a bit with the -singlestep option which puts 1 insn only in
each TB, but this also slows things down massively.
https://translatedcode.wordpress.com/2015/07/06/tricks-for-debugging-qemu-savevm-snapshots/
describes a trick you can do with savevm snapshots to allow you to
run fast without tracing enabled up to a point of interest, then take
a snapshot, and restart QEMU with tracing enabled from that point.
thanks
-- PMM