Re: [Qemu-discuss] Debugging qemu inside chroot
From: Peter Maydell
Subject: Re: [Qemu-discuss] Debugging qemu inside chroot
Date: Thu, 28 Mar 2019 10:20:42 +0000
On Thu, 28 Mar 2019 at 10:10, Paulo Matos <address@hidden> wrote:
>
>
>
> On 21/03/2019 09:15, Peter Maydell wrote:
> >> Then I attempted to run gdb outside of the chroot like:
> >> gdb --args qemu-3.1.0-install/bin/qemu-aarch64
> >> aarch64-chroot/racket/racket/bin/racket3m ...
> >
> > If you want to run x86 gdb on QEMU (to debug/get backtraces
> > for QEMU itself) you can do something like:
> >
> > gdb --args chroot aarch64-chroot /usr/bin/qemu-aarch64-static
> > path/to/racket3m ...
> >
> > gdb will of course start out by debugging the host chroot binary,
> > but you can use 'break main' which should cause gdb to ask
> > "Make breakpoint pending on future shared library load?" -- if you
> > say yes, then continue, it will break when chroot execs QEMU
> > and we enter QEMU's main program. (Won't work if you have debug
> > symbols for chroot for some reason -- if so then use a bp on
> > some function that's only in QEMU and not in chroot.)
> >
>
> Thanks for your help.
> At this point I get this from debugging qemu-aarch64-static:
> process 4399 is executing new program:
> /root/aarch64-chroot/usr/bin/qemu-aarch64-static
> [Thread debugging using libthread_db enabled]
> Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
> [New Thread 0x7ffff7ff9700 (LWP 4403)]
>
> Thread 1 "mksystem.rkt" received signal SIGSEGV, Segmentation fault.
> 0x0000000060b6f9a9 in static_code_gen_buffer ()
> (gdb) bt
> #0 0x0000000060b6f9a9 in static_code_gen_buffer ()
> #1 0x000000006004abc4 in cpu_tb_exec (cpu=0x6296ee30,
> itb=0x60b6f8c0 <static_code_gen_buffer+2357056>)
> at /root/qemu-3.1.0/accel/tcg/cpu-exec.c:171
> #2 0x000000006004b807 in cpu_loop_exec_tb (cpu=0x6296ee30,
> tb=0x60b6f8c0 <static_code_gen_buffer+2357056>, last_tb=0x7fffffffdbd8,
> tb_exit=0x7fffffffdbd0) at /root/qemu-3.1.0/accel/tcg/cpu-exec.c:615
> #3 0x000000006004baa0 in cpu_exec (cpu=0x6296ee30)
> at /root/qemu-3.1.0/accel/tcg/cpu-exec.c:725
> #4 0x0000000060084eef in cpu_loop (env=0x629770e0)
> at /root/qemu-3.1.0/linux-user/aarch64/cpu_loop.c:82
> #5 0x000000006005a042 in main (argc=10, argv=0x7fffffffe508,
> envp=0x7fffffffe560)
> at /root/qemu-3.1.0/linux-user/main.c:819
This backtrace says "the segfault happened in code that
QEMU generated" (because the faulting PC is in 'static_code_gen_buffer',
which is where we put the x86 code we generate for the guest
binary). This usually means "it's a segfault in the guest".
I think at this point I would take the "use QEMU's gdbstub
and an aarch64-aware debugger to try to debug the guest"
approach and see if the backtrace and information you get there
suggest what's happening.
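As a quick sanity check on the addresses in that backtrace, here is a
small Python sketch. The helper name buffer_offsets is made up for
illustration; the three numbers are copied from frames #0 and #1 above.
It only does the arithmetic implied by gdb's
"<static_code_gen_buffer+2357056>" annotation and says nothing about
how QEMU lays out the buffer internally.

```python
def buffer_offsets(fault_pc, itb, itb_symbol_offset):
    """Return (buffer_base, fault offset in buffer, fault offset from itb).

    gdb printed itb as <static_code_gen_buffer+itb_symbol_offset>, so the
    start of the buffer is itb minus that offset.
    """
    base = itb - itb_symbol_offset
    return base, fault_pc - base, fault_pc - itb


# Values copied from the backtrace quoted above.
base, off_in_buffer, off_from_itb = buffer_offsets(
    fault_pc=0x60B6F9A9,        # frame #0
    itb=0x60B6F8C0,             # frame #1, itb argument
    itb_symbol_offset=2357056,  # from <static_code_gen_buffer+2357056>
)
print(hex(base), off_in_buffer, off_from_itb)
```

The faulting PC comes out a couple of hundred bytes past the itb
pointer, which fits with the crash being inside the generated code
for the block that cpu_tb_exec was asked to run.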
> so I looked for logging and tried:
> chroot aarch64-chroot/ qemu-aarch64-static -d
> out_asm,in_asm,cpu,guest_errors /racket/racket/src/build/racket/racket3m
> ...
>
> which dumps a massive log ending with:
>
> IN:
> 0x40011309e0: 2a0003e4 mov w4, w0
> 0x40011309e4: d2800040 movz x0, #0x2
> 0x40011309e8: aa0503e1 mov x1, x5
> 0x40011309ec: d2800002 movz x2, #0
> 0x40011309f0: d2800103 movz x3, #0x8
> 0x40011309f4: d28010e8 movz x8, #0x87
> 0x40011309f8: d4000001 svc #0
>
> OUT: [size=115]
> 0x61560900: 41 8b 6e e4 movl -0x1c(%r14), %ebp
> 0x61560904: 85 ed testl %ebp, %ebp
> 0x61560906: 0f 8c 5d 00 00 00 jl 0x61560969
> 0x6156090c: 49 8b 6e 40 movq 0x40(%r14), %rbp
> 0x61560910: 8b ed movl %ebp, %ebp
> 0x61560912: 49 89 6e 60 movq %rbp, 0x60(%r14)
> 0x61560916: 49 c7 46 40 02 00 00 00 movq $2, 0x40(%r14)
> 0x6156091e: 49 8b 6e 68 movq 0x68(%r14), %rbp
> 0x61560922: 49 89 6e 48 movq %rbp, 0x48(%r14)
> 0x61560926: 49 c7 46 50 00 00 00 00 movq $0, 0x50(%r14)
> 0x6156092e: 49 c7 46 58 08 00 00 00 movq $8, 0x58(%r14)
> 0x61560936: 49 c7 86 80 00 00 00 87 movq $0x87, 0x80(%r14)
> 0x6156093e: 00 00 00
> 0x61560941: 48 bd fc 09 13 01 40 00 movabsq $0x40011309fc, %rbp
> 0x61560949: 00 00
> 0x6156094b: 49 89 ae 40 01 00 00 movq %rbp, 0x140(%r14)
> 0x61560952: 49 8b fe movq %r14, %rdi
> 0x61560955: be 02 00 00 00 movl $2, %esi
> 0x6156095a: ba 00 00 00 56 movl $0x56000000, %edx
> 0x6156095f: b9 01 00 00 00 movl $1, %ecx
> 0x61560964: e8 bb 48 b6 fe callq 0x600c5224
> 0x61560969: b8 43 08 56 61 movl $0x61560843, %eax
> 0x6156096e: e9 a5 06 3d ff jmp 0x60931018
>
> PC=00000040011309e0 X00=0000000000000000 X01=000000000000113a
> X02=0000000000000006 X03=000000000000113a X04=0000004001210bd8
> X05=00000040012f92a0 X06=ffffffffffffffff X07=ffffffffffffffff
> X08=0000000000000083 X09=ffffffffffffffff X10=ffffffffffffffff
> X11=ffffffffffffffff X12=ffffffffffffffff X13=ffffffffffffffff
> X14=0000000000000000 X15=00000000000001fc X16=0000004001131cd0
> X17=0000004000781298 X18=0000000000000000 X19=0000000000000006
> X20=000000400124a000 X21=00000040012459d0 X22=0000000000000000
> X23=0000000000000000 X24=0000000000000000 X25=0000000000000000
> X26=0000000000000000 X27=0000000000000000 X28=0000000000000000
> X29=00000040012f9280 X30=0000004001130988 SP=00000040012f9280
> PSTATE=00000000 ---- S EL0t
> qemu: uncaught target signal 6 (Aborted) - core dumped
> Aborted (core dumped)
>
> So, am I correct in thinking that qemu converts guest assembly (IN -
> aarch64) into host assembly (OUT - x86_64) and then executes somehow on
> the host this OUT assembly?
Yes. (In other words, we do just-in-time translation -- JITting --
of the guest binary.)
> So above, qemu crashes at PC 0x40011309e0 which corresponds to guest:
> mov w4, w0
No, this isn't correct. You can see from the gdb backtrace that
QEMU crashed at host PC 0x0000000060b6f9a9, which is not in
the translation block that you quote above. The TB we were
in when we crashed will have been translated a bit further up
in the logs, I expect.
QEMU caches translation blocks, so the last block translated
is not necessarily the last block executed. You can get QEMU
to tell you what blocks it is executing by adding 'cpu,exec'
to the set of -d flags, but this will make execution rather
slower. I think this is likely to be a more longwinded way of
debugging further than using the QEMU gdbstub.
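If you do want to hunt through the existing log for the block that
contains the faulting host PC, something like the following Python
sketch might help. It assumes the log looks like the excerpt quoted in
this thread: an "IN:" header followed by guest address lines, then
"OUT: [size=N]" followed by host address lines. The names parse_blocks
and find_block are hypothetical, and the format may differ between
QEMU versions.

```python
import re

ADDR = re.compile(r"^\s*(0x[0-9a-fA-F]+):")
OUT = re.compile(r"^\s*OUT: \[size=(\d+)\]")


def parse_blocks(lines):
    """Yield (guest_pc, host_start, size) for each IN:/OUT: pair."""
    guest_pc = None
    state = None  # which header was seen most recently: "in" or "out"
    size = None
    for line in lines:
        if line.lstrip().startswith("IN:"):
            state, guest_pc = "in", None
            continue
        m = OUT.match(line)
        if m:
            state, size = "out", int(m.group(1))
            continue
        m = ADDR.match(line)
        if not m:
            continue  # register dumps and other lines are ignored
        addr = int(m.group(1), 16)
        if state == "in" and guest_pc is None:
            guest_pc = addr  # first guest instruction of the block
        elif state == "out":
            yield guest_pc, addr, size  # first host instruction
            state = None


def find_block(lines, host_pc):
    """Return the last logged block whose host code covers host_pc."""
    hit = None
    for guest_pc, start, size in parse_blocks(lines):
        if start <= host_pc < start + size:
            hit = (guest_pc, start, size)  # later translations win
    return hit


# A shortened copy of the block quoted earlier in this thread.
SAMPLE = [
    "IN:",
    "0x40011309e0: 2a0003e4 mov w4, w0",
    "0x40011309e4: d2800040 movz x0, #0x2",
    "",
    "OUT: [size=115]",
    "0x61560900: 41 8b 6e e4 movl -0x1c(%r14), %ebp",
    "0x61560904: 85 ed testl %ebp, %ebp",
]
print(find_block(SAMPLE, 0x60B6F9A9))
```

For the block quoted above this finds nothing for 0x60b6f9a9, since
that block covers host addresses 0x61560900 up to 0x61560973. Run over
the whole log (for example with open("qemu.log") as the lines
argument, where the file name is just a placeholder), it should report
the guest PC of the block that was actually executing.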
> At this point I should mention I don't know much about aarch64 but I
> didn't think it had wX registers.
In AArch64, wN is "the bottom 32 bits of register xN". So
mov w4, w0 is doing a 32 bit move (whereas mov x4, x0 would be
the 64 bit move) that will clear the top 32 bits of the destination
register.
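To make the difference concrete, here is a minimal Python model of the
two moves. The register file is just a dict and the function names
mov_w and mov_x are invented for this illustration; it is a sketch of
the architectural behaviour described above, not of how QEMU
implements it.

```python
MASK32 = 0xFFFFFFFF
MASK64 = 0xFFFFFFFFFFFFFFFF


def mov_w(regs, dst, src):
    """Model 'mov wDST, wSRC': copy the low 32 bits, zero the top 32."""
    regs[dst] = regs[src] & MASK32


def mov_x(regs, dst, src):
    """Model 'mov xDST, xSRC': copy all 64 bits."""
    regs[dst] = regs[src] & MASK64


# Register indices 0 and 4 stand for x0 and x4; the values are made up.
regs = {0: 0xFFFFFFFF00000006, 4: 0x0000004001210BD8}
mov_w(regs, 4, 0)
print(hex(regs[4]))  # low 32 bits of x0, upper half cleared

mov_x(regs, 4, 0)
print(hex(regs[4]))  # full 64-bit copy of x0
```

The x86 code in the OUT block quoted earlier appears to do the same
thing for the first guest instruction: it loads a 64-bit value, runs
'movl %ebp, %ebp' (a 32-bit register write on x86-64 zeroes the upper
half of the register), and stores the result back.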
thanks
-- PMM