Re: [Libunwind-devel] x86-64 libunwind status?

I'm hoping someone can help me understand the following lines in
x86_64/Gstep.c:

   line 142:     /* Heuristic to recognize a bogus frame pointer */
                 ret = dwarf_get (&c->dwarf, rbp_loc, &rbp1);
                 if (ret || ((rbp1 - rbp) > 0x4000))
                   rbp_loc = DWARF_NULL_LOC;

This is the case when

   - dwarf_step() failed
   - we're not in a signal frame
   - RBP is non-zero

When libunwind is used as an in-process unwinder, we need to be extremely careful not to dereference bogus pointers, which can kill the process with a SIGSEGV.

After dwarf_step() has failed and we've determined that we're not in a signal frame, we can't be sure that RBP is valid (i.e. we don't know whether the program was compiled with -fno-omit-frame-pointer). In other words, we're essentially guessing that RBP *may* be valid.

The above code was meant to be a defence against such cases (and bugs not yet found in libunwind).

Where did this heuristic come from? Is it really only trying to test if
*RBP is _greater_ than (RBP + 0x4000), or _within_ 0x4000?

Since the stack grows downwards on x86-64, *RBP (the caller's saved frame pointer) must be greater than RBP in the normal case.
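
To make the arithmetic concrete, here is a minimal standalone sketch of the same comparison. It is not the libunwind source: the helper name and the constant name are hypothetical, and it assumes the two values are unsigned 64-bit words, as unw_word_t is on x86-64.

```c
#include <stdint.h>

/* Threshold taken from the snippet quoted above; the macro name is hypothetical. */
#define MAX_FRAME_DELTA UINT64_C(0x4000)

/*
 * Hypothetical helper mirroring the (rbp1 - rbp) > 0x4000 test.
 * rbp       - current frame pointer
 * saved_rbp - the word read from *rbp (the caller's frame pointer, if valid)
 *
 * The subtraction is unsigned, so a saved value BELOW rbp wraps around
 * to a very large number and is rejected as well.
 * Returns 1 if the frame pointer looks bogus, 0 otherwise.
 */
static int frame_pointer_looks_bogus(uint64_t rbp, uint64_t saved_rbp)
{
    return (saved_rbp - rbp) > MAX_FRAME_DELTA;
}
```

Under that assumption the check answers both halves of the question at once: a saved value is accepted only if it lies at or above RBP and no more than 0x4000 bytes above it. A value such as a text-segment address, which is far below any stack address, wraps and is treated as bogus.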

If I debug a simple himom program on my system, with a breakpoint at main(),
I have

   (gdb) p/x $rbp
   $1 = 0x7fff97519570
   (gdb) x/1g $rbp
   0x7fff97519570: 0x0000000000400550

which are similar values to what I'm seeing.  Why would this be recognized
as a "bogus frame pointer?"

It all depends on how you compiled your himom program. Did you compile with or without frame pointers?

In general, I haven't run into a case where there is an unusual jump (> 0x4000 bytes) in the frame pointer value. If you're running into such a case, I'd like to reproduce it.

-Arun

From:	Arun Sharma
Subject:	Re: [Libunwind-devel] x86-64 libunwind status?
Date:	Wed, 17 Oct 2007 23:11:57 -0700