I'm hoping someone can help me understand the following lines in
x86_64/Gstep.c:
line 142:
    /* Heuristic to recognize a bogus frame pointer */
    ret = dwarf_get (&c->dwarf, rbp_loc, &rbp1);
    if (ret || ((rbp1 - rbp) > 0x4000))
      rbp_loc = DWARF_NULL_LOC;
This code is reached only when all of the following hold:
- dwarf_step() failed
- we're not in a signal frame
- RBP is non-zero
When libunwind is used as an in-process unwinder, we need to be extremely careful not to dereference bogus pointers, since doing so can kill the process with a SIGSEGV.
After dwarf_step() has failed and we've determined that we're not in a signal frame, we can't be sure that RBP is valid (i.e. we don't know whether the program was compiled with -fno-omit-frame-pointer). In other words, we're essentially guessing that RBP *may* be valid.
The above code was meant to be a defence against such cases (and bugs not yet found in libunwind).
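To illustrate the kind of care an in-process unwinder needs, here is a minimal sketch of one common POSIX trick for probing an address without risking a SIGSEGV: hand it to write() on a pipe and let the kernel report EFAULT. This is not necessarily what libunwind does; addr_is_readable is a hypothetical helper name used only for this example.

```c
#include <errno.h>
#include <stddef.h>
#include <unistd.h>

/* Hypothetical helper, not part of libunwind.
   Returns 1 if len bytes starting at addr can be read, 0 otherwise.
   The kernel copies from addr on our behalf, so an unreadable address
   shows up as a failed or short write instead of a SIGSEGV. */
int addr_is_readable(const void *addr, size_t len)
{
    int fds[2];
    ssize_t n;

    if (pipe(fds) != 0)
        return 0;               /* cannot probe; treat as unreadable */

    do {
        n = write(fds[1], addr, len);
    } while (n < 0 && errno == EINTR);

    close(fds[0]);
    close(fds[1]);

    return n >= 0 && (size_t) n == len;
}
```

A caller would check addr_is_readable((const void *) rbp, 8) before loading the saved frame pointer. Creating a pipe on every probe is slow, so a real unwinder would likely cache the descriptors or use a different mechanism.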
Where did this heuristic come from? Is it really only trying to test if
*RBP is _greater_ than (RBP + 0x4000), or _within_ 0x4000?
Since the stack grows downwards on x86-64, the caller's frame sits at a higher address than ours, so the saved frame pointer at *RBP must be greater than RBP in the normal case.
If I debug a simple himom program on my system, with a breakpoint at main(),
I have
(gdb) p/x $rbp
$1 = 0x7fff97519570
(gdb) x/1g $rbp
0x7fff97519570: 0x0000000000400550
which are similar values to what I'm seeing. Why would this be recognized
as a "bogus frame pointer"?
It all depends on how you compiled your himom program. Did you compile with or without frame pointers?
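To see how the values from the gdb session above interact with the check, here is a minimal sketch of the comparison on its own. It assumes rbp and rbp1 are unsigned 64-bit words (uint64_t stands in for the unwinder's word type here), and frame_pointer_looks_bogus is a hypothetical name used only for this example.

```c
#include <stdint.h>

/* Hypothetical helper mirroring the comparison quoted from Gstep.c.
   rbp  : current frame pointer
   rbp1 : value loaded from *rbp (the candidate caller frame pointer)
   With unsigned arithmetic, rbp1 < rbp wraps around to a huge value,
   so the single comparison rejects both "below rbp" and
   "more than 0x4000 above rbp". */
int frame_pointer_looks_bogus(uint64_t rbp, uint64_t rbp1)
{
    return (rbp1 - rbp) > 0x4000;
}
```

With rbp = 0x7fff97519570 and rbp1 = 0x400550, the difference wraps to a very large unsigned number, so the function returns 1. The value 0x400550 looks like a text-segment address rather than a stack address, which is what you would expect if RBP is not being used as a frame pointer at that point.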
In general, I haven't run into a case where there is an unusual jump (> 0x4000 bytes) in the frame pointer value. If you're running into such a case, I'd like to reproduce it.
-Arun