bug-gnu-emacs

bug#22790: 24.5; Infinite loop involving malloc called from signal handl


From: Eli Zaretskii
Subject: bug#22790: 24.5; Infinite loop involving malloc called from signal handler
Date: Fri, 04 Mar 2016 11:42:04 +0200

> Date: Mon, 29 Feb 2016 16:44:30 +0200
> CC: Eli Zaretskii <eliz@gnu.org>
> From: Andreas Gustafsson <gson@gson.org>
> 
> The lockup happened again.  There's still a SIGINT handler involved,
> but at least there is only one of them this time and not two recursive
> ones.
> 
> The full backtrace and some additional gdb output are included below,
> but I would think this two-line excerpt should be sufficient to
> identify the bug (or at least _a_ bug, if there is more than one):
> 
>   #9  0x00007f7ff60cc266 in printf () from /usr/lib/libc.so.12
>   #10 0x00000000004db715 in handle_interrupt (in_signal_handler=true) at keyboard.c:10364
> 
> That is, printf() is not a signal safe function, so emacs is invoking
> undefined behavior by calling it from a signal handler.

Is this a GUI session or a text-mode terminal (a.k.a. "TTY") session?
If the former, handle_interrupt is not called from a SIGINT handler.

In any case, this code is run as part of the so-called "emergency
escape", when you type C-g more than once while Emacs is busy doing
something that cannot be interrupted.  In that situation, we are way
past the point where invoking undefined behavior is of any concern,
because the only thing we can do then is auto-save and commit
suicide.  The printf call you see on the stack is asking the user
whether to auto-save, and the next question is whether to abort.
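
For reference, the distinction Andreas is drawing: POSIX lists write() among the async-signal-safe functions, while printf() is not on that list and, judging by the __smakebuf frame in the backtrace below, may call malloc to set up its stdio buffer.  The following is only a minimal C sketch of a prompt emitted with write(), not the code in keyboard.c; on_sigint_prompt, prompt_fd and demo_signal_safe_prompt are hypothetical names, and the prompt goes through a pipe purely so the result can be inspected.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical: descriptor the handler writes to (stdout by default). */
static volatile sig_atomic_t prompt_fd = STDOUT_FILENO;

/* Hypothetical SIGINT handler.  It emits the prompt with write(2), which
   POSIX lists as async-signal-safe, and never touches stdio or malloc. */
static void on_sigint_prompt(int sig)
{
    static const char msg[] = "Auto-save? (y or n) ";
    (void) sig;
    /* sizeof msg - 1 skips the terminating NUL. */
    ssize_t n = write(prompt_fd, msg, sizeof msg - 1);
    (void) n;
}

/* Installs the handler, raises SIGINT once, and copies whatever the
   handler wrote into buf.  Returns the byte count, or -1 on failure. */
static int demo_signal_safe_prompt(char *buf, size_t buflen)
{
    int fds[2];
    struct sigaction sa, old;

    assert(buflen > 0);
    if (pipe(fds) != 0)
        return -1;
    prompt_fd = fds[1];

    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_sigint_prompt;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGINT, &sa, &old) != 0) {
        close(fds[0]);
        close(fds[1]);
        prompt_fd = STDOUT_FILENO;
        return -1;
    }

    /* The handler runs to completion before raise() returns. */
    raise(SIGINT);

    ssize_t n = read(fds[0], buf, buflen - 1);
    if (n >= 0)
        buf[n] = '\0';

    /* Restore the previous disposition and clean up. */
    sigaction(SIGINT, &old, NULL);
    close(fds[0]);
    close(fds[1]);
    prompt_fd = STDOUT_FILENO;
    return (int) n;
}
```

Calling demo_signal_safe_prompt (buf, sizeof buf) leaves the prompt text in buf.  Reading the user's answer from inside a handler would need the same care, for example read() on the terminal descriptor rather than a stdio call.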

> > In any case, when this happens next, please use the procedure
> > described in etc/DEBUG for locating the place where Emacs loops, and
> > post that information.
> 
> As you can see from the gdb transcript below, the "step" function
> didn't work, but "stepi" shows it looping within libpthread.

You need to use "finish", not "step" or "stepi".  I don't think
the loop can reasonably be inside libpthread, so you should try
getting back to the Emacs application code and out of calls to library
functions.  Typing "finish" repeatedly until you are in some Emacs
code is the way to achieve that.  But this should be done without
typing C-g first, because otherwise you might be forcibly taken out of
the loop, and there's no easy way to return there.

And I still don't understand why the SIGINT handler is in the
picture.  Did you type C-g when this lockup happened?

> Even if you consider the backtrace to be suspect, code inspection
> should suffice to show that the line
> 
>           printf ("Auto-save? (y or n) ");
> 
> in src/keyboard.c can be executed from a signal handler.

Indeed, it can.  But I don't think this is the reason for the problem
you are describing.  That code cannot be entered unless you type C-g
twice or more in a TTY session while Emacs is already in some
un-interruptible loop or system call.  It is that loop or system call
that we need to identify in order to fix this problem.

> (gdb) where
> #0  0x00007f7ff6c083e2 in ?? () from /usr/lib/libpthread.so.1
> #1  0x00007f7ff6c08445 in ?? () from /usr/lib/libpthread.so.1
> #2  0x00007f7ff6c08848 in ?? () from /usr/lib/libpthread.so.1
> #3  0x00000000005c5486 in _malloc_internal (size=65536) at gmalloc.c:929
> #4  0x00000000005c54fc in malloc (size=65536) at gmalloc.c:953
> #5  0x00007f7ff60ed28c in __smakebuf () from /usr/lib/libc.so.12
> #6  0x00007f7ff60ed125 in __swsetup () from /usr/lib/libc.so.12
> #7  0x00007f7ff60cde92 in __vfprintf_unlocked () from /usr/lib/libc.so.12
> #8  0x00007f7ff60d1258 in vfprintf () from /usr/lib/libc.so.12
> #9  0x00007f7ff60cc266 in printf () from /usr/lib/libc.so.12
> #10 0x00000000004db715 in handle_interrupt (in_signal_handler=true) at keyboard.c:10364
> #11 0x00000000004db63e in handle_interrupt_signal (sig=2) at keyboard.c:10288
> #12 0x00000000004e8b63 in deliver_process_signal (sig=2, handler=0x4db5f1 <handle_interrupt_signal>) at sysdep.c:1570
> #13 0x00000000004db65a in deliver_interrupt_signal (sig=2) at keyboard.c:10295
> #14 <signal handler called>
> #15 0x00007f7ff6c083e2 in ?? () from /usr/lib/libpthread.so.1
> #16 0x00007f7ff6c08445 in ?? () from /usr/lib/libpthread.so.1
> #17 0x00007f7ff6c08848 in ?? () from /usr/lib/libpthread.so.1
> #18 0x00000000005c5486 in _malloc_internal (size=1000) at gmalloc.c:929
> #19 0x00000000005c54fc in malloc (size=1000) at gmalloc.c:953
> #20 0x0000000000534f0d in xmalloc (size=1000) at alloc.c:677
> #21 0x000000000057968f in Fprinc (object=8564569, printcharfun=11946034) at print.c:656
> #22 0x000000000057a544 in print_error_message (data=41076294, stream=11944965, context=0x0, caller=11946034) at print.c:919
> #23 0x000000000057a238 in Ferror_message_string (obj=41076294) at print.c:844
> #24 0x000000000050e40e in auto_save_error (error_val=41076294) at fileio.c:5425
> #25 0x000000000055787a in internal_condition_case (bfun=0x50e477 <auto_save_1>, handlers=11946082, hfun=0x50e3bf <auto_save_error>) at eval.c:1345
> #26 0x000000000050eb76 in Fdo_auto_save (no_message=11946082, current_only=11946034) at fileio.c:5672
> #27 0x00000000004cde3c in read_char (commandflag=1, map=41075894, prev_event=11946034, used_mouse_menu=0x7f7fffff9c0f, end_time=0x0) at keyboard.c:2751
> #28 0x00000000004d932a in read_key_sequence (keybuf=0x7f7fffff9df0, bufsize=30, prompt=11946034, dont_downcase_last=false, can_return_switch_frame=true, fix_current_buffer=true, prevent_redisplay=false) at keyboard.c:9089
> #29 0x00000000004cb5b0 in command_loop_1 () at keyboard.c:1453
> #30 0x0000000000557882 in internal_condition_case (bfun=0x4cb1f1 <command_loop_1>, handlers=12016002, hfun=0x4cab3b <cmd_error>) at eval.c:1348
> #31 0x00000000004caf5d in command_loop_2 (ignore=11946034) at keyboard.c:1178
> #32 0x00000000005570b5 in internal_catch (tag=12108690, func=0x4caf37 <command_loop_2>, arg=11946034) at eval.c:1112
> #33 0x00000000004caec0 in command_loop () at keyboard.c:1149
> #34 0x00000000004ca737 in recursive_edit_1 () at keyboard.c:778
> #35 0x00000000005017dd in read_minibuf (map=40555366, initial=37407873, prompt=18302785, expflag=false, histvar=12034962, histpos=0, defalt=11946034, allow_props=false, inherit_input_method=false) at minibuf.c:674
> #36 0x0000000000501ffd in Fread_from_minibuffer (prompt=18302785, initial_contents=37407873, keymap=40555366, read=11946034, hist=12034962, default_value=11946034, inherit_input_method=11946034) at minibuf.c:957
> #37 0x000000000055ab18 in Ffuncall (nargs=8, args=0x7f7fffffa398) at eval.c:2837
> #38 0x0000000000599506 in exec_byte_code (bytestr=9425233, vector=9425269, maxdepth=72, args_template=8200, nargs=8, args=0x7f7fffffa918) at bytecode.c:916

This tells the following story (reading the backtrace from the bottom
up):

 . Emacs was running some byte code (frame #38)
 . that byte code tried to read from the minibuffer (frames #35-#37),
   probably after asking some question or prompting for some input
 . as part of that prompt, Emacs attempted to auto-save modified
   buffers (frame #26)
 . the auto-save attempt signaled an error (frames #24-#25)
 . Emacs wanted to display the error message, and called malloc
   (frames #18-#23)
 . then somehow SIGINT was delivered (frame #14)

Does this match what you were doing?  Any reason why auto-saving could
fail (some filesystem that could be off-line, for example)?  And where
did that SIGINT come from?
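
One reading of the backtrace, offered as an assumption and not a confirmed diagnosis: frames #15-#17 and #0-#2 show the same libpthread addresses, which would be consistent with malloc being interrupted while it held an internal lock and the handler's printf then calling malloc again and waiting for that lock.  The C sketch below models this with a plain flag instead of a real mutex, so it detects the situation instead of hanging; toy_malloc, toy_lock_held, on_sigint_check and demo_reentrancy are hypothetical names and have nothing to do with gmalloc.c.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <string.h>

/* Hypothetical stand-in for the allocator's internal lock: 1 while held. */
static volatile sig_atomic_t toy_lock_held = 0;

/* Set by the handler when it arrives while the "lock" is held. */
static volatile sig_atomic_t would_deadlock = 0;

/* Hypothetical handler.  A handler that really called printf() -> malloc()
   here would wait on the lock forever, because the interrupted code cannot
   resume and release it until the handler returns.  This one only records
   the fact. */
static void on_sigint_check(int sig)
{
    (void) sig;
    if (toy_lock_held)
        would_deadlock = 1;
}

/* Toy allocation: take the "lock", optionally get interrupted by SIGINT
   in the middle of the critical section, then release the "lock". */
static void toy_malloc(int interrupt)
{
    toy_lock_held = 1;
    if (interrupt)
        raise(SIGINT);      /* handler runs before raise() returns */
    toy_lock_held = 0;
}

/* Returns 1 if the handler observed the lock held (the case where a real
   blocking lock would hang), 0 otherwise. */
static int demo_reentrancy(int interrupt)
{
    struct sigaction sa, old;

    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_sigint_check;
    sigemptyset(&sa.sa_mask);
    int rc = sigaction(SIGINT, &sa, &old);
    assert(rc == 0);
    (void) rc;

    would_deadlock = 0;
    toy_malloc(interrupt);

    /* Restore the previous SIGINT disposition. */
    sigaction(SIGINT, &old, NULL);
    return would_deadlock;
}
```

In this model, demo_reentrancy (1) reports the hazard and demo_reentrancy (0) does not.  Whether the real lockup follows this pattern still depends on the answers to the questions above.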

Thanks.




