bug#38748: 28.0.50; crash on MacOS 10.15.2
From: Robert Pluim
Subject: bug#38748: 28.0.50; crash on MacOS 10.15.2
Date: Fri, 10 Jan 2020 09:58:52 +0100
>>>>> On Fri, 10 Jan 2020 10:27:45 +0200, Eli Zaretskii <eliz@gnu.org> said:
>> From: Pip Cet <pipcet@gmail.com>
>> Date: Fri, 10 Jan 2020 07:32:07 +0000
>> Cc: rpluim@gmail.com, alan@idiocy.org, jguenther@gmail.com,
>> andreyk.mad@gmail.com, 38748@debbugs.gnu.org
>>
>> > The backtrace shows a very recursive GC, it doesn't show any other
>> > function being deeply recursive. So I'm not sure I understand what
>> > tail-recursive function did you have in mind. Can you elaborate?
>>
>> I can. I think we're looking at two bugs: the first is the simple
>> use-after-free of XFRAME (frame)->output_data.ns where `frame' is a
>> dead frame. I've confirmed on GNU/Linux that mark_frame is called for
>> a frame for which x_free_frame_resources has already been called, if
>> there's a global variable still referencing the frame. I think the
>> same thing happens on macOS.
Eli> This one doesn't depend on the 'ok's initialization in
Eli> face_inherited_attr in any way, does it?
No, it doesnʼt.
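For readers following along, here is a minimal sketch of the failure mode Pip describes: a frame whose window-system data has been released but which is still reachable, so the GC mark step visits it anyway. Everything named toy_* is hypothetical and heavily simplified. It is not the real mark_frame or x_free_frame_resources, and it is not the change that was made in nsterm.m.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Toy stand-ins, NOT the real Emacs structures.  */
struct toy_output { int marked; };

struct toy_frame
{
  struct toy_output *output_data;  /* NULL once resources are freed */
};

/* Models releasing the window-system resources of a deleted frame.
   Clearing the pointer is what makes a later liveness check possible.  */
static void
toy_free_frame_resources (struct toy_frame *f)
{
  free (f->output_data);
  f->output_data = NULL;
}

static bool
toy_frame_live_p (const struct toy_frame *f)
{
  return f->output_data != NULL;
}

/* Models the GC mark step.  The frame object may still be reachable
   (for example from a global variable), so the window-system data is
   only touched when the frame is live.  Returns true if it was marked.  */
static bool
toy_mark_frame (struct toy_frame *f)
{
  assert (f != NULL);
  if (!toy_frame_live_p (f))
    return false;            /* dead frame: dereferencing would be a use-after-free */
  f->output_data->marked = 1;
  return true;
}
```

Without the liveness check, toy_mark_frame would dereference freed memory for a dead frame, which is the shape of the first bug discussed in the quoted text.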
>> 1. I think face_inherited_attr is being optimized to tail-call itself
>> rather than calling itself in a new stack frame; thus, it loops
>> indefinitely for a faulty face setup which would otherwise lead to an
>> immediate crash.
>> 1b. that optimization only works without the harmless initialization of
>> "ok".
>>
>> 2. Our initial face setup is faulty in the sense above.
>>
>> 3. Something happens on a secondary thread which causes our face setup
>> to become non-faulty, possibly during GC.
Eli> What do you mean by "secondary thread"? And how can GC modify Lisp
Eli> data structures? that'd be a terrible bug.
Eli> In any case, the full backtrace shows no trace of face_inherited_attr
Eli> call anywhere in the callstack, so if there is indeed infinite
Eli> recursion in that function, it was somehow exited long ago by the time
Eli> GC runs.
Eli> As for the tail-recursion part: do you see any sign of that in the
Eli> disassembly posted by Robert? I didn't, but maybe I missed
Eli> something. And such subtleties should only rear their ugly heads in
Eli> optimized code, whereas we already know that an unoptimized build
Eli> crashes in the same way.
Iʼm attaching the disassembly of face_inherited_attr with -O2, with
and without the change to 'ok'. I canʼt see any tail recursion, and
modulo the use of r14 rather than r13, the only change I can see is
right at the end, where the return value is set up (disclaimer: Iʼm
not fluent in x86 assembler).
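To make the tail-call hypothesis concrete (it is only a hypothesis, and the disassembly does not appear to support it), here is a sketch with a toy inheritance lookup. The toy_* names and the struct layout are invented for illustration and do not correspond to the real face_inherited_attr. The point is only that a self-call in tail position may be compiled as a jump, in which case a cyclic inheritance chain spins forever instead of overflowing the stack.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical toy face, NOT the real Emacs face representation.  */
struct toy_face
{
  int attr;                       /* -1 means "unspecified, ask the parent" */
  const struct toy_face *parent;  /* NULL at the end of the chain */
};

/* Recursive form.  The self-call is the last thing the function does,
   so an optimizing compiler MAY reuse the current stack frame for it.
   Do not call this on a cyclic chain.  */
static int
toy_inherited_attr (const struct toy_face *face)
{
  if (face == NULL)
    return -1;
  if (face->attr != -1)
    return face->attr;
  return toy_inherited_attr (face->parent);  /* tail call */
}

/* What the function effectively becomes if the tail call is turned
   into a jump.  On a cyclic chain this would loop indefinitely; the
   max_steps bound exists only so the demonstration terminates, and
   -2 is returned when the bound is hit.  */
static int
toy_inherited_attr_loop (const struct toy_face *face, int max_steps)
{
  assert (max_steps >= 0);
  for (int i = 0; i < max_steps; i++)
    {
      if (face == NULL)
        return -1;
      if (face->attr != -1)
        return face->attr;
      face = face->parent;
    }
  return -2;
}
```

Whether a compiler actually performs this transformation depends on the optimization level and the surrounding code, which is why comparing the two disassemblies is the right check.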
Eli> I still think the shortest way to finding the culprit here is to
Eli> patiently and painfully go over the last_marked array, deciphering
Eli> the Lisp object we marked, until we succeed in identifying the Lisp
Eli> data structure which got corrupted. Once we succeed in identifying
Eli> that data structure, it should be relatively easy to find who and
Eli> where corrupts it. This may mean a lot of inconvenient drudgery,
Eli> exacerbated by the fact that having a functional GDB on macOS is not
Eli> easy, but I don't think we have a better way at this point.
Itʼs possible that there is only one bug. The Emacs build Iʼve been
using with the change in nsterm.m suggested by Pip has been completely
stable. If it does crash again, I can trawl through last_marked.
Robert
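As background for the last_marked suggestion: the idea is a fixed-size record of the objects the GC marked most recently, which can be walked backwards from the crash. The sketch below is a simplified model under that assumption. The toy_* names, the buffer size, and the use of plain pointers are all made up for illustration and are not the actual definitions in alloc.c.

```c
#include <assert.h>
#include <stddef.h>

/* Tiny on purpose, so the wraparound is easy to follow.  */
enum { TOY_LAST_MARKED_SIZE = 8 };

static const void *toy_last_marked[TOY_LAST_MARKED_SIZE];
static size_t toy_last_marked_index;  /* next slot to overwrite */

/* Record an object as it is marked, overwriting the oldest entry.  */
static void
toy_record_marked (const void *obj)
{
  toy_last_marked[toy_last_marked_index] = obj;
  toy_last_marked_index = (toy_last_marked_index + 1) % TOY_LAST_MARKED_SIZE;
}

/* Return the object marked AGE steps before the most recent one;
   age 0 is the most recent entry.  */
static const void *
toy_marked_ago (size_t age)
{
  assert (age < TOY_LAST_MARKED_SIZE);
  size_t i = (toy_last_marked_index + TOY_LAST_MARKED_SIZE - 1 - age)
             % TOY_LAST_MARKED_SIZE;
  return toy_last_marked[i];
}
```

In a debugger one would start at age 0 and work backwards, decoding each entry until the first object that looks corrupted turns up.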
Attachment: unmodified-optimized.txt (text document)
Attachment: modified-optimized.txt (text document)