bug#70914: 29.3; Crashes often on Windows

bug-gnu-emacs

From:	Simen Endsjø
Subject:	bug#70914: 29.3; Crashes often on Windows
Date:	Tue, 21 May 2024 22:31:52 +0200
Look at that! I tried running it twice, and it reported the same location both
times.

This is when opening my "system.org" file. My D: drive is a VHD DevDrive
(ReFS), but I have been experiencing crashes since well before I migrated to a
DevDrive.

I have some symlinked folders further down the tree too, and Developer Mode is
enabled, which allows me to create symlinks without admin rights.

I also have LongPathsEnabled set; see
https://learn.microsoft.com/en-us/windows/win32/fileio/maximum-file-path-limitation?tabs=registry

    Reading symbols from D:\tmp\emacs-bug-70914\emacs-extra-checks-no-ucrt\simendsjo-build\bin\emacs.exe...
    (gdb) start --init-directory=d:/.emacs.d
    warning: could not convert 'main' from the host encoding (CP65001) to UTF-32.
    This normally should not happen, please file a bug report.
    Temporary breakpoint 1 at 0x40015d096: file emacs.c, line 1242, column 8.
    Starting program: D:\tmp\emacs-bug-70914\emacs-extra-checks-no-ucrt\simendsjo-build\bin\emacs.exe --init-directory=d:/.emacs.d
    [New Thread 6960.0x77ec]
    [New Thread 6960.0x7634]
    [New Thread 6960.0x22c8]

    Thread 1 hit Temporary breakpoint 1, main (argc=2, argv=0x1d4930) at emacs.c:1242
    1242      bool no_loadup = false;
                   ^
    (gdb) record btrace pt
    (gdb) c
    Continuing.
    [New Thread 6960.0x5b54]
    [New Thread 6960.0x531c]
    [New Thread 6960.0x6934]
    [Thread 6960.0x6934 exited with code 1]
    [New Thread 6960.0x1a5c]
    [Thread 6960.0x1a5c exited with code 1]
    [New Thread 6960.0x4d10]
    [Thread 6960.0x4d10 exited with code 1]
    [New Thread 6960.0x4148]
    [Thread 6960.0x4148 exited with code 1]
    [New Thread 6960.0x715c]
    [Thread 6960.0x715c exited with code 1]
    [New Thread 6960.0x8a0]
    [New Thread 6960.0x4874]
    [Thread 6960.0x4874 exited with code 1]
    [New Thread 6960.0x6fbc]
    [Thread 6960.0x6fbc exited with code 1]
    [New Thread 6960.0x240]
    [Thread 6960.0x240 exited with code 1]
    [New Thread 6960.0x158c]
    [Thread 6960.0x158c exited with code 1]
    [New Thread 6960.0x212c]
    [New Thread 6960.0x14e4]
    [New Thread 6960.0x6558]
    [New Thread 6960.0x5614]
    [New Thread 6960.0x2010]
    [New Thread 6960.0x234]
    [Thread 6960.0x234 exited with code 1]
    [New Thread 6960.0x7258]
    [Thread 6960.0x7258 exited with code 1]
    [New Thread 6960.0x4db0]
    [Thread 6960.0x4db0 exited with code 1]
    [New Thread 6960.0x29cc]
    [Thread 6960.0x29cc exited with code 1]

    Thread 1 received signal SIGSEGV, Segmentation fault.
    0x0000000000000000 in ?? ()
    (gdb) bt
    #0  0x0000000000000000 in ?? ()
    #1  0x0000000000000000 in ?? ()
    (gdb) reverse-stepi
    0x00007ff7beeabe9d in get_volume_info (name=<unavailable>, pPath=<unavailable>) at w32.c:3502
    3502    }
            ^
    (gdb) bt
    #0  0x00007ff7beeabe9d in get_volume_info (name=<unavailable>, pPath=<unavailable>) at w32.c:3502
    #1  0x0000000000000000 in ?? ()
    Backtrace stopped: not enough registers or memory available to unwind further

On Tue, May 21, 2024 at 9:05 PM Hannes Domani <ssbssa@yahoo.de> wrote:
>
> On Tuesday, 21 May 2024 at 20:31:37 CEST, Eli Zaretskii <eliz@gnu.org> wrote:
>
> > > From: Simen Endsjø <simendsjo@gmail.com>
> > > Date: Tue, 21 May 2024 19:39:13 +0200
> > > Cc: 70914@debbugs.gnu.org, corwin@bru.st
> > >
> > > > Could you please show the last few
> > > > DLLs loaded by thread 1?
> > >
> > > Here's only Thread 1:
> >
> > Thanks.
> >
> > I'm sorry, I'm out of ideas.  Maybe someone else will have
> > suggestions.  Or maybe we are lucky and someone else reports similar
> > crashes with additional info.  I can only say that Emacs works
> > flawlessly for me on Windows 11 (but it's Emacs I build myself with
> > optional libraries most of which I also built myself).
> >
> > I asked on the GDB list for suggestions how to debug such crashes,
> > maybe someone there will come up with a useful suggestion.
>
> I saw it on the GDB list, and tried to reproduce a similar backtrace
> with just zeros.
>
> I came up with this:
> ```
> // compile with -g -O1 -foptimize-sibling-calls
>
> #include <string.h>
>
> int other_function(int x)
> {
>   return x + 3;
> }
>
> int some_function(int size)
> {
>   int (*function_ptr)(int) = &other_function;
>   memset(&function_ptr, 0, size);
>   return function_ptr(5);
> }
>
> int main()
> {
>   int y = some_function(100);
>   return y;
> }
> ```
>
> So when I intentionally break the stack and make it 'jump' there with
> a sibling call, I get this result in GDB:
> ```
> C:\src\test>gcc -o no-bt.exe no-bt.c -g -O1 -foptimize-sibling-calls
>
> C:\src\test>gdb -q no-bt.exe
> Reading symbols from no-bt.exe...
> (gdb) r
> Starting program: C:\src\test\no-bt.exe
> [New Thread 7632.0x4328]
>
> Thread 1 received signal SIGSEGV, Segmentation fault.
> 0x0000000000000000 in ?? ()
> (gdb) bt
> #0  0x0000000000000000 in ?? ()
> #1  0x0000000000000000 in ?? ()
> (gdb)
> ```
>
> Usually I would say it's not possible to find out anything from here,
> but if you have a recent Win10, and a recent enough Intel CPU (which I
> think you do from what I saw in this ticket), then you could try
> out my GDB build [1] which includes some extra stuff that I haven't
> upstreamed (yet).
>
> In particular I ported 'record btrace pt' [2] to Windows.
>
> With the same example I can do this:
> ```
> C:\src\test>gdb -q no-bt.exe
> Reading symbols from no-bt.exe...
> (gdb) start
> Temporary breakpoint 1 at 0x7ff7a2d2163a: file no-bt.c, line 18.
> Starting program: C:\src\test\no-bt.exe
> [New Thread 3924.0x3afc]
>
> Thread 1 hit Temporary breakpoint 1, main () at no-bt.c:18
> 18      {
> (gdb) record btrace pt
> (gdb) c
> Continuing.
>
> Thread 1 received signal SIGSEGV, Segmentation fault.
> 0x0000000000000000 in ?? ()
> (gdb) bt
> #0  0x0000000000000000 in ?? ()
> #1  0x0000000000000000 in ?? ()
> (gdb) reverse-stepi
> 0x00007ff7a2d21637 in some_function (size=<optimized out>) at no-bt.c:14
> 14        return function_ptr(5);
> (gdb) bt
> #0  0x00007ff7a2d21637 in some_function (size=<optimized out>) at no-bt.c:14
> #1  0x00007ff7a2d2164e in main () at no-bt.c:19
> Backtrace stopped: not enough registers or memory available to unwind further
> (gdb)
> ```
>
> So with the recording it's possible to go back one instruction from the
> crash, and it shows me the location it jumped from.
>
> Note that this is not a 'full' recording, so the amount of steps
> to go back is limited (though the buffer-size for that can be changed
> with e.g.  'set record btrace pt buffer-size 65536'), and you
> can NOT inspect variables while replaying the execution.
>
> But for this kind of problem it's hopefully enough to see what the
> previous instructions were.
>
>
> Hannes
>
>
> [1] https://github.com/ssbssa/gdb/releases
> [2] https://sourceware.org/gdb/current/onlinedocs/gdb.html/Process-Record-and-Replay.html