bug#50733: 28.0.1; project-find-regexp can block Emacs for a long time
From: Dmitry Gutov
Subject: bug#50733: 28.0.1; project-find-regexp can block Emacs for a long time
Date: Thu, 23 Sep 2021 02:09:16 +0300
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:78.0) Gecko/20100101 Thunderbird/78.13.0
On 23.09.2021 00:58, Daniel Martín wrote:
> Dmitry Gutov <dgutov@yandex.ru> writes:
>
>> IIRC you are using macOS. I received another report recently that
>> find/grep based tooling, and project-find-regexp in particular, are
>> pretty slow on that OS.
>
> Yes, this is on macOS.
>
>> When you say "block for a long time", how long are we talking about?
>> To try it, evaluate
>>
>>   (benchmark 1 '(project-find-regexp "new-collection"))
>
> I usually work on a monorepo with ~67000 tracked files (many of them big
> binary files). Here's what I get when using ripgrep as the xref search
> program:
>
>   Elapsed time: 36.087181s (8.067474s in 22 GCs)
Thanks for testing. Did the switch to ripgrep help much?
I wonder if we should advertise this setting and recommendation more
prominently, at least until we get auto-detection.
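For reference, the setting in question is the user option xref-search-program. A minimal init-file sketch, assuming the rg executable is installed and can be found on exec-path (the executable-find guard is just an illustration, not something the option requires):

```elisp
;; Use ripgrep for Xref searches (project-find-regexp and friends)
;; when the `rg' executable is available; otherwise keep the default.
(when (executable-find "rg")
  (setq xref-search-program 'ripgrep))
```

The option can also be set through M-x customize-variable.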
> Running the same search with ripgrep from the command line takes around
> 6 seconds.
Is that with an SSD?
Your project sounds respectable. The torvalds-linux repo I have checked
out here is also 70000 files, but I guess your files are bigger.
>> Another benchmark to try is
>>
>>   (benchmark 1 '(project-files (project-current)))
>
>   Elapsed time: 1.590223s (0.432372s in 1 GCs)
That's a while (I wonder if you find 'project-find-file' usable with
this kind of performance), but still better than I might have expected.
> Here's an ELisp profile of the first benchmark:
>
>         8696  78% - command-execute
>         8696  78%  - call-interactively
>         8493  76%   - funcall-interactively
>         8480  76%    - eval-expression
>         8479  76%     - eval
>         8479  76%      - project-find-regexp
>         8227  74%       - xref--show-xrefs
>         8227  74%        - xref--show-xref-buffer
>         5584  50%         - #<compiled 0x140b5a40100bafc6>
>         5584  50%          - apply
>         5584  50%           - project--find-regexp-in-files
>         5574  50%            - xref-matches-in-files
>         3016  27%             - xref--convert-hits
>         3000  27%              - mapcan
>         2992  27%               - #<compiled -0x6cdcd56218925c3>
>         2734  24%                - xref--collect-matches
>         2094  18%                 - xref--collect-matches-1
>          800   7%                  + xref-make-match
>          774   7%                  + xref-make-file-location
>          104   0%                   xref--find-file-buffer
>           80   0%                   file-remote-p
>           51   0%                   xref--regexp-syntax-dependent-p
>          906   8%             + xref--process-file-region
>          331   2%               sort
>         1413  12%         + xref--analyze
>         1230  11%         + xref--show-common-initialize
>          249   2%       + project-files
>            3   0%       + project-current
>            9   0%    + minibuffer-complete
>            4   0%    + execute-extended-command
>          203   1%   + byte-code
>         2314  20% - ...
>         2314  20%    Automatic GC
>           27   0% + timer-event-handler
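For anyone who wants to collect a comparable profile on their own project, here is a minimal sketch using the built-in profiler. The search term is the one from the benchmark above; how the quoted profile was actually collected is not stated, so treat this as one possible way:

```elisp
;; Sample CPU usage while the search runs, then show the call tree.
(require 'profiler)
(profiler-start 'cpu)
(project-find-regexp "new-collection")
(profiler-stop)
(profiler-report)
```

In the report buffer, entries marked with + can be expanded to see their callees.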
When you have a lot of matches, at some point Lisp overhead is going to
show up. E.g., the searches seem almost instantaneous with up to several
thousand matches here, but with tens or hundreds of thousands, yes, I
have to wait.
Help with optimizations in that area (in and around xref-matches-in-files
and xref--convert-hits) is welcome, but I'm not sure how much more we
can squeeze out.
> The search time is reduced when I use a more specific search term,
> presumably because the number of results is lower and the Elisp
> post-processing takes less time. Here's what I got, for example, when I
> searched for something with results from only one file:
>
>   Elapsed time: 6.859815s (0.864738s in 2 GCs)
>
> Comparing this with the time taken by the same query from the command
> line (6.5s) shows that the Elisp post-processing time is probably
> negligible in this scenario.
It's a good result. A little suspicious, though: given that
project-find-regexp calls project-files first, and the latter takes
1.5s, the difference should be about that much. But I guess rg also needs
to traverse the directory tree, and spends some time doing that too.
As for what else can be done: again, if someone wants to investigate an
asynchronous/nonblocking API for Xref (or using threads), they are
welcome to. The case where most of the time is spent in the subprocess is
a good fit for that. But I don't think we'll manage this for the upcoming
release.
Another thing you can do is set up additional ignores for the
project. If those big binary files are not something you are interested
in searching or touching, you could add ignore entries for them. When
the VC project backend is in use (the default), this is currently done
via .dir-locals.el: the variable is project-vc-ignores, a list of
strings that should be globs. See its docstring and the explanation in
the docstring of project-ignores.
Note that ignores also affect project-find-file.
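A minimal sketch of such a .dir-locals.el, placed in the project root. The glob patterns are hypothetical placeholders, since I don't know what the binary files in your repository look like:

```elisp
;;; .dir-locals.el in the project root
;; Hypothetical patterns: replace them with globs that match the
;; files and directories you never want to search.
;; An entry ending in "/" matches directories only; an entry starting
;; with "./" is anchored at the project root.
((nil . ((project-vc-ignores . ("*.bin" "*.iso" "./assets/")))))
```

Buffers that are already open may need to be revisited before the new value takes effect.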