bug-gnu-emacs


bug#64735: 29.0.92; find invocations are ~15x slower because of ignores


From: Eli Zaretskii
Subject: bug#64735: 29.0.92; find invocations are ~15x slower because of ignores
Date: Sat, 22 Jul 2023 14:58:46 +0300

> From: sbaugh@catern.com
> Date: Sat, 22 Jul 2023 10:38:37 +0000 (UTC)
> Cc: Spencer Baugh <sbaugh@janestreet.com>, dmitry@gutov.dev,
>       yantar92@posteo.net, michael.albinus@gmx.de, rms@gnu.org,
>       64735@debbugs.gnu.org
> 
> Eli Zaretskii <eliz@gnu.org> writes:
> > No, the first step is to use in Emacs what Find does today, because it
> > will already be a significant speedup.
> 
> Why bother?  directory-files-recursively is a rarely used API, as you
> have mentioned before in this thread.

Because we could then use it much more (assuming the result will be
performant enough -- this remains to be seen).

> And there is a way to speed it up which will have a performance boost
> which is unbeatable any other way: Use find instead of
> directory-files-recursively, and operate on files as they find prints
> them.

Not every command can operate on the output sequentially: some need to
see all of the output, others will need to be redesigned and
reimplemented to support such sequential mode.

Moreover, piping from Find incurs overhead: data is broken into blocks
by the pipe or PTY, reading the data can be slowed down if Emacs is
busy processing something, etc.
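
To illustrate the point about blocks: a reader of Find's output cannot
assume that each chunk it receives ends on a file-name boundary, so it
has to carry partial names over from one chunk to the next.  The
following is only a rough sketch in Python (not Emacs code); the name
split_file_names is hypothetical, and it assumes newline-terminated
output, which breaks for file names that contain newlines (find's
-print0 would avoid that).

```python
def split_file_names(chunks):
    """Yield complete file names from arbitrarily split chunks of find output.

    CHUNKS is an iterable of bytes objects, as they might arrive from a pipe.
    Assumes one file name per line (hypothetical helper, for illustration).
    """
    pending = b""  # tail of the previous chunk that did not end in a newline
    for chunk in chunks:
        pending += chunk
        # Everything before the last newline is complete; the rest must wait.
        *complete, pending = pending.split(b"\n")
        for name in complete:
            if name:
                yield name.decode()
    if pending:
        # The stream ended without a trailing newline.
        yield pending.decode()
```

For example, feeding it the chunks b"./src/ma", b"in.c\n./READ" and
b"ME\n" yields "./src/main.c" and then "./README".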

So I think a primitive that traverses the tree and produces file names
with or without attributes, and can call some callback if needed,
still has its place.
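
As a rough illustration of the shape such a primitive could take, here
is a sketch in Python.  It is not a proposed Emacs API: the name
walk_tree and its parameters are hypothetical, and a real primitive
would be written in C and would have to deal with ignores, errors and
remote files.

```python
import os

def walk_tree(root, callback=None, with_attributes=False):
    """Return the non-directory entries under ROOT (hypothetical sketch).

    With WITH_ATTRIBUTES, each item is a (path, os.stat_result) pair;
    otherwise it is just the path.  If CALLBACK is given, it is called
    with each item as soon as that item is found.
    """
    results = []
    pending = [root]  # directories still to be read
    while pending:
        directory = pending.pop()
        with os.scandir(directory) as entries:
            for entry in entries:
                # Symlinks are not followed, so a symlink to a directory
                # is reported as an ordinary entry rather than descended into.
                if entry.is_dir(follow_symlinks=False):
                    pending.append(entry.path)
                    continue
                if with_attributes:
                    item = (entry.path, entry.stat(follow_symlinks=False))
                else:
                    item = entry.path
                if callback is not None:
                    callback(item)
                results.append(item)
    return results
```

A caller that needs to see all of the output uses the returned list; a
caller that can work sequentially passes a callback instead.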

> Since this runs the directory traversal in parallel with Emacs, it
> has a speed advantage that is impossible to match in
> directory-files-recursively.

See above: you have an optimistic view of what actually happens in the
relevant use cases.

> We can fall back to directory-files-recursively when find is not
> available.

Find is already available today on many platforms, and we are
evidently not happy enough with the results.  That is the trigger for
this discussion, isn't it?  We are talking about ways to improve the
performance, and I think having our own primitive that can do it is
one such way, or at least it has not been shown that it cannot be.

> > Optimizing the case of a long
> > list of omissions should come later, as it is a minor optimization.
> 
> This seems wrong.  directory-files-recursively is rarely used, and rgrep
> is a very popular command, and this problem with find makes rgrep around
> ~10x slower by default.  How in any world is that a minor optimization?
> Most Emacs users will never realize that they can speed up rgrep
> massively by setting grep-find-ignored-files to nil.  Indeed, no-one
> realized that until I just pointed it out.  In my experience, they just
> stop using rgrep in favor of other third-party packages like ripgrep,
> because "grep is slow".

Making grep-find-ignored-files smaller is independent of this
particular issue.  If we can make it shorter, we should.




