bug#64735: 29.0.92; find invocations are ~15x slower because of ignores
From: Dmitry Gutov
Subject: bug#64735: 29.0.92; find invocations are ~15x slower because of ignores
Date: Thu, 20 Jul 2023 21:54:32 +0300
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:102.0) Gecko/20100101 Thunderbird/102.13.0
On 20/07/2023 16:43, Spencer Baugh wrote:
>> That's only a problem when the default file listing logic is used (and
>> we usually delegate to something like 'git ls-files' instead, when the
>> vc-aware backend is used).
>
> Hm, yes, but things like C-u project-find-regexp will use the default
> find-based file listing logic instead of git ls-files, as do a few other
> things.
Right.
> I wonder, could we just go ahead and make a vc function which is
> list-files(GLOBS) and returns a list of files? Both git and hg support
> this. Then we could have C-u project-find-regexp use that instead of
> find, by taking the cross product of dirs-to-search and
> file-name-patterns-to-search. (And this would let me delete a big chunk
> of my own project backend, so I'd be happy to implement it.)
I started out on this inside the branch scratch/project-regen. Didn't
have time to dedicate to it recently, but the basics are there, take a
look (the method is called project-files-filtered).
The difficulty with making such changes is that the project protocol
grows in size, and it becomes difficult for a user to understand what is
mandatory, what's obsolete, and how to use it, especially in the face of
backward compatibility requirements.
Take a look, feedback is welcome, it should help move this forward. We
should also transition to returning relative file names when possible,
for performance (optionally or always).
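For illustration only, here is a rough Python sketch of the "cross product of dirs-to-search and file-name-patterns-to-search" idea quoted above. It is not what scratch/project-regen does; the function name git_ls_files_argv is hypothetical, and it assumes the directories are given relative to the repository root. It only builds the argument list, relying on git's ":(glob)" pathspec magic, where "**/" matches any number of directory levels.

```python
from itertools import product
from typing import Iterable, List


def git_ls_files_argv(dirs: Iterable[str], patterns: Iterable[str]) -> List[str]:
    """Build a `git ls-files` command line from directories and glob patterns.

    Hypothetical helper: `dirs` are assumed to be relative to the repository
    root, and `patterns` are file-name globs such as "*.c".
    """
    pathspecs = []
    for directory, pattern in product(dirs, patterns):
        directory = directory.rstrip("/")
        # "." or "" means the repository root, so no directory prefix is needed.
        prefix = "" if directory in ("", ".") else directory + "/"
        # ":(glob)" switches the pathspec to glob matching, where "**/"
        # matches zero or more directory levels below the prefix.
        pathspecs.append(f":(glob){prefix}**/{pattern}")
    # -z gives NUL-terminated names; "--" separates options from pathspecs.
    return ["git", "ls-files", "-z", "--", *pathspecs]
```

The resulting list could be handed to a process-spawning call as is; whether one invocation with many pathspecs beats several narrower invocations is something that would need measuring.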
> Fundamentally it seems a little silly for project-ignores to ever be
> used for a vc project; if the vcs gives us ignores, we can probably just
> ask the vcs to list the files too, and it will have an efficient
> implementation of that.
Possibly, yes. But there will likely remain cases when the project-files
could stay useful for callers, to construct some bigger command line for
some new feature. Though perhaps we'll be able to drop that need by
extracting the theoretically best performance from project-files (using
a process object or some abstraction), to facilitate low-overhead piping.
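To make the "process object or some abstraction" idea a bit more concrete, here is a small Python sketch of consuming a NUL-separated file listing incrementally instead of collecting it all first. This is only an analogy for the kind of low-overhead piping meant here, not a proposal for the Emacs API; the name iter_nul_separated and the chunk size are assumptions.

```python
from typing import BinaryIO, Iterator


def iter_nul_separated(stream: BinaryIO, chunk_size: int = 65536) -> Iterator[str]:
    """Yield file names from a NUL-separated byte stream as they arrive."""
    pending = b""
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break
        pending += chunk
        # Everything before the last NUL is complete; the tail may be a
        # partial name that continues in the next chunk.
        *complete, pending = pending.split(b"\0")
        for name in complete:
            # surrogateescape keeps names that are not valid UTF-8 intact.
            yield name.decode("utf-8", errors="surrogateescape")
    if pending:
        # Tolerate a final name that lacks a terminating NUL.
        yield pending.decode("utf-8", errors="surrogateescape")
```

A caller could pass the stdout pipe of a listing process (for example one started with subprocess.Popen and stdout=subprocess.PIPE) and begin feeding names to the next stage before the listing has finished.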
> If we do that uniformly, then this find slowness would only affect
> transient projects, and transient projects pull their ignores from
> grep-find-ignored-files just like rgrep, so improvements will more
> easily be applied to both. (And maybe we could even get rid of
> project-ignores entirely, then?)
Regarding removing it, see above. And it'll take a number of years
anyway ;-(