bug-gnu-emacs

bug#64735: 29.0.92; find invocations are ~15x slower because of ignores


From: Dmitry Gutov
Subject: bug#64735: 29.0.92; find invocations are ~15x slower because of ignores
Date: Thu, 20 Jul 2023 21:54:32 +0300
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:102.0) Gecko/20100101 Thunderbird/102.13.0

On 20/07/2023 16:43, Spencer Baugh wrote:

>> That's only a problem when the default file listing logic is used (and
>> we usually delegate to something like 'git ls-files' instead, when the
>> vc-aware backend is used).

> Hm, yes, but things like C-u project-find-regexp will use the default
> find-based file listing logic instead of git ls-files, as do a few other
> things.

Right.

> I wonder, could we just go ahead and make a vc function which is
> list-files(GLOBS) and returns a list of files?  Both git and hg support
> this.  Then we could have C-u project-find-regexp use that instead of
> find, by taking the cross product of dirs-to-search and
> file-name-patterns-to-search.  (And this would let me delete a big chunk
> of my own project backend, so I'd be happy to implement it.)

I started on this in the scratch/project-regen branch. I haven't had time to dedicate to it recently, but the basics are there; take a look (the method is called project-files-filtered).

The difficulty with making such changes is that the project protocol grows in size, and it becomes difficult for a user to understand what is mandatory, what is obsolete, and how to use it, especially in the face of backward compatibility requirements.

Take a look; feedback is welcome and should help move this forward. We should also transition to returning relative file names when possible (optionally or always), for performance.
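For illustration only, the cross product of directories and file-name patterns proposed above could map onto git pathspecs roughly as follows. This is a minimal sketch in Python, not the code in scratch/project-regen: the function names (build_pathspecs, list_files_command, list_files) are hypothetical, and the choice of 'git ls-files' flags is an assumption about what a backend might want (tracked plus untracked files, honoring the standard ignore rules).

```python
import itertools
import os
import subprocess


def build_pathspecs(dirs, patterns):
    """Return one git pathspec per (directory, glob) pair.

    Uses git's ':(glob)' pathspec magic so that '**/' matches any
    directory depth below the given directory (including zero).
    """
    specs = []
    for d, pat in itertools.product(dirs, patterns):
        d = d.strip("/")
        # Treat "" and "." as the repository root: no directory prefix.
        prefix = f"{d}/" if d not in ("", ".") else ""
        specs.append(f":(glob){prefix}**/{pat}")
    return specs


def list_files_command(dirs, patterns):
    """Build the argv for a single 'git ls-files' call (assumed flags)."""
    return [
        "git", "ls-files", "-z",
        "--cached", "--others", "--exclude-standard",
        "--", *build_pathspecs(dirs, patterns),
    ]


def list_files(root, dirs, patterns):
    """Run the command in ROOT and return file names relative to it."""
    out = subprocess.run(
        list_files_command(dirs, patterns),
        cwd=root, check=True, capture_output=True,
    ).stdout
    # -z gives NUL-terminated names, so odd characters need no unquoting.
    return [os.fsdecode(name) for name in out.split(b"\0") if name]
```

A call such as list_files(root, ["lisp", "src"], ["*.el", "*.c"]) would then issue one git invocation with four pathspecs, and the names it returns are relative to the root, which also fits the relative-file-names idea.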

> Fundamentally it seems a little silly for project-ignores to ever be
> used for a vc project; if the vcs gives us ignores, we can probably just
> ask the vcs to list the files too, and it will have an efficient
> implementation of that.

Possibly, yes. But there will likely remain cases where project-files could stay useful for callers that construct some bigger command line for some new feature. Though perhaps we'll be able to drop that need by extracting the theoretically best performance from project-files (using a process object or some abstraction) to facilitate low-overhead piping.
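As a rough illustration of the "process object" idea, a caller could consume file names incrementally from a running lister instead of waiting for a complete list. The sketch below is in Python and is only an assumption about what such an abstraction might look like; iter_nul_separated is a hypothetical name, and it expects any command that prints NUL-terminated names (for example 'git ls-files -z' or 'find ... -print0').

```python
import os
import subprocess


def iter_nul_separated(argv, cwd=None, chunk_size=65536):
    """Yield NUL-terminated names from ARGV's stdout as they arrive."""
    proc = subprocess.Popen(argv, cwd=cwd, stdout=subprocess.PIPE)
    pending = b""
    try:
        while True:
            # read1 returns as soon as some data is available.
            chunk = proc.stdout.read1(chunk_size)
            if not chunk:
                break
            # The last piece may be an incomplete name; keep it for later.
            *complete, pending = (pending + chunk).split(b"\0")
            for name in complete:
                if name:
                    yield os.fsdecode(name)
        if pending:
            yield os.fsdecode(pending)
    finally:
        proc.stdout.close()
        proc.wait()
```

Because this is a generator, a consumer can start feeding names to the next stage (a grep invocation, say) before the listing has finished.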

> If we do that uniformly, then this find slowness would only affect
> transient projects, and transient projects pull their ignores from
> grep-find-ignored-files just like rgrep, so improvements will more
> easily be applied to both.  (And maybe we could even get rid of
> project-ignores entirely, then?)

Regarding removing it, see above. And it'll take a number of years anyway ;-(




