bug-gnu-emacs

bug#64735: 29.0.92; find invocations are ~15x slower because of ignores


From: Eli Zaretskii
Subject: bug#64735: 29.0.92; find invocations are ~15x slower because of ignores
Date: Sun, 23 Jul 2023 09:15:22 +0300

> From: Spencer Baugh <sbaugh@janestreet.com>
> Cc: sbaugh@catern.com,  yantar92@posteo.net,  rms@gnu.org,
>    dmitry@gutov.dev,  michael.albinus@gmx.de,  64735@debbugs.gnu.org
> Date: Sat, 22 Jul 2023 16:53:05 -0400
> 
> Can you try this further change on your Windows (and GNU/Linux) box?  I
> just tested on a different box and my original change gets:
> 
> (("built-in" . "Elapsed time: 4.506643s (2.276269s in 21 GCs)")
>  ("with-find" . "Elapsed time: 4.114531s (2.848497s in 27 GCs)"))
> 
> while this parallel implementation gets
> 
> (("built-in" . "Elapsed time: 4.479185s (2.236561s in 21 GCs)")
>  ("with-find" . "Elapsed time: 2.858452s (1.934647s in 19 GCs)"))
> 
> so it might have a favorable impact on Windows and your other GNU/Linux
> box.

Almost no effect here on MS-Windows:

  (("built-in" . "Elapsed time: 0.859375s (0.093750s in 4 GCs)")
   ("with-find" . "Elapsed time: 8.437500s (0.078125s in 4 GCs)"))

It was 8.578 sec with the previous version.

(The Lisp version is somewhat faster in this test because I
native-compiled the code for this test.)

On GNU/Linux:

  (("built-in" . "Elapsed time: 4.244898s (1.934182s in 56 GCs)")
   ("with-find" . "Elapsed time: 3.011574s (1.190498s in 35 GCs)"))

About 10% faster than the previous version (which yielded 3.327 sec).

Btw, I needed to fix the code: when-let needs 2 open parens after it,
not one.  The original code signals an error from the filter function
in Emacs 29.
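
For reference, here is a minimal sketch of the binding-list form that
when-let expects.  The function name my-first-line is hypothetical and
is not taken from the patch under discussion; it only illustrates the
two open parens after when-let.

```elisp
;; Harmless if `when-let' is already available without it.
(require 'subr-x)

;; Hypothetical helper, for illustration only.
(defun my-first-line (string)
  "Return the part of STRING before its first newline, or nil if none."
  ;; Note the two open parens: a list of bindings, each (SYMBOL VALUEFORM).
  (when-let ((pos (string-match "\n" string)))
    (substring string 0 pos)))
```

The body runs only when every binding is non-nil, so the function
returns nil for a string without a newline.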

> >>   (cl-assert (null _predicate) t "find-directory-files-recursively can't accept arbitrary predicates")
> >
> > It should.
> 
> This is where I think a fallback would be useful - it's basically
> impossible to support arbitrary predicates efficiently here, since it
> requires us to put Lisp in control of whether find descends into a
> directory.

There's nothing wrong with supporting this less efficiently.

And there's no need to control where Find descends: you could just
filter out the files from those directories that need to be ignored.
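
A rough sketch of such after-the-fact filtering follows.  The function
name my-filter-by-dir-predicate and its argument list are assumptions
made for illustration, not part of the proposed patch.  It assumes
PREDICATE is a function of one argument (a directory name); the special
value t that directory-files-recursively also accepts is not handled.

```elisp
(require 'cl-lib)

;; Hypothetical helper, for illustration only.
(defun my-filter-by-dir-predicate (root files predicate)
  "Return the members of FILES that PREDICATE does not exclude.
A file is excluded when PREDICATE returns nil for any directory
between ROOT (exclusive) and the file's own directory (inclusive)."
  (let ((root (file-name-as-directory (expand-file-name root))))
    (cl-remove-if-not
     (lambda (file)
       (let ((dir (file-name-directory (expand-file-name file)))
             (ok t))
         ;; Walk upward from FILE's directory until we reach ROOT,
         ;; asking PREDICATE about each directory on the way.
         (while (and ok
                     (string-prefix-p root dir)
                     (not (string= dir root)))
           (setq ok (funcall predicate (directory-file-name dir)))
           (setq dir (file-name-directory (directory-file-name dir))))
         ok))
     files)))
```

This calls PREDICATE once per ancestor directory of every file, so it
does more work than pruning during the traversal would; caching the
answer per directory would be an obvious refinement.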

> So I'm thinking I would just fall back to running the old
> directory-files-recursively whenever there's a predicate.  Or just not
> supporting this at all...

We cannot drop support for it entirely, because then it will not be a
replacement for directory-files-recursively.  Fallback is okay, though
I'd prefer a self-contained function.

> >>         (if follow-symlinks
> >>             '("-L")
> >>           '("!" "(" "-type" "l" "-xtype" "d" ")"))
> >>         (unless (string-empty-p regexp)
> >>           "-regex" (concat ".*" regexp ".*"))
> >>         (unless include-directories
> >>           '("!" "-type" "d"))
> >>         '("-print0")
> >
> > Some of these switches are specific to GNU Find.  Are we going to
> > support only GNU Find?
> 
> POSIX find doesn't support -regex, so I think we have to.  We could
> stick to just POSIX find if we only allowed globs in
> find-directory-files-recursively, instead of full regexes.

The latter would again be incompatible with
directory-files-recursively, so it isn't TRT, IMO.

One other subtlety is non-ASCII file names: you pass the -print0 switch
to Find, which makes it emit null bytes, and those could inhibit
decoding of non-ASCII characters.  So you may need to bind
inhibit-null-byte-detection to a non-nil value to get correctly
decoded file names from Find's output.
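
Something along these lines, as a sketch only.  The names
my-split-nul-separated and my-find-print0 are hypothetical, and the
Find invocation is reduced to the bare minimum rather than the full
argument list from the patch.  As far as I know, newer Emacs versions
spell the variable inhibit-nul-byte-detection and keep the older name
as an alias.

```elisp
;; Hypothetical helper: split NUL-separated output into a list.
(defun my-split-nul-separated (string)
  "Split STRING at null bytes, dropping empty elements."
  (split-string string "\0" t))

;; Hypothetical helper: run Find and return the decoded file names.
(defun my-find-print0 (dir)
  "Return the names Find prints for DIR, using -print0."
  (with-temp-buffer
    ;; Without this binding, null bytes in the output can make Emacs
    ;; treat the text as binary and skip decoding altogether.
    (let ((inhibit-null-byte-detection t))
      (process-file "find" nil t nil
                    (file-local-name (expand-file-name dir))
                    "-print0"))
    (my-split-nul-separated (buffer-string))))
```

The binding only needs to be in effect while the process output is
read and decoded; the splitting can happen outside it.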




