bug#64735: 29.0.92; find invocations are ~15x slower because of ignores
From: Eli Zaretskii
Subject: bug#64735: 29.0.92; find invocations are ~15x slower because of ignores
Date: Sat, 22 Jul 2023 20:46:01 +0300
> From: sbaugh@catern.com
> Date: Sat, 22 Jul 2023 17:18:19 +0000 (UTC)
> Cc: sbaugh@janestreet.com, yantar92@posteo.net, rms@gnu.org, dmitry@gutov.dev,
> michael.albinus@gmx.de, 64735@debbugs.gnu.org
>
> First my results:
>
> (my-bench 100 "~/public_html" "")
> (("built-in" . "Elapsed time: 1.140173s (0.389344s in 5 GCs)")
> ("with-find" . "Elapsed time: 0.643306s (0.305130s in 4 GCs)"))
>
> (my-bench 10 "~/.local/src/linux" "")
> (("built-in" . "Elapsed time: 2.402341s (0.937857s in 11 GCs)")
> ("with-find" . "Elapsed time: 1.544024s (0.827364s in 10 GCs)"))
>
> (my-bench 100 "/ssh:catern.com:~/public_html" "")
> (("built-in" . "Elapsed time: 36.494233s (6.450840s in 79 GCs)")
> ("with-find" . "Elapsed time: 4.619035s (1.133656s in 14 GCs)"))
>
> 2x speedup on local files, and almost a 10x speedup for remote files.
Thanks, that's impressive. But you omitted some of the features of
directory-files-recursively, see below.
> And my implementation *isn't even using the fact that find can run in
> parallel with Emacs*. If I did start using that, I expect even more
> speed gains from parallelism, which aren't achievable in Emacs itself.
I'm not sure I understand what you mean by "in parallel" and why it
would be faster.
> So can we add something like this (with the appropriate fallbacks to
> directory-files-recursively), since it has such a big speedup even
> without parallelism?
We can have an alternative implementation, yes. But it should support
predicate, and it should sort the files in each directory like
directory-files-recursively does, so that it's a drop-in replacement.
Also, I believe that Find does return "." in each directory, and your
implementation doesn't filter them, whereas
directory-files-recursively does AFAIR.
And I see no need for any fallback: that's for the application to do
if it wants.
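
To illustrate what "support predicate" means for a drop-in replacement, here is a small sketch of how the existing PREDICATE argument of directory-files-recursively is used. The function name my-list-el-files-skipping-git and the choice of ".git" are hypothetical, picked only for the example.

```elisp
;; -*- lexical-binding: t -*-
;; Hypothetical helper: list *.el files under DIR without descending
;; into any subdirectory named ".git".
(defun my-list-el-files-skipping-git (dir)
  "Return the .el files under DIR, skipping \".git\" subdirectories."
  (directory-files-recursively
   dir "\\.el\\'"
   nil                                  ; INCLUDE-DIRECTORIES
   ;; PREDICATE is called with each subdirectory name; a non-nil
   ;; return value means "descend into this subdirectory".
   (lambda (subdir)
     (not (string= (file-name-nondirectory (directory-file-name subdir))
                   ".git")))))
```

A find-based implementation would have to give the same result for an arbitrary Lisp function in that position, not only for cases that can be expressed as find tests.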
> (cl-assert (null _predicate) t "find-directory-files-recursively can't
> accept arbitrary predicates")
It should.
> (if follow-symlinks
> '("-L")
> '("!" "(" "-type" "l" "-xtype" "d" ")"))
> (unless (string-empty-p regexp)
> "-regex" (concat ".*" regexp ".*"))
> (unless include-directories
> '("!" "-type" "d"))
> '("-print0")
Some of these switches are specific to GNU Find. Are we going to
support only GNU Find?
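
If the answer is to use the GNU-only switches conditionally, one possible approach is to probe the find program first. The sketch below is only an assumption about how that could look: my-gnu-find-p is a hypothetical name, and the check relies on GNU find answering "--version" with a banner that mentions "GNU findutils".

```elisp
;; -*- lexical-binding: t -*-
;; Hypothetical probe: non-nil if the "find" reachable from
;; `default-directory' (local or remote) identifies itself as GNU find.
(defun my-gnu-find-p ()
  "Return t if \"find --version\" succeeds and mentions GNU findutils."
  (with-temp-buffer
    (and
     ;; `ignore-errors' covers the case where no find program exists.
     (eq 0 (ignore-errors (process-file "find" nil t nil "--version")))
     (progn
       (goto-char (point-min))
       (search-forward "GNU findutils" nil t))
     t)))
```

A caller could then add switches such as -xtype only when this returns t, and use the portable subset otherwise.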
> ))
> (remote (file-remote-p dir))
> (proc
> (if remote
> (let ((proc (apply #'start-file-process
> "find" (current-buffer) command)))
> (set-process-sentinel proc (lambda (_proc _state)))
> (set-process-query-on-exit-flag proc nil)
> proc)
> (make-process :name "find" :buffer (current-buffer)
> :connection-type 'pipe
> :noquery t
> :sentinel (lambda (_proc _state))
> :command command))))
> (while (accept-process-output proc))
Why do you call accept-process-output here? It could interfere with
reading output from async subprocesses running at the same time. Come
to think of it, why use async subprocesses here and not call-process?
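
A minimal sketch of the synchronous alternative, under stated assumptions: my-find-files-sync is a hypothetical name, the find arguments are a reduced version of the ones in the quoted code, and process-file is used instead of call-process so that a remote DIR goes through its file name handler.

```elisp
;; -*- lexical-binding: t -*-
;; Hypothetical helper: run find synchronously and collect its output,
;; with no process sentinel and no accept-process-output loop.
(defun my-find-files-sync (dir)
  "Return the non-directory files under DIR as names relative to DIR."
  (let ((root (file-name-as-directory (expand-file-name dir))))
    (with-temp-buffer
      ;; `process-file' runs the program in `default-directory',
      ;; which may be remote.
      (setq default-directory root)
      ;; (t nil): stdout goes to this buffer, stderr is discarded.
      (let ((status (process-file "find" nil '(t nil) nil
                                  "." "!" "-type" "d" "-print0")))
        (unless (eq status 0)
          (error "find exited with status %s" status)))
      ;; -print0 separates names with NUL bytes; drop empty strings.
      (split-string (buffer-string) "\0" t))))
```

The returned names keep the "./" prefix that find prints, so a real replacement would still need to expand and sort them to match directory-files-recursively.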