bug-gnu-emacs

bug#64735: 29.0.92; find invocations are ~15x slower because of ignores


From: Michael Albinus
Subject: bug#64735: 29.0.92; find invocations are ~15x slower because of ignores
Date: Sun, 23 Jul 2023 13:44:28 +0200
User-agent: Gnus/5.13 (Gnus v5.13)

Spencer Baugh <sbaugh@janestreet.com> writes:

Hi Spencer,

> I mean having Emacs read output from the process and turn them into
> strings while find is still running and walking the directory tree.  So
> the two parts are running in parallel.  This, specifically:

Just as a proof of concept, I have modified your function slightly so
that it works with both local and remote directories.

--8<---------------cut here---------------start------------->8---
(defun find-directory-files-recursively
    (dir regexp &optional include-directories _predicate follow-symlinks)
  (let* (buffered
         result
         (remote (file-remote-p dir))
         (file-name-handler-alist (and remote file-name-handler-alist))
         (proc
          (make-process
           :name "find" :buffer nil
           :connection-type 'pipe
           :noquery t
           :sentinel #'ignore
           :file-handler remote
           :filter (lambda (proc data)
                     (let ((start 0))
                       (when-let ((end (string-search "\0" data start)))
                         (push (concat buffered (substring data start end))
                               result)
                         (setq buffered "")
                         (setq start (1+ end))
                         (while-let ((end (string-search "\0" data start)))
                           (push (substring data start end) result)
                           (setq start (1+ end))))
                       (setq buffered
                             (concat buffered (substring data start)))))
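           :command (append
                     (list "find")
                     ;; -L is a global option; it must precede the directory.
                     (when follow-symlinks '("-L"))
                     (list (file-local-name dir))
                     (unless follow-symlinks
                       '("!" "(" "-type" "l" "-xtype" "d" ")"))
                     ;; Must be a list: `append' would otherwise splice
                     ;; the regexp string in character by character.
                     (unless (string-empty-p regexp)
                       (list "-regex" (concat ".*" regexp ".*")))
                     (unless include-directories
                       '("!" "-type" "d"))
                     '("-print0")))))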
    (while (accept-process-output proc))
    (if remote (mapcar (lambda (file) (concat remote file)) result) result)))
--8<---------------cut here---------------end--------------->8---

On my laptop, this returns

--8<---------------cut here---------------start------------->8---
(my-bench 100 "~/src/tramp" "")
(("built-in" . "Elapsed time: 99.177562s (3.403403s in 107 GCs)")
 ("with-find" . "Elapsed time: 83.432360s (2.820053s in 98 GCs)"))

(my-bench 100 "/ssh:remotehost:~/src/tramp" "")
(("built-in" . "Elapsed time: 128.406359s (34.981183s in 1850 GCs)")
 ("with-find" . "Elapsed time: 82.765064s (4.155410s in 163 GCs)"))
--8<---------------cut here---------------end--------------->8---
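
The filter in the function above has to cope with file names that are
split across two chunks of process output. As a minimal sketch of that
reassembly step in isolation, here it is written with split-string
rather than the string-search loop used above. The helper name
my-split-nul-chunk is hypothetical and not part of the function.

```elisp
(defun my-split-nul-chunk (pending chunk)
  "Split PENDING followed by CHUNK at NUL bytes.
Return (NAMES . REST): NAMES are the complete file names, in order,
and REST is the trailing partial name to carry into the next call.
Hypothetical helper, for illustration only."
  (let ((parts (split-string (concat pending chunk) "\0")))
    ;; The last element is whatever follows the final NUL; it is ""
    ;; when CHUNK ends exactly on a NUL.
    (cons (butlast parts) (car (last parts)))))
```

For instance, (my-split-nul-chunk "fo" "o\0bar\0ba") yields the two
complete names "foo" and "bar", with "ba" left over for the next call.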

Of course, the other problems still remain. For example, you cannot
know whether find on a given host (local or remote) supports all of
these arguments. On my NAS, we have

--8<---------------cut here---------------start------------->8---
[~] # find -h
BusyBox v1.01 (2022.10.27-23:57+0000) multi-call binary

Usage: find [PATH...] [EXPRESSION]

Search for files in a directory hierarchy.  The default PATH is
the current directory; default EXPRESSION is '-print'

EXPRESSION may consist of:
        -follow         Dereference symbolic links.
        -name PATTERN   File name (leading directories removed) matches PATTERN.
        -print          Print (default and assumed).

        -type X         Filetype matches X (where X is one of: f,d,l,b,c,...)
        -perm PERMS     Permissions match any of (+NNN); all of (-NNN);
                        or exactly (NNN)
        -mtime TIME     Modified time is greater than (+N); less than (-N);
                        or exactly (N) days
--8<---------------cut here---------------end--------------->8---
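
One conceivable way to detect such a limited find is to probe it once
per host before relying on a given argument. The following is only a
rough sketch under some assumptions: the name my-find-supports-args-p
is hypothetical, /dev/null is assumed to exist on the host, and a find
that rejects an argument is assumed to exit with a non-zero status.

```elisp
(defun my-find-supports-args-p (dir &rest args)
  "Return non-nil if `find' on the host of DIR accepts ARGS.
Hypothetical helper, for illustration only.  It runs find on
/dev/null, so nothing is traversed, and checks the exit status."
  (let ((default-directory (file-name-as-directory dir)))
    ;; `process-file' honors a remote `default-directory', so the
    ;; probe runs on the same host as DIR.  If find cannot be started
    ;; at all, the error is swallowed and the result is nil.
    (eq 0 (ignore-errors
            (apply #'process-file "find" nil nil nil
                   "/dev/null" args)))))
```

A caller could then check (my-find-supports-args-p dir "-print0")
and fall back to the built-in implementation when it returns nil. The
result would have to be cached per host to keep the probe cheap.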

Best regards, Michael.




