From: Dmitry Gutov
Subject: bug#64735: 29.0.92; find invocations are ~15x slower because of ignores
Date: Sat, 29 Jul 2023 03:12:34 +0300
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:102.0) Gecko/20100101 Thunderbird/102.13.0

On 27/07/2023 16:30, Dmitry Gutov wrote:
> I can imagine that the filter-based approach necessarily creates more strings (to pass to the filter function). Maybe we could increase those strings' size (thus reducing the number) by increasing the read buffer size?

To go further along this route, I first verified that the input strings (the chunks arriving at the filter) are (almost) all the same length: 4096 characters. They are then parsed into file names 50-100 characters long, so the number of "junk" objects created by the process-filter approach probably shouldn't matter too much: the returned list contains 40-80x more strings than the filter ever received.
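In case anyone wants to double-check that, a throwaway filter that records chunk lengths is enough. Something along these lines (my-chunk-sizes is purely illustrative, not part of any patch; assumes lexical binding):

(defun my-chunk-sizes (command)
  "Run COMMAND, returning the length of every chunk its filter receives."
  (let* ((sizes nil)
         (proc (make-process
                :name "chunk-sizes"
                :command command
                :connection-type 'pipe
                :filter (lambda (_proc string)
                          (push (length string) sizes)))))
    ;; Drain the process; the filter fires from inside
    ;; accept-process-output.
    (while (process-live-p proc)
      (accept-process-output proc))
    (nreverse sizes)))

;; (my-chunk-sizes '("find" "." "-type" "f"))
;; With the default read-process-output-max, nearly every element
;; comes back as 4096.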

But then I ran the benchmark with different values of read-process-output-max, which increases exactly those chunk strings' size while proportionally reducing their number. The results were:

> (my-bench-rpom 1 default-directory "")

=>

(("with-find-p 4096" . "Elapsed time: 0.945478s (0.474680s in 6 GCs)")
 ("with-find-p 40960" . "Elapsed time: 0.760727s (0.395379s in 5 GCs)")
("with-find-p 409600" . "Elapsed time: 0.729757s (0.394881s in 5 GCs)"))

where

(defun my-bench-rpom (count path regexp)
  "Benchmark `find-directory-files-recursively-2' under several
values of `read-process-output-max'."
  (setq path (expand-file-name path))
  (list
   (cons "with-find-p 4096"
         (let ((read-process-output-max 4096))
           (benchmark count (list 'find-directory-files-recursively-2 path regexp))))
   (cons "with-find-p 40960"
         (let ((read-process-output-max 40960))
           (benchmark count (list 'find-directory-files-recursively-2 path regexp))))
   (cons "with-find-p 409600"
         (let ((read-process-output-max 409600))
           (benchmark count (list 'find-directory-files-recursively-2 path regexp))))))

...with the last variant consistently showing the same or better performance than the "sync" version I benchmarked previously.

What does that mean for us? The number of strings on the heap is reduced, but not by much (again, the resulting list still has 43x more elements than there were chunks), and the combined memory taken up by the intermediate strings waiting to be garbage-collected stays the same: 100x fewer chunks, each 100x larger.

It seems like the per-chunk overhead is non-trivial and affects GC somehow (though not in a way that just any string allocation would).

In this test, with the default setting, the output arrives as ~6000 strings passed to the filter function. That means read_and_dispose_of_process_output is called about 6000 times, adding roughly 0.2s of overhead, i.e. around 33µs per call. Something in there must be producing extra work for the GC.
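For attributing that work, the built-in profiler should help; a quick sketch (results land in the report buffer):

;; Sample both CPU time and memory allocations around one run.
(profiler-start 'cpu+mem)
(find-directory-files-recursively-2 default-directory "")
(profiler-report)
(profiler-stop)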

This line seems suspect:

       list3 (outstream, make_lisp_proc (p), text),

That creates three conses and one Lisp object (a tagged pointer) per chunk. But maybe I'm missing something bigger.
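A quick back-of-the-envelope (assuming 16-byte conses on a 64-bit build) says those conses alone can't explain it:

;; ~6000 filter calls, 3 conses each from the `list3' above:
(* 6000 3)      ;; => 18000 conses per run
(* 6000 3 16)   ;; => 288000 bytes, i.e. under 0.3 MB

So if this line matters, it would more likely be by triggering additional GC cycles than through the raw allocation volume.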




