bug-gnu-emacs
From: Ihor Radchenko
Subject: bug#66117: 30.0.50; `find-buffer-visiting' is slow when opening large number of buffers
Date: Fri, 29 Sep 2023 13:56:40 +0000

Eli Zaretskii <eliz@gnu.org> writes:

>> 3. Open all the 1000 files one by one:
>>    (dolist (file (directory-files "/tmp/test/" t "org"))
>>      (unless (find-buffer-visiting file) (find-file-noselect file)))
>> 
>> Step (3) takes 18.8 seconds on my machine. The CPU profile attached as
>> cpu-profile.
>
> Since find-file-noselect calls find-buffer-visiting internally, I'm
> not sure the above test case makes sense.  A Lisp program should feel
> free to call find-file-noselect directly, and Emacs will find the
> visiting buffer, if it already exists, as part of the job of
> find-file-noselect.
>
> Let's please focus on test cases where the Lisp code being benchmarked
> doesn't do any unnecessary stuff, since what's at stake is a
> significant change in our internals.

I left the extra `find-buffer-visiting' call in because Org mode makes
it (for a reason: we need to know whether a file was already open or
not).
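
For illustration, the pattern looks roughly like the sketch below. It
is not Org's actual code; the helper name `my/visit-file-noting-state'
is hypothetical and exists only to show why the caller looks the buffer
up before visiting the file.

```elisp
;; Hypothetical helper, not part of Emacs or Org.
(defun my/visit-file-noting-state (file)
  "Visit FILE and return a cons cell (BUFFER . ALREADY-OPEN).
ALREADY-OPEN is t when some buffer was visiting FILE before this
call, and nil when the file had to be opened."
  (let ((existing (find-buffer-visiting file)))
    (cons (or existing (find-file-noselect file))
          (and existing t))))
```

A caller can then, for example, kill only the buffers it opened itself
and leave the user's existing buffers alone.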

You may as well do

              (dolist (file (directory-files "/tmp/test/" t "org"))
                (find-file-noselect file))

as step (3).

The same conclusions will hold - `find-file-noselect' calls
`find-buffer-visiting' as well and it also takes most of the CPU time.

I am attaching an updated set of the same profiles, but based on the
above `dolist' that only calls `find-file-noselect'.

The run times are now: 12.0 seconds, 5.3 seconds, and 6.6 seconds.
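
For anyone who wants to reproduce such timings without the profiler, a
minimal sketch using `benchmark-run' follows. The function name
`my/time-open-all' is hypothetical, and the numbers it yields will
differ by machine and by what is already loaded.

```elisp
(require 'benchmark)

;; Hypothetical helper for timing step (3).
(defun my/time-open-all (dir)
  "Visit every file in DIR whose name matches \"org\".
Return the elapsed wall-clock time in seconds as a float."
  ;; `benchmark-run' returns (ELAPSED GC-COUNT GC-ELAPSED);
  ;; only the elapsed time is kept here.
  (car (benchmark-run 1
         (dolist (file (directory-files dir t "org"))
           (find-file-noselect file)))))
```

Calling (my/time-open-all "/tmp/test/") after steps (1) and (2) times
the same loop as above.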


>> If one uses `get-file-buffer' instead of `find-buffer-visiting', the
>> total runtime becomes 5.1 sec - almost 4x faster.
>
> This is also not very interesting, since find-file-noselect calls
> get-file-buffer as well.

No. `find-file-noselect' calls `find-buffer-visiting'.

> If we come to the conclusion that those loops in find-buffer-visiting
> are the hot spot, the right thing is to implement them in C, where we
> don't need to use the equivalent of with-current-buffer to examine the
> truename and file-number of every buffer, we can just access them
> directly.

I still think that my previous conclusions hold. And I agree that
rewriting these expensive loops in C makes sense. Maybe add two new
subroutines that find a buffer by `buffer-file-truename' and by
`buffer-file-number'? They would be the equivalents of
`get-file-buffer', which searches by `buffer-file-name'.
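
To make the intended semantics concrete, here is a Lisp-level sketch of
the two lookups. The names `my/get-truename-buffer' and
`my/get-file-number-buffer' are hypothetical, and this is not the
proposed C implementation; it only shows what such subroutines would
return, reading each buffer's value via `buffer-local-value' instead of
switching buffers.

```elisp
(require 'seq)

;; Hypothetical helpers; the names are made up for this sketch.
(defun my/get-truename-buffer (truename)
  "Return a buffer whose `buffer-file-truename' equals TRUENAME, or nil.
TRUENAME is assumed to be in the same form that Emacs stores, that
is, (abbreviate-file-name (file-truename FILE))."
  (seq-find (lambda (buf)
              (equal (buffer-local-value 'buffer-file-truename buf)
                     truename))
            (buffer-list)))

(defun my/get-file-number-buffer (number)
  "Return a buffer whose `buffer-file-number' equals NUMBER, or nil.
NUMBER is assumed to be in the same format as `buffer-file-number'.
A nil NUMBER never matches, so non-file buffers are not returned."
  (and number
       (seq-find (lambda (buf)
                   (equal (buffer-local-value 'buffer-file-number buf)
                          number))
                 (buffer-list))))
```

Both helpers still walk `buffer-list' once per call, so they remove the
buffer-switching cost but not the linear scan.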

>> So, using `with-current-buffer' when looping over all the buffers is
>> certainly not optimal (maybe in other places as well).
>
> with-current-buffer is normally very expensive.  Which is why any
> performance-critical loop should try to avoid it as much as possible.

Aside: this reminds me of the obsoletion of the generalized
buffer-local variable. AFAIU, there is currently no way to set a
buffer-local value in a buffer without making that buffer current. It
would be nice if that were possible, especially in performance-critical
code.
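
A small sketch of the asymmetry, as far as I understand it: reading
goes through `buffer-local-value' with no buffer switch, while writing
needs `with-current-buffer'. The variable `my/example-flag' and both
helper names are hypothetical.

```elisp
(defvar-local my/example-flag nil
  "Hypothetical buffer-local variable used only for this illustration.")

(defun my/set-flag-in-buffer (buffer value)
  "Set `my/example-flag' to VALUE in BUFFER.
This has to make BUFFER current temporarily."
  (with-current-buffer buffer
    (setq-local my/example-flag value)))

(defun my/get-flag-in-buffer (buffer)
  "Return `my/example-flag' as seen in BUFFER, without switching to it."
  (buffer-local-value 'my/example-flag buffer))
```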

Attachment: cpu-profile
Description: Binary data

Attachment: cpu-profile-get-file-buffer
Description: Binary data

Attachment: cpu-profile-buffer-local-value
Description: Binary data

-- 
Ihor Radchenko // yantar92,
Org mode contributor,
Learn more about Org mode at <https://orgmode.org/>.
Support Org development at <https://liberapay.com/org-mode>,
or support my work at <https://liberapay.com/yantar92>
