bug-gnu-emacs


From: Eli Zaretskii
Subject: bug#66117: 30.0.50; `find-buffer-visiting' is slow when opening large number of buffers
Date: Fri, 29 Sep 2023 19:12:58 +0300

> From: Ihor Radchenko <yantar92@posteo.net>
> Cc: dmitry@gutov.dev, 66117@debbugs.gnu.org
> Date: Fri, 29 Sep 2023 13:56:40 +0000
> 
> Eli Zaretskii <eliz@gnu.org> writes:
> 
> >> 3. Open all the 1000 files one by one:
> >>    (dolist (file (directory-files "/tmp/test/" t "org"))
> >>      (unless (find-buffer-visiting file) (find-file-noselect file)))
> >> 
> >> Step (3) takes 18.8 seconds on my machine. The CPU profile attached as
> >> cpu-profile.
> >
> > Since find-file-noselect calls find-buffer-visiting internally, I'm
> > not sure the above test case makes sense.  A Lisp program should feel
> > free to call find-file-noselect directly, and Emacs will find the
> > visiting buffer, if it already exists, as part of the job of
> > find-file-noselect.
> >
> > Let's please focus on test cases where the Lisp code being benchmarked
> > doesn't do any unnecessary stuff, since what's at stake is a
> > significant change in our internals.
> 
> The reason I left an extra `find-buffer-visiting' call is that Org
> mode does it (for a reason - we need to know whether a file was
> already open or not).
> 
> You may as well do
> 
>             (dolist (file (directory-files "/tmp/test/" t "org"))
>               (find-file-noselect file))
> 
> as step (3).
> 
> The same conclusions will hold - `find-file-noselect' calls
> `find-buffer-visiting' as well and it also takes most of the CPU time.
> 
> I am attaching an updated set of the same profiles, but based on the
> above `dolist' that only calls `find-file-noselect'.
> 
> The run times are now: 12.0 seconds, 5.3 seconds, and 6.6 seconds.

12 sec is quite a far cry from 18.8, won't you agree?
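
For anyone re-running the measurement, a minimal sketch of timing the
simplified loop with `benchmark-run' might look like the following.  The
helper name `my/time-open-all' is hypothetical; the directory and regexp
arguments stand in for the "/tmp/test/" and "org" values from the recipe
quoted above.

```elisp
(require 'benchmark)

(defun my/time-open-all (dir regexp)
  "Visit every file in DIR whose name matches REGEXP.
Return the elapsed wall-clock time in seconds.
Hypothetical helper for illustration, not part of Emacs."
  ;; `benchmark-run' returns (ELAPSED GC-COUNT GC-ELAPSED);
  ;; only the first element is used here.
  (car (benchmark-run 1
         (dolist (file (directory-files dir t regexp))
           (find-file-noselect file)))))
```

Calling (my/time-open-all "/tmp/test/" "org") would then time only the
`find-file-noselect' calls, without the extra `find-buffer-visiting'.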

> >> If one uses `get-file-buffer' instead of `find-buffer-visiting', the
> >> total runtime becomes 5.1 sec - almost 4x faster.
> >
> > This is also not very interesting, since find-file-noselect calls
> > get-file-buffer as well.
> 
> No. `find-file-noselect' calls `find-buffer-visiting'.

Unless we use different Emacsen, find-file-noselect calls both
get-file-buffer and find-buffer-visiting:

      (let* ((buf (get-file-buffer filename))  <<<<<<<<<<<<<<<<<<<<<<<<<<<<
             (truename (abbreviate-file-name (file-truename filename)))
             (attributes (file-attributes truename))
             (number (file-attribute-file-identifier attributes))
             ;; Find any buffer for a file that has same truename.
             (other (and (not buf)
                         (find-buffer-visiting <<<<<<<<<<<<<<<<<<<<<<<<<<<<
                          filename
                          ;; We want to filter out buffers that we've
                          ;; visited via symlinks and the like, where
                          ;; the symlink no longer exists.
                          (lambda (buffer)
                            (let ((file (buffer-local-value
                                         'buffer-file-name buffer)))
                              (and file (file-exists-p file))))))))

> > If we come to the conclusion that those loops in find-buffer-visiting
> > are the hot spot, the right thing is to implement them in C, where we
> > don't need to use the equivalent of with-current-buffer to examine the
> > truename and file-number of every buffer, we can just access them
> > directly.
> 
> I still think that my previous conclusions are true. And I agree that
> rewriting these expensive loops in C makes sense. Maybe two new
> subroutines to find buffer by `buffer-file-truename' and by
> `buffer-file-number'?

Yes, that's what I had in mind.
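
As a Lisp-level illustration only (not the C implementation discussed
above), the lookup by truename can be sketched with `buffer-local-value',
which reads a buffer's value without making that buffer current.  The
function name `my/find-buffer-by-truename' is hypothetical, and the
sketch assumes the caller passes the abbreviated truename, i.e. the same
form that `find-file-noselect' computes.

```elisp
(defun my/find-buffer-by-truename (truename)
  "Return a live buffer whose `buffer-file-truename' equals TRUENAME.
Return nil if there is no such buffer.
Hypothetical helper for illustration, not part of Emacs."
  (catch 'found
    (dolist (buf (buffer-list))
      ;; Read the buffer-local value directly; no `with-current-buffer'.
      (when (equal (buffer-local-value 'buffer-file-truename buf)
                   truename)
        (throw 'found buf)))
    nil))
```

A caller would use it as
(my/find-buffer-by-truename (abbreviate-file-name (file-truename file))).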

> >> So, using `with-current-buffer' when looping over all the buffers is
> >> certainly not optimal (maybe in other places as well).
> >
> > with-current-buffer is normally very expensive.  Which is why any
> > performance-critical loop should try to avoid it as much as possible.
> 
> Aside: this reminds me of the obsoletion of generalized buffer-local
> variables. AFAIU, there is currently no way to set a buffer-local value
> in a buffer without making that buffer current. It would be nice if
> such setting were possible, especially in performance-critical code.

Maybe, but is there any performance-critical code which needs that?




