bug-gnu-emacs

From: Ihor Radchenko
Subject: bug#66117: 30.0.50; `find-buffer-visiting' is slow when opening large number of buffers
Date: Tue, 26 Sep 2023 08:54:54 +0000

Eli Zaretskii <eliz@gnu.org> writes:

>> I feel that I am still missing where `buffer-file-name' is set when
>> opening file via C-x C-f. Debugger showed something weird in my testing.
>
> With local files, it seems like insert-file-contents sets it.  So
> maybe we should record the file name in the cache in bset_filename.

Thanks for the pointer!
AFAIU, the relevant code is

      if (NILP (handler))
        {
          current_buffer->modtime = mtime;
          current_buffer->modtime_size = st.st_size;
          bset_filename (current_buffer, orig_filename);

However, it looks like file handlers are responsible for setting the
filename themselves. So the following handlers

  - `tramp-handle-insert-file-contents'
  - `tramp-archive-handle-insert-file-contents'
  - `ange-ftp-insert-file-contents'
  - `jka-compr-insert-file-contents'
  - `mm-url-insert-file-contents'
  - `epa-file-insert-file-contents'

may also need to update the cache, as would all third-party handlers.

>> Just to make sure that we are on the same page: the cache I am proposing
>> should be complete - if a buffer is missing from the cache, we should be
>> sure that there is no matching buffer.
>
> Since we will keep buffer-list (we must), even with this cache
> available, we can always leave the current code that scans the buffer
> list if the name is not in the cache.  This way, we don't need to
> worry to have all the buffers in the cache, only those which are
> looked for frequently and need the efficiency.

I need to elaborate then.

The problem Org faces happens when we open a file that is not yet open
in Emacs. In that case FILENAME is not visited by any buffer, and
`find-buffer-visiting' must (1) traverse every buffer in
`get-file-buffer'; (2) traverse every buffer again, checking
`buffer-file-truename' values; (3) traverse every buffer yet again,
checking `buffer-file-number'. This is the worst-case scenario for the
current code: no buffer with the given file name exists, so all the
checks fail.
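
To make the cost of a miss concrete, here is a toy model in C. It is
only a sketch, not the actual files.el code: `struct toy_buffer',
`toy_find_buffer_visiting' and the `comparisons' counter are
hypothetical names, and plain string and number comparisons stand in
for the real checks. The second pass is assumed to compare truenames.

```c
#include <stddef.h>
#include <string.h>

/* Toy stand-in for a file-visiting buffer; all names are hypothetical.  */
struct toy_buffer
{
  const char *file_name;      /* models buffer-file-name */
  const char *file_truename;  /* models buffer-file-truename */
  long file_number;           /* models buffer-file-number */
};

/* Counts per-buffer checks so the cost of a lookup is visible.  */
static size_t comparisons;

/* Return the index of the buffer visiting the file, or -1 if no
   buffer does.  Runs the three sequential scans one after another.  */
static int
toy_find_buffer_visiting (const struct toy_buffer *bufs, size_t n,
                          const char *name, const char *truename,
                          long number)
{
  size_t i;

  /* Pass 1: models the scan done by get-file-buffer.  */
  for (i = 0; i < n; i++)
    {
      comparisons++;
      if (strcmp (bufs[i].file_name, name) == 0)
        return (int) i;
    }

  /* Pass 2: models the truename comparison.  */
  for (i = 0; i < n; i++)
    {
      comparisons++;
      if (strcmp (bufs[i].file_truename, truename) == 0)
        return (int) i;
    }

  /* Pass 3: models the file-number comparison.  */
  for (i = 0; i < n; i++)
    {
      comparisons++;
      if (bufs[i].file_number == number)
        return (int) i;
    }

  return -1;
}
```

In this model a file that is not open costs 3*N checks for N buffers,
and opening 500 such files one after another repeats that cost with N
growing each time.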

To address the above scenario, it is not enough to cache _some_ buffer
names: a not-yet-open FILENAME will be missing from the cache, and we
would still have to go through the above process, which is slow.
What is needed is a _complete_ cache, so that the fact that FILENAME is
missing there means that no buffer associated with FILENAME is open in
Emacs.
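
A minimal sketch of what "complete" means, again as a toy model in C.
Nothing here is existing Emacs code: `struct cached_buffer',
`cache_set_filename' and `cache_lookup' are hypothetical names. The
assumption is that every change of a buffer's file name goes through a
single setter (the role suggested for bset_filename above), which
keeps the table in sync. A real version would presumably need similar
tables for truenames and file numbers.

```c
#include <stddef.h>
#include <string.h>

#define CACHE_BUCKETS 64

/* Toy buffer that is also its own hash-chain node (hypothetical).  */
struct cached_buffer
{
  const char *file_name;        /* borrowed pointer; NULL if no file */
  struct cached_buffer *next;   /* next buffer in the same bucket */
};

static struct cached_buffer *cache_table[CACHE_BUCKETS];

/* Simple multiplicative string hash reduced to a bucket index.  */
static size_t
cache_hash (const char *s)
{
  size_t h = 5381;
  while (*s)
    h = h * 33 + (unsigned char) *s++;
  return h % CACHE_BUCKETS;
}

/* The only place that changes a buffer's file name, so the table
   always reflects every file-visiting buffer.  */
static void
cache_set_filename (struct cached_buffer *b, const char *name)
{
  if (b->file_name)
    {
      /* Unlink B from the bucket of its old name.  */
      struct cached_buffer **p = &cache_table[cache_hash (b->file_name)];
      while (*p && *p != b)
        p = &(*p)->next;
      if (*p)
        *p = b->next;
    }

  b->file_name = name;
  b->next = NULL;

  if (name)
    {
      /* Link B at the head of the bucket of its new name.  */
      struct cached_buffer **head = &cache_table[cache_hash (name)];
      b->next = *head;
      *head = b;
    }
}

/* A NULL result is definitive: no buffer visits NAME, and no
   fallback scan over all buffers is needed.  */
static struct cached_buffer *
cache_lookup (const char *name)
{
  struct cached_buffer *b;
  for (b = cache_table[cache_hash (name)]; b; b = b->next)
    if (strcmp (b->file_name, name) == 0)
      return b;
  return NULL;
}
```

With a partial cache, the NULL case would have to fall back to the
full scan, which is exactly the case that is slow for Org.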

>> `find-buffer-visiting' explicitly checks for `buffer-file-truename'.
>> So, if the cache does not account for `buffer-file-truename', there will
>> be divergence between the existing code and when using the cache.
>> 
>> Same argument for `buffer-file-number'
>
> As I said, we could have hash-tables for these as well, if that is
> needed.  But I'd like to see the profiles that indicate we do need
> them.

I hope that the above clarifies why I want to cache everything.

>> Most of the time was taken by `find-buffer-visiting'. Replacing
>> `find-buffer-visiting' with `get-file-buffer' in certain (not all)
>> places reduced the total runtime by 30%.
>
> So you are saying that 30% of file-visiting buffers are not found by
> get-file-buffer?  Or is the 30% increase due to file names for which
> there's no corresponding buffer?  If so, does the benchmark indeed
> look for so many buffers that don't exist?

The rough code flow for the profile I attached to the initial message
is: For each of 500 files used to build agenda: (1) check if file is
open in Emacs via `find-buffer-visiting' and open it if not yet open;
(2) search the file to find matching headings to be added to agenda.

The total CPU time spent building the agenda from a fresh Emacs
decreased by 1/3 (~10 seconds) after replacing calls to
`find-buffer-visiting' with `get-file-buffer'. And this did not yet
replace every call to `find-buffer-visiting' (in particular,
`find-file-noselect' itself also calls `find-buffer-visiting'); I
replaced no more than half of the calls. I estimate that over half of
the 30 seconds spent building the agenda went to repeatedly searching
over all the buffers.

-- 
Ihor Radchenko // yantar92,
Org mode contributor,
Learn more about Org mode at <https://orgmode.org/>.
Support Org development at <https://liberapay.com/org-mode>,
or support my work at <https://liberapay.com/yantar92>




