--- Begin Message ---
Subject: [PATCH] slow ENCODE_FILE and DECODE_FILE
Date: Fri, 3 Apr 2020 16:18:43 +0200
ENCODE_FILE and DECODE_FILE turn out to be surprisingly slow and allocate
copious amounts of memory, to the point that they often turn up in both memory
and CPU profiles. (This is on macOS; I haven't checked the situation elsewhere.)
For instance, a single call to file-relative-name, with ASCII-only arguments,
manages to allocate 140 KiB. There are several conversion steps, each of which
creates temporary buffers and compiles and executes very large "quick-check"
regexps. Example:
(progn
  (require 'profiler)
  (profiler-reset)
  (garbage-collect)
  (profiler-start 'mem)
  (file-relative-name "abc")
  (profiler-stop)
  (profiler-report))
This applies to just about every function dealing with files or file names.
The attached patch is somewhat conservatively written but is at least a starting
point. It reduces the memory consumption of file-relative-name in the example
above to zero. Perhaps we can assume that file name coding systems are always
ASCII-compatible; if so, the shortcut can be taken in encode_file_name and
decode_file_name directly.
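
The idea behind the shortcut can be sketched in plain C roughly as follows.
This is only an illustration, not the contents of the attached patch: the names
all_ascii_p, recode_fn and encode_name_fast are hypothetical, and the sketch
assumes the file name coding system is ASCII-compatible, so that an ASCII-only
name encodes to exactly the same bytes.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper (not an Emacs function): return true if all LEN
   bytes starting at S are 7-bit ASCII.  */
static bool
all_ascii_p (const unsigned char *s, size_t len)
{
  for (size_t i = 0; i < len; i++)
    if (s[i] >= 0x80)
      return false;
  return true;
}

/* Hypothetical stand-in for the full, expensive conversion path.  */
typedef const char *(*recode_fn) (const char *name, size_t len);

/* Sketch of the fast path.  ASSUMPTION: the coding system is
   ASCII-compatible, so an ASCII-only name is its own encoding and can
   be returned as-is, without copying or allocating.  Anything else is
   handed to SLOW_PATH.  */
static const char *
encode_name_fast (const char *name, size_t len, recode_fn slow_path)
{
  if (all_ascii_p ((const unsigned char *) name, len))
    return name;
  return slow_path (name, len);
}
```

Because the fast path returns its argument uncopied, callers must not rely on
getting a fresh string back; that is the no-copy question that has to be
checked at each call site.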
There is already a hack in encode_file_name that assumes that no unibyte string
ever needs encoding; if so, the shortcut could perhaps be extended to
decode_file_name and simplified.
0001-Avoid-expensive-recoding-for-ASCII-identity-cases.patch
Description: Binary data
--- End Message ---
--- Begin Message ---
Subject: Re: bug#40407: [PATCH] slow ENCODE_FILE and DECODE_FILE
Date: Sat, 11 Apr 2020 17:09:27 +0200
I think we are done here -- now that all calls to ENCODE_FILE and DECODE_FILE
have been checked to be safe for no-copy semantics, there is no need to copy in
the ASCII identity case; pushed to master.
--- End Message ---