Hello,
Sorry if this is not the right place to post, feel free to redirect me as needed.
While helping someone with a projectile issue
(https://github.com/bbatsov/projectile/issues/1480), I noticed that when `shell-command-to-string` runs `git ls-files -zco --exclude-standard` over TRAMP in a repository with 85K files, it takes forever to complete.
Here's a stacktrace:
We see that `tramp-wait-for-output` calls `tramp-wait-for-regexp` which calls `tramp-check-for-regexp`, and when looking at the source:
(defun tramp-wait-for-output (proc &optional timeout)
  "Wait for output from remote command."
  (unless (buffer-live-p (process-buffer proc))
    (delete-process proc)
    (tramp-error proc 'file-error "Process `%s' not available, try again" proc))
  (with-current-buffer (process-buffer proc)
    (let* (;; Initially, `tramp-end-of-output' is "#$ ".  There might
           ;; be leading escape sequences, which must be ignored.
           ;; Busyboxes built with the EDITING_ASK_TERMINAL config
           ;; option send also escape sequences, which must be
           ;; ignored.
           (regexp (format "[^#$\n]*%s\\(%s\\)?\r?$"
                           (regexp-quote tramp-end-of-output)
                           tramp-device-escape-sequence-regexp))
           ;; Sometimes, the commands do not return a newline but a
           ;; null byte before the shell prompt, for example "git
           ;; ls-files -c -z ...".
           (regexp1 (format "\\(^\\|\000\\)%s" regexp))
           (found (tramp-wait-for-regexp proc timeout regexp1)))
      ... snip ...
My understanding is that it loops: it reads a bit of the command's output, tries to match the end-of-output regexp (a newline or '\0' followed by the shell prompt), and repeats until the process dies or a match is found. Because the command returns a huge amount of output (85K file names), each iteration re-scans the whole accumulated buffer, and this read-match-repeat cycle eats all the CPU, compared to reading the whole output in one go and then checking the regexp once.
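To make the cost concrete, here is a minimal sketch (in Python rather than Elisp, purely as a model) of the two strategies. It assumes the remote command's output arrives in fixed-size chunks and that the prompt marker is only searched for by re-scanning the full accumulated buffer, which is my reading of what `tramp-wait-for-regexp` effectively does; the chunk size and prompt string are made up for illustration.

```python
import re

CHUNK = 1024
PROMPT = re.compile(r"#\$ $")  # stand-in for tramp-end-of-output

def scan_per_chunk(output):
    """Re-scan the whole accumulated buffer after every chunk:
    O(n^2) characters examined in total."""
    buf = ""
    scanned = 0
    for i in range(0, len(output), CHUNK):
        buf += output[i:i + CHUNK]
        scanned += len(buf)          # each search walks the full buffer
        if PROMPT.search(buf):
            break
    return scanned

def scan_once(output):
    """Accumulate everything first, then search once: O(n) total."""
    return len(output)

# ~85K NUL-separated file names followed by the prompt.
data = "somefile\0" * 85_000 + "#$ "
ratio = scan_per_chunk(data) / scan_once(data)
print(f"per-chunk scanning examines {ratio:.0f}x as many characters")
```

With this model the per-chunk strategy examines hundreds of times more characters than a single final scan, and the ratio grows linearly with the output size, which would match the observed slowdown on large repositories.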
My questions are the following:
- Did I understand the problem correctly? Is this a known issue?
- Is there something to be done about it? Or would a faster implementation require too much refactoring?
Kind regards,
Philippe