bug-gnu-emacs
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

bug#44007: 26.3; Strange shell mode performance


From: Herman
Subject: bug#44007: 26.3; Strange shell mode performance
Date: Mon, 06 Nov 2023 14:37:52 +0100


Eli Zaretskii <eliz@gnu.org> writes:

The bug is that if echo is enabled, emacs 'shell' gives a much
worse performance.

I'm trying to understand why would we consider that a bug. There will always be ways to make Lisp programs run slower or faster by tweaking some variables, and the defaults are not always capable of providing
the best performance in every possible situation.

I think it is a bug because Emacs doesn't try to read again if one read() was successful. My hacky solution fixes that. With the extra cost of one read syscall. But we can do better: we can avoid it if the first read call already read 'nbyte' bytes, then there is no need to call another read().

Also, it a bug, because it's not logical (what does tty echo have anything to do with the performance of reading process output?), and the effect it causes is huge. We aren't talking about something like 10% difference, but 1500%: 3 sec vs. 0.2 sec.

You (or anyone else) should of course feel free to investigate and try to find out what are the specific reasons which make Emacs slower in
some cases and faster in others.  But that doesn't necessarily
constitutes a bug, IMO.
I already did, see my comment about read() returning with 4095, instead of 4096. Not a deep investigation, because I stopped investigating it when I realized that read() is not called in a loop, which is clearly a bug to me (as far as I understand how read should be used according to POSIX. I'm not a unix hacker, I may be wrong).

Tbh, I don't really understand the idea behind process-adaptive-read-buffering. Why is this setting there? If a process generates lot of output, it doesn't do anything useful. If a process is slow, then this approach may save some system calls (and other processing in emacs), but as the process slowly generates output, it should already be quick to process. When does it do something useful in a real world usage? Some packages had to turn this setting off because they noticed that it makes performance worse. If its usefulness is OS dependent, maybe it'd make sense to enable it only on platforms where it make sense?

I'd like to also mention that my hacky solution not just fixes the
problem, but also improves shell's performance, even with the
mentioned variables set.  As far as I know POSIX programming,
read() should always be called in a loop, if one wants to read N bytes (even in the successful case). Emacs doesn't have this in
emacs_intr_read, that causes the problem.

We've arrived at the current code after a lot of iterations, and it has to handle many different uses. It doesn't surprise me a bit that you can find a change which speeds up Emacs in some particular case,
but what about all the other cases?
If the intention of emacs_intr_read is to read nbytes bytes (if that many is available), then the current code doesn't do that correctly. This is not about handling different uses cases, but simply using read() correctly. In the worst case, and OS is allowed to give data byte by byte, where each read() call returns just one byte. In this case, Emacs will read the output one byte per render frame (or something like that, or I don't know how emacs works in this regard). I think something similar happens here: 4095 bytes is returned, because supposedly originally it was 4096, but something in the kernel/libc/whatever shaved off 1 byte for some reason (that's just my theory). But returning with 4095 doesn't mean that's all that available. Emacs could read further, but it doesn't.

How about if you run with this change for a few weeks or so, and see if it doesn't cause problems elsewhere? maybe also others will try using it in their use patterns and report back. Then we will at least
have some data regarding the effects of the change you propose.
I've been using this modification since I created this bug (many years). I didn't notice any problems with it.

Of course, if someone comes with a less hacky proposal, it would be
even better.
I just call it hacky, because I just threw it together in several minutes or so. But in my opinion that idea is OK (read as long as possible). It's just the implementation that could be improved (don't call read() again if nbytes already arrived, and there are maybe other things).

Also, please be sure to try Emacs 30, as I believe some changes were
done there recently.
I've tried a 1-week-old master, the exact same thing happens.





reply via email to

[Prev in Thread] Current Thread [Next in Thread]