bug-gnu-emacs

bug#69385: 30.0.50; Long lines with bidi text slow down Emacs


From: Stephen Berman
Subject: bug#69385: 30.0.50; Long lines with bidi text slow down Emacs
Date: Thu, 07 Mar 2024 12:12:51 +0100
User-agent: Gnus/5.13 (Gnus v5.13)

On Mon, 04 Mar 2024 16:43:26 +0200 Eli Zaretskii <eliz@gnu.org> wrote:

>> From: Stephen Berman <stephen.berman@gmx.net>
>> Cc: 69385@debbugs.gnu.org
>> Date: Mon, 04 Mar 2024 14:28:50 +0100
>>
>> On Sun, 03 Mar 2024 17:18:31 +0200 Eli Zaretskii <eliz@gnu.org> wrote:
[...]
>> >                         I hope you are editing those files with
>> > embedded Arabic frequently enough for these changes to be exercised.
>>
>> As I mentioned previously, my real files are programmatically generated
>> elisp files (so base paragraph direction LTR) not meant to be manually
>> edited or even just viewed by users of the program, and I haven't edited
>> them manually, and normally wouldn't.  But I just now ran the
>> end-of-buffer benchmark on one of them (the one I described previous,
>> containing a vector of 827 lists of bidirectional strings in a single
>> line), with this result:
>>
>> (0.849369497 4 0.05337466599999996)
>>
>> This was the timing without your patch:
>>
>> (9.308704995000001 4 0.054923504999999984)
>>
>> So for this file your patch yields "only" an almost 11 times faster
>> benchmark.  For navigation besides M-> and M-<, I find C-v, M-v, C-n,
>> C-p in the buffer visiting this file still very slow (noticeably more
>> than in the test buffers) and holding them down still freezes Emacs
>> (with C-n and C-p for many seconds) and uses 100% of a CPU core; though,
>> while I haven't tried timing these yet, my impression is that the
>> freezes are not as long as the ones I observed without your patch.
>> Also, there is still a marked delay when entering the minibuffer with
>> M-x or M-: or when switching to another buffer with C-x b, though
>> impressionistically no worse than the delays without your patch.  I'll
>> try to do more testing.
>
> Thanks for testing.  The above matches what I see on my system.  C-n
> and C-p is known to be problematic in long lines, but these changes
> speed them up as well, although perhaps not as well as the other
> commands.
>
>> > If you see no problems after a week or two, I will install this.
>>
>> Thanks.
>
> So I will wait for you to report any problems, and if no problems are
> seen, will install in a week or so.

I haven't yet run into any issues concerning your patch, but I have
encountered a problem with another one of my generated files, which,
though independent of your patch (the problem also happens in emacs-29),
is an issue for bidirectional text in Emacs, so might be worth trying to
handle better.  If you want, I can open a separate bug to pursue this
issue, but for now I'll summarize what I've observed so far.

Most of the Arabic words in the problematic file are enclosed in the
bidirectional control characters POP DIRECTIONAL FORMATTING (#x202c) and
RIGHT-TO-LEFT EMBEDDING (#x202b).  I did not add these characters, but I
had copy-&-pasted most of the Arabic from a PDF file I did not create.
I don't know if PDFs of Arabic text normally contain these control
characters, but the consequences for Emacs were dramatic.  When I simply
visited this file in Emacs (started with -Q) there was an immediate
slowdown, and in top I could see Emacs using 100% of a CPU thread.  When
I ran the end-of-buffer benchmark on this file, the result (with your
patch) was:

(27.962602113 2 0.0226042269999999977)

However, the display of that result only appeared in the echo area after
more than a minute (I timed it with a stopwatch).  At that point the
mode line showed the buffer at 4% from the top, and the display remained
frozen afterwards.  After several minutes during which Emacs consumed
100% CPU, and I had switched the focus away from the Emacs frame, the
CPU consumption stopped, but as soon as I switched focus back to that
frame, it went back to 100%.  The display never changed from showing the
buffer at 4%; Emacs was apparently in some kind of infinite loop.  After
about 15 minutes I started gdb, attached it to the Emacs process and
produced a backtrace, which I've attached, in the hope it helps to
diagnose the problem.

The problem certainly seems to be related to the bidirectional control
characters, because I made a copy of the file and removed all
occurrences of these control characters from it, and then ran the
end-of-buffer benchmark, getting this result (with your patch):

(0.716104165 4 0.04223660400000001)

And the display updated normally and CPU consumption was normal.
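
A minimal sketch of that kind of cleanup, for anyone who wants to try
the same comparison on their own data.  This is not the procedure used
for the file above (the message does not say how the characters were
removed); the function name strip_bidi_controls is hypothetical, and
only the two control characters named above are handled.

```python
# Sketch only: remove the two bidi control characters named in the
# message and report how many were found.  The helper name is an
# illustrative assumption, not part of Emacs or of the original report.
RLE = "\u202b"  # RIGHT-TO-LEFT EMBEDDING
PDF = "\u202c"  # POP DIRECTIONAL FORMATTING


def strip_bidi_controls(text):
    """Return (cleaned_text, removed_count) with RLE and PDF removed."""
    removed = text.count(RLE) + text.count(PDF)
    cleaned = text.replace(RLE, "").replace(PDF, "")
    return cleaned, removed


if __name__ == "__main__":
    sample = "(\"" + RLE + "\u0633\u0644\u0627\u0645" + PDF + "\" \"Hello\")"
    cleaned, removed = strip_bidi_controls(sample)
    print(removed, len(sample) - len(cleaned))
```

Counting the removed characters first gives a rough idea of how dense
the controls are in a given file, which may matter for the slowdown.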

Nevertheless, there seems to be something else besides the control
characters involved in this issue, because as a further test, I created a
buffer consisting of more than 1000 copies of the test string
concatenating the Arabic example in etc/HELLO and "Hello", and manually
enclosed each Arabic word in the above control characters, but the
benchmark result in this buffer was not significantly different from the
result without the control characters (and similar to the above result
for the copy of the problematic file without the control characters),
and the display did not freeze.
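
A test line of this shape could be generated along the following lines.
This is a sketch under several assumptions: the Arabic text is passed in
by the caller rather than read from etc/HELLO, each Arabic word is
wrapped as RLE + word + PDF, a single space separates the pieces, and
the helper names are hypothetical.  The test buffer described above was
built by hand and may have differed in these details.

```python
# Sketch only: build one long line of repeated bidirectional text with
# each Arabic word enclosed in bidi control characters.  Wrapping order
# and separators are assumptions, not taken from the original report.
RLE = "\u202b"  # RIGHT-TO-LEFT EMBEDDING
PDF = "\u202c"  # POP DIRECTIONAL FORMATTING


def is_arabic_word(word):
    """True if the word contains a character from the Arabic block."""
    return any("\u0600" <= ch <= "\u06ff" for ch in word)


def wrap_arabic_words(text):
    """Enclose each space-separated Arabic word in RLE ... PDF."""
    words = text.split(" ")
    return " ".join(RLE + w + PDF if is_arabic_word(w) else w for w in words)


def build_test_line(arabic, latin="Hello", copies=1000):
    """Return a single line made of `copies` repetitions of the pair."""
    unit = wrap_arabic_words(arabic) + " " + latin
    return " ".join([unit] * copies)


if __name__ == "__main__":
    # Stand-in Arabic greeting; substitute the string from etc/HELLO.
    line = build_test_line("\u0633\u0644\u0627\u0645 \u0639\u0644\u064a\u0643\u0645")
    print(len(line), line.count(RLE), line.count(PDF))
```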

Steve Berman

Attachment: txtoGHSR1bmsu.txt
Description: gdb backtrace

