[Top][All Lists]

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Bug-readline] Pasting long lines is slow

From: Chet Ramey
Subject: Re: [Bug-readline] Pasting long lines is slow
Date: Tue, 26 May 2015 16:08:41 -0400
User-agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.10; rv:31.0) Gecko/20100101 Thunderbird/31.6.0

On 5/15/15 2:33 PM, Ole Laursen wrote:

> I did a profile now on examples/fileman with perf:
> https://perf.wiki.kernel.org/index.php/Tutorial#Sampling_with_perf_record
> Here are the results from pasting 5000 x "aaa " with my default locale
> da_DK.UTF-8. As is evident most of the time is spent measuring the
> widths of the a's:

Yes, the redisplay code computes all character widths to determine where to
break lines and manage terminal line wrapping.

>   21,22%  fileman  libc-2.19.so       [.] __gconv_transform_utf8_internal
>   20,60%  fileman  libc-2.19.so       [.] wcwidth
>   19,80%  fileman  libc-2.19.so       [.] __mbrtowc

That's apparently very expensive using glibc on Linux.

> I'm assuming that readline does a redisplay on each inserted char? In
> this redisplay, it goes through all characters on the line inserted so
> far, so the end result is O(n^2).
> In a multi-byte case, examining each character is obviously much more 
> expensive.

Yes, apparently.

> Notice that in even with LANG=C, __ctype_get_mb_cur_max accounts for
> 61.7% of the time. As far as I can tell, this comes from MB_CUR_MAX
> which despite the naming is not a constant:
> http://man7.org/linux/man-pages/man3/MB_CUR_MAX.3.html

Other systems, such as Mac OS X or FreeBSD, manage to avoid a function
call, but it's definitely not guaranteed to be a compile-time constant.
Since it only changes on a successful call to setlocale(), it's possible
to avoid function calls here.  glibc just doesn't, so we should work
around that.

> As a quickfix for that, I'm attaching a patch to only compute
> MB_CUR_MAX once in the top of the function. With this patch, the
> LANG=C profile is noticeable faster and looks like this

Thanks, this is part of the fix for this.  A larger part is to make
the calls to wcwidth cheaper on Linux.

> I suppose one could try to make the loop content faster somehow, but
> to really fix this to not be O(n^2) one should only do the work
> necessary to take the single added character into account instead of
> starting off from scratch each time. Or perhaps try to tackle this
> case of copy-pasting a bucket-load of characters by batching them?

The readline redisplay engine is very decoupled from the buffer
manipulation engine: it gets only the contents of the buffer that need
to be displayed and is responsible for keeping track of the current
physical screen contents, computing the differences, and performing
the screen updates.  It makes no assumptions about what might have
happened to the line buffer between calls to rl_redisplay.

It's pretty efficient: it knows how to suppress redisplay for identical
screen lines and to only append new characters to existing lines when
that makes sense.

There aren't currently any mechanisms to hint that the only difference
between the previous line buffer and the current line buffer is some
appended characters; it has to compute the line(s) that will be displayed
on the screen before it knows that.  As I said above, part of the
process involves figuring out character widths to correctly compute line
breaks.  It might be possible to add an internal interface to tell the
redisplay engine that the only thing changing is adding characters at
the end of the line and change behavior accordingly, but I am not sure
that is worth the effort.  If you're going to paste 20,000 characters,
you might as well spend the extra 5 or 6 seconds to turn off line editing
before you do it.

``The lyf so short, the craft so long to lerne.'' - Chaucer
                 ``Ars longa, vita brevis'' - Hippocrates
Chet Ramey, ITS, CWRU    address@hidden    http://cnswww.cns.cwru.edu/~chet/

reply via email to

[Prev in Thread] Current Thread [Next in Thread]