bug#11822: 24.1; emacsclient terminal mode captures escape characters as

bug-gnu-emacs

From:	Ken Raeburn
Subject:	bug#11822: 24.1; emacsclient terminal mode captures escape characters as text
Date:	Fri, 11 Sep 2015 19:11:30 -0400
User-agent:	Gnus/5.13 (Gnus v5.13) Emacs/24.3.93 (gnu/linux)

Eli Zaretskii <eliz@gnu.org> writes:

>> >> [...preempting redisplay...]
>> > We had this ability in the past, but we all but deleted it, as it
>> > seemed not to make redisplay more responsive. [...]
>> 
>> > But it could be that we didn't look at the effect of this when frames
>> > are displayed via X over slow networks.  So please try experimenting
>> > with an Emacs before the deletion (I think 24.4 is old enough), and
>> > see if setting redisplay-dont-pause to nil helps in your case.  If it
>> > doesn't, then what you suggest above is probably not an idea that will
>> > yield tangible benefits.
>> 
>> Interesting.
>> 
>> Hmm... looking at the code (24.3.93 is what I have handy), it looks like
>> update_frame is the interesting point where this is checked, which is
>> called from redisplay_internal after the point where prepare_menu_bars
>> currently triggers recompute_basic_faces calls across the frames.
>> 
>> So if I understand this right, enabling this preemption (setting
>> redisplay-dont-pause to nil) would let new user input preempt redisplay
>> of some frames, but only after prepare_menu_bars has already caused the
>> color-related round-trips to happen. That seems to be the slow part, so
>> I'm not sure what I should expect to see happen differently.
>
> It should avoid redisplaying other frames when input is available
> after redisplaying the first one.  Your original complaint was (and
> still is, as far as I'm concerned) that Emacs was unnecessarily trying
> to redisplay frames on another display, which caused undue network
> round-trips for communicating with X.  Setting that variable to nil
> should at most only redraw the selected frame when user input is
> available.  If you perform this experiment when the frame(s) in need
> of network communications are on the inactive display, you should see
> whether the effect is tangible.  If it isn't, trying to refrain from
> displaying frames on other displays, per one of your suggestions, will
> not be very effective, perhaps not at all, and we need to look for
> other ways to optimize this use case.

Well... if by "redisplay" we include both the face realization and text
drawing, yes, that's my complaint, but no, I don't think the flag will
avoid *all* of the redisplay mechanism for frames after the first, only
the text-drawing part. If we get as far as redrawing text on any of the
frames, including the selected one, it looks to me like by that point
we've already updated faces on any frames for which we think we need to,
and that's the part that makes it slow.

So at best I'd expect it to "save" an unnoticeable amount of time: the
time it previously spent sending off text updates to other frames,
which doesn't involve waiting for any X server replies anyway.
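
To make that expectation concrete, here is a toy model of one redisplay
cycle as I read it from the description above. It is a sketch under
stated assumptions, not Emacs code: model_redisplay, struct
redisplay_cost and the per-frame round-trip count are all made up for
illustration. The only point is that the blocking color queries happen
before the preemption check, so allowing preemption changes how many
frames get drawn but not how many round trips we wait for.

```c
#include <stdbool.h>

/* Hypothetical, highly simplified model of one redisplay cycle.
   None of these names exist in Emacs.  */
struct redisplay_cost
{
  int face_round_trips;  /* blocking color queries (need server replies) */
  int frames_drawn;      /* frames whose text was actually redrawn */
};

struct redisplay_cost
model_redisplay (int nframes, int round_trips_per_frame,
                 bool allow_preempt, bool input_pending)
{
  struct redisplay_cost cost = { 0, 0 };

  /* Phase 1 (assumed): faces are realized for every frame before any
     drawing starts, and nothing checks for pending input here.  */
  for (int i = 0; i < nframes; i++)
    cost.face_round_trips += round_trips_per_frame;

  /* Phase 2 (assumed): drawing.  Frame 0 stands for the selected frame
     and is always drawn; later frames are skipped when preemption is
     allowed and input is pending.  */
  for (int i = 0; i < nframes; i++)
    {
      if (i > 0 && allow_preempt && input_pending)
        break;
      cost.frames_drawn++;
    }

  return cost;
}
```

With three frames and pending input, the model draws three frames
without preemption and one with it, but the face_round_trips total is
identical in both cases.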

(Also: Iconifying the remote frame doesn't seem to help. In
prepare_menu_bars, when all_windows is set but some_windows is not,
x_consider_frame_title, which can trigger face realization, *is* still
called for iconified frames.)

>> >> >> Would changing sizes for a face cause the face to be recomputed from
>> >> >> scratch?
>> >> >
>> >> > It doesn't in my testing (I tried "C-x C-+").  You can easily try that
>> >> > yourself: put a breakpoint on recompute_basic_faces, and see if it
>> >> > breaks when you change the face size.
>> >> 
>> >> I tried it in the scratch buffer in a new Emacs process. It doesn't call
>> >> recompute_basic_faces, but it did call realize_face twice, and
>> >> XParseColor and x_alloc_nearest_color_1 each four times. So that's eight
>> >> round trips that seem unnecessary as we should already have the color
>> >> definitions and allocated color cells.
>> >
>> > For how many frames were XParseColor and x_alloc_nearest_color_1
>> > called?  If only for the single frame where you've changed the face,
>> > then it's expected.
>> 
>> In that test I had only one frame.
>
> Then these calls cannot be avoided with the current design of face
> realization.  IOW, creating a single frame where communications with X
> are over a slow network will always be slow, unless we radically
> change the design and implementation of faces.

We can reduce the number of round trips. If we only ever use 13 distinct
named colors, in an ideal world we could make do with 13 round trips to
allocate them in the colormap. But then we need an additional level of
reference counting or garbage collection on the client (Emacs) side.
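
A minimal sketch of what that client-side reference counting could look
like, assuming a small fixed-size table keyed by color name. Everything
here is hypothetical (struct color_cache, cache_alloc_color,
cache_free_color), and fake_server_alloc is only a stand-in for the
blocking X request so the round trips can be counted. Real code would
also have to deal with colormaps and with X color names being
case-insensitive, which plain strcmp ignores.

```c
#include <string.h>

#define MAX_COLORS 64

/* Hypothetical cache entry: one per distinct color name in use.  */
struct color_entry
{
  char name[32];
  unsigned long pixel;
  int refcount;          /* client-side count; 0 means the slot is free */
};

/* Must be zero-initialized before first use.  */
struct color_cache
{
  struct color_entry entries[MAX_COLORS];
  int server_round_trips;    /* blocking allocation requests sent */
  int server_free_requests;  /* free requests sent */
};

/* Stand-in for a blocking allocation request to the server.  The
   "pixel" is just a hash of the name so the sketch is self-contained.  */
static unsigned long
fake_server_alloc (struct color_cache *c, const char *name)
{
  unsigned long h = 5381;
  c->server_round_trips++;
  for (; *name; name++)
    h = h * 33 + (unsigned char) *name;
  return h;
}

/* Return 1 and store the pixel on success, 0 if the table is full or
   the name is too long.  Only the first allocation of a given name
   reaches the server; later ones just bump the local count.  */
int
cache_alloc_color (struct color_cache *c, const char *name,
                   unsigned long *pixel)
{
  int free_slot = -1;

  for (int i = 0; i < MAX_COLORS; i++)
    {
      struct color_entry *e = &c->entries[i];
      if (e->refcount > 0 && strcmp (e->name, name) == 0)
        {
          e->refcount++;
          *pixel = e->pixel;
          return 1;
        }
      if (e->refcount == 0 && free_slot < 0)
        free_slot = i;
    }

  if (free_slot < 0 || strlen (name) >= sizeof c->entries[0].name)
    return 0;

  struct color_entry *e = &c->entries[free_slot];
  strcpy (e->name, name);
  e->pixel = fake_server_alloc (c, name);
  e->refcount = 1;
  *pixel = e->pixel;
  return 1;
}

/* Drop one reference.  Return 1 if this was the last one, in which
   case a single free request would go to the server.  */
int
cache_free_color (struct color_cache *c, const char *name)
{
  for (int i = 0; i < MAX_COLORS; i++)
    {
      struct color_entry *e = &c->entries[i];
      if (e->refcount > 0 && strcmp (e->name, name) == 0)
        {
          if (--e->refcount > 0)
            return 0;
          c->server_free_requests++;
          return 1;
        }
    }
  return 0;
}
```

Allocating "red" twice through this table costs one round trip rather
than two, and the server only hears about the free once the second
reference is dropped.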

>> > If you are suggesting to be more selective wrt what exactly needs to
>> > be recomputed during face update, then this will need some analysis
>> > regarding which parts are more expensive than others, and introduction
>> > of corresponding flags to signal which face aspects need to be
>> > recomputed.  Assuming this is even possible without a more or less
>> > complete rewrite of face-related code (which currently just throws
>> > away a face and realizes it anew), the relative cost (in terms of
>> > time) of recomputing each aspect will most probably be different for
>> > different display back-ends, perhaps even for different network
>> > bandwidths.  Someone™ should do this research and publish the results,
>> > before we could start designing a good solution.
>> 
>> I don't think I'd try anything that fancy. Realizing faces from scratch
>> is probably fine as long as that can be made fast enough in most
>> reasonable cases.
>
> ??? Doesn't this contradict your measurements above, where X
> calls for even a single frame take too long?  How can we make this
> faster without changing how faces are realized?  What am I missing?

Sorry, bad editing job. I initially wrote "I don't think I'd try
anything that fancy at first." Start with reducing unnecessary frame
updates and face cache invalidations (reducing the number of face
realization calls), and redundant color lookups (which could make
realizing any single face anew faster), and only if that's not enough
should we consider more complicated stuff like the above.

>
>> >> > Given this general description, what would "lower priority" mean in
>> >> > practice?
>> >> 
>> >> Reorder the frame traversal.
>> >
>> > Since the goal is to limit redisplay to a single frame, the current
>> > one, I think this is a moot point.
>> 
>> Limiting redisplay to one frame when possible will go a long way. There
>> will still be cases that need to update multiple frames (like changing
>> faces used on multiple frames), but they may be infrequent enough that
>> we don't need to worry about it.
>
> I learned the hard way to solve problems one at a time, starting with
> the one that gives the most benefit.

I think you're right... I see multiple types of behavior that are more
sluggish than I'd like, and what look like multiple possible
optimization areas, but some of them are overlapping (not all, I think),
and it's just adding confusion by mixing them all up in the discussion
at once. I should just keep a list of things to revisit later, or at
least as separate issues not linked to this bug report. (New bug
reports? Emacs-devel threads?)

>> >> [over 200 color-allocation calls]
>> In case I wasn't clear, the numbers above were for creating the initial
>> frame
>
> Then please help me understand why creating a single frame needs so
> many color-related calls on X.  I know very little about X GUI
> performance, and the person who was our X expert is no longer on
> board, sadly.

I don't know yet, but it's on my list. :-)

I'm no X performance expert either, but some of the relevant bits I
remember from looking at X11 in the past:

 * Many requests, including text drawing, need no reply. The client
   sends the request and continues about its business. Maybe an error
   response comes back later, maybe not. A "Sync" request is available
   if the client needs to know that the server has completed everything.

   So text updates, unless they're big enough to fill socket buffers, or
   cause so much work for the server that they hold up processing of
   later requests the client needs responses to, aren't dependent on
   network latency. The client can send "draw these strings at these
   positions" and then go off and do something else.

 * Some requests do require replies from the server, like color lookup
   (return RGB data from server-side database) or color cell allocation
   (return pixel ID), and the library is not generally designed to allow
   for pipelining of these requests; it'll wait for the reply before
   returning control to the application.

 * Color allocations on the server are reference-counted. Multiple
   allocations for the same color will (likely) return the same pixel
   value, and require multiple "free" requests to make it considered
   unused again.

 * Some types of displays are limited in the number of colors they can
   show, so being a good X neighbor involves freeing colors (the correct
   number of times) if you're done with them and not about to shut down
   the connection, unless you're using a private colormap, in which case
   the only foot you might shoot is your own. I'm not sure if these
   display types are at all common any more.

For a local X server, the round trip time is cheap enough that it's
probably good enough to generate all the extra traffic and accept the
(very brief) waits for replies, and just let the X server deal with
maintaining the reference counts on color cells.
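
As a rough illustration of the points above, here is a small model that
treats a redisplay as a sequence of requests and only charges latency
for the ones that need a reply. The names (enum request_kind,
blocked_time_ms) are invented for this sketch, and it deliberately
ignores bandwidth, socket buffering, and server processing time, so it
is a lower bound on the waiting rather than a measurement.

```c
#include <stddef.h>

enum request_kind
{
  REQ_ONE_WAY,      /* e.g. text drawing: sent, no reply awaited */
  REQ_NEEDS_REPLY   /* e.g. color lookup or color cell allocation */
};

/* Estimate how long (in ms) the client sits blocked waiting for
   replies, assuming each reply-bearing request costs one full round
   trip and requests are not pipelined.  */
double
blocked_time_ms (const enum request_kind *reqs, size_t n, double rtt_ms)
{
  double total = 0.0;

  for (size_t i = 0; i < n; i++)
    if (reqs[i] == REQ_NEEDS_REPLY)
      total += rtt_ms;

  return total;
}
```

In this model, a batch of two draws, a lookup, an allocation, and
another draw blocks for two round trips regardless of how many drawing
requests are mixed in.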

>> I've hit other cases that don't involve multiple frames. Tooltip window
>> popups involve way too many round trips, and highlighting and
>> un-highlighting parts of the mode line as the mouse is moved through it
>> can sometimes (not sure what the circumstances are) trigger color
>> queries on every change.
>
> Please tell which calls take the lion's share of time in these
> scenarios, and please show the backtraces for those calls.

Also on my list.

>> >> Eliminating unnecessary cache clearing might reduce these, maybe by a
>> >> factor of two or more, but that's still excessive
>> >
>> > What is the estimate of "factor of two or more" based on?  Why
>> > not by an order of magnitude, for example?
>> 
>> My email a few days ago with the gdb stack traces showed three expensive
>> calls to recompute_basic_faces (plus three very cheap ones) in setting
>> up the initial frame.
>
> I asked back then why there are multiple calls, since the first call
> resets the face_change flag.  I asked you to try to figure out which
> code sets the flag again.  I'd still appreciate the answer to that.

Sorry, I must have overlooked that.

>> Three might be a better estimate than two, but I would be very
>> surprised (but pleased) to get much more than that this way.
>> 
>> There's no other frame at this point for there to be queries generated
>> for, so any code changes from "recompute on all frames" to "recompute on
>> the current frame" probably won't change this.
>
> You forget that setting the face_change flag also causes a complete
> redrawing of a frame, so avoiding that might produce more benefits.
>
> IOW, until we really measure the difference, we will never know what
> is the factor.  It could even be (a disappointing) 1.1.

Yes, I was just guessing at those numbers.

>> I was also thinking of the round-trips involved in the image (tool-bar
>> icon) handling, where x_disable_image caused a lot of round trips, but
>> that's not actually XParseColor and XAllocColor, it's XQueryColors, and
>> I haven't looked at whether there's redundant work there. Given that
>> there were nearly 100 round trips, I hope there is, since then we might
>> be able to eliminate some of them.
>
> Please show the relevant data.  Images are also cached, AFAIR, so I'd
> expect not to see unnecessary calls.

That's why overall I'd expect the reduction in round trips to be less
than a factor of 3, since not all of the
round trips come from face recomputation. I originally erred in
conflating this with color-handling requests specifically.

>> The color lookups are currently done for each face independently. If
>> multiple faces use the same colors, we'll have multiple requests to the
>> X server for those colors.
>
> That should be part of redesigning how faces are realized.  But you
> rejected the idea, so how do you expect this to happen?

Let the generic face code continue to process each face independently.
It doesn't need to know that the X11 layer is providing the information
out of a cache in "struct x_display_info". Such a cache might also help
with lookups from image-processing code too.

>> Consider: 15 faces times at least 2 colors per face (foreground and
>> background, plus maybe box, underline, overline, and strike-through),
>> times 2 round trips per color (LookupColor, assuming not a hex RGB
>> value, and AllocColor), is at least 60 round trips. At 30 ms per round
>> trip that's 1.8 seconds. Using XAllocNamedColor instead of
>> XParseColor+XAllocColor would cut that in half, but it would still be
>> very noticeable.
>
> Out of curiosity: is this all true for the Cairo build as well?  Or
> does that build save us these round-trips?

I've no idea; I'll have to look into that.
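
For reference, the back-of-the-envelope numbers from my earlier
estimate (15 faces, 2 colors each, 30 ms per round trip) can be written
out as below. The function and enum names are just for illustration;
the per-color costs are the ones stated in the estimate, i.e. two round
trips for XParseColor plus XAllocColor on a named color, and one for
XAllocNamedColor.

```c
/* Round trips needed per named color under each approach, as assumed
   in the estimate.  */
enum alloc_strategy
{
  ALLOC_NAMED = 1,       /* XAllocNamedColor */
  PARSE_THEN_ALLOC = 2   /* XParseColor followed by XAllocColor */
};

/* Total blocking round trips to set up colors for all faces, with no
   caching or sharing between faces.  */
int
face_color_round_trips (int faces, int colors_per_face,
                        enum alloc_strategy strategy)
{
  return faces * colors_per_face * (int) strategy;
}

/* Time spent waiting, in seconds, for a given round-trip time in ms.  */
double
round_trip_delay_seconds (int round_trips, double rtt_ms)
{
  return round_trips * rtt_ms / 1000.0;
}
```

Plugging in 15 faces and 2 colors per face gives 60 round trips, or
about 1.8 seconds at 30 ms each; the XAllocNamedColor variant gives 30
round trips, about 0.9 seconds.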

Ken