emacs-devel

Re: emacs rendering comparisson between emacs23 and emacs26.3


From: Dmitry Gutov
Subject: Re: emacs rendering comparisson between emacs23 and emacs26.3
Date: Tue, 21 Apr 2020 04:41:42 +0300
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:68.0) Gecko/20100101 Thunderbird/68.4.1

Hi Alan,

On 21.04.2020 00:19, Alan Mackenzie wrote:

Is C++ syntax so ambiguous? Can R"( mean something else?

C++ syntax is ambiguous in several places, but not here.  R"foo(,
assuming it's not itself inside a literal, means exactly one thing.  But
you don't know whether it starts an unterminated string unless you search
for a closing delimiter.
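
To make the search described above concrete, here is a minimal sketch in Python. It is not CC Mode's code: the function name find_raw_string_end and the simplified delimiter rule (at most 16 characters, none of them a space, parenthesis, backslash or whitespace control character) are assumptions for illustration, and encoding prefixes such as u8R are ignored.

```python
import re

# Simplified opener: R"delim( with a delimiter of 0 to 16 characters.
RAW_OPENER = re.compile(r'R"([^ ()\\\t\n\v\f]{0,16})\(')

def find_raw_string_end(text, opener_pos):
    """Return the index just past the closing delimiter of the raw string
    opened at opener_pos, or None if the raw string is unterminated."""
    m = RAW_OPENER.match(text, opener_pos)
    if m is None:
        raise ValueError("no raw string opener at this position")
    # The only thing that ends R"foo( is the exact sequence )foo"
    closer = ")" + m.group(1) + '"'
    end = text.find(closer, m.end())
    return None if end == -1 else end + len(closer)
```

The opener is recognised locally, but deciding "terminated or not" needs the forward search in the last two lines, which in the worst case runs to the end of the buffer.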

Then you can put a certain syntax-table value on it either way. Not (string-to-syntax "\""), though.

This affects the font locking.  (An unterminated opening delimiter
gets font-lock-warning-face, a terminated one doesn't.)

If everything after R"( is fontified as a string, it serves as a
"warning" of sorts as well.

That means putting syntax-table properties on every " after the R"foo(,
otherwise that string would only extend to the first such ".

Nope. You will put (string-to-syntax "|") on the raw string opener. It doesn't match "ordinary" quotes.

Same procedure for a simple string - if it's a terminated string the "
gets font-lock-string-face, if it's not it gets f-l-warning face.

As discussed in nearby messages, this should be doable in a generic way outside of CC Mode.

I mean that if a raw string is unterminated, the default behavior should
be to fontify the rest of the buffer as string. But then again, you
could choose some different highlighting in font-lock rules.

The current strategy is to fontify the unterminated R"foo( with
warning-face, and let the devil deal with the rest of the string (i.e. no
attempt is made to apply syntax-table properties).  The first portion of
the raw string will indeed get string-face.

As soon as the closing delimiter is typed, the warning-face is removed
from the opener and syntax-table text properties applied throughout the
string.  The entire string then gets string-face.

That would change. But within limits that are acceptable for you, hopefully.

It can't become empty.  after-change-functions is fine for dealing
with insertions, but can't do much after a deletion.  Consider the
case where you're in a string and all you know is that 5 characters
have been deleted.  Those characters might have been )foo", so after
checking the beginning of the string starts off with R"bar(, you've
then got to scan a long way forward looking for )bar".  Effectively
every deletion within a string would involve scanning to the end of
that string.

This is an example of extra complexity you have to retain to implement
the above feature the way you do.

It will become more complex and slower, if information from
before-change-functions is ignored, or discarded.  The alternative is,
after each deletion, to scan forward checking that the terminating
delimiter still exists.  This is slower and more complicated than
checking in b-c-f whether it's about to be removed.

Removing micro-optimizations makes things slower only when they apply to actual bottlenecks. And the bottlenecks can move when the approach changes.
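
The two strategies being compared in this exchange can be sketched as follows, in Python rather than Emacs Lisp. Both function names and the idea of passing the closer as a string are assumptions for illustration; CC Mode's actual implementation lives in cc-engine.el and works differently in detail.

```python
def closer_touched_by_deletion(text, closer, del_start, del_end):
    """Before-change view: does an occurrence of closer overlap the
    non-empty region text[del_start:del_end] that is about to go away?
    Only a small window around the deletion is inspected."""
    lo = max(0, del_start - len(closer) + 1)
    hi = min(len(text), del_end + len(closer) - 1)
    return closer in text[lo:hi]

def still_terminated_after(text_after, opener_end, closer):
    """After-change view: the deleted text is gone, so the only way to
    know is to search forward from the end of the opener."""
    return text_after.find(closer, opener_end) != -1
```

The first check costs roughly the size of the deletion plus the length of the closer; the second can cost the length of the rest of the buffer. Note that the local check only detects a terminator being destroyed. A deletion that joins two fragments into a new terminator would need separate handling.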

It's probably also an example of how before/after-change-functions
essentially duplicate the knowledge of language syntax. I'm guessing
here, but to make it work like that, you need to have multiple functions
"understand" the raw string syntax.

b/a-c-f implement the language syntax.  It's one of the places the
language is codified.  The mechanism is in several functions, yes.  If
you're interested, go into cc-engine.el and search for "raw string".

Your confirmation is good enough for me for now, thank you.

Whereas with syntax-propertize-function, that knowledge is concentrated
in one place (maybe two, if font-lock rules do something unusual). This
way, the code is simplified.

No, it gets complicated, assuming no loss of functionality.  A given
amount of functionality would get squashed into a smaller place.  The
current implementation (of C++ raw strings) is optimised such that normal
insertion and deletion don't cause the s-t properties on the entire
string to be modified.  That requires details of the buffer before the
insertions and deletions.

True. So maybe Emacs will do "extra" work. It still might turn out to be faster for most user interactions.

In principle, the speed-up will come from:

- Deferred execution (where several buffer changes can be handled
together and not right away),

I've never been wholly convinced by laziness.  Sooner or later these
changes need to be handled, and delaying them is not going to accelerate
them.

It's the difference between keeping the whole buffer up-to-date and only doing that for a limited range of chars.

- No parsing the buffer contents much farther than the current window,
in most cases. Which can speed up the majority of user actions. The
exceptions will remain slower, but that is often a good tradeoff.

This will involve loss of functionality, as already noted.  And bugs;
whilst typing in normal text, CC Mode has to search backwards for a safe
place, otherwise context fontification can mess things up.  This is an
area where optimisation would be useful.

That's where syntax-ppss comes in.
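
The deferred-execution argument above can be sketched with a small Python model. It is only loosely inspired by how a "done up to here" position can be kept and flushed on changes; the class LazyPropertizer, its method names, and the scan callback are hypothetical and do not mirror any Emacs API.

```python
class LazyPropertizer:
    """Keeps properties valid only up to a watermark, on demand."""

    def __init__(self, scan):
        self.scan = scan   # hypothetical callback: scan(start, end)
        self.done = 0      # everything before this position is up to date

    def after_change(self, pos):
        # A change only lowers the watermark; nothing is scanned here.
        self.done = min(self.done, pos)

    def ensure(self, pos):
        # Called by a consumer (redisplay, say) that needs valid
        # properties up to pos. Text beyond pos is never touched.
        if pos > self.done:
            self.scan(self.done, pos)
            self.done = pos
```

A short usage example: three edits followed by one request result in a single scan, and it stops at the requested position.

```python
calls = []
p = LazyPropertizer(lambda start, end: calls.append((start, end)))
p.ensure(100)          # initial display: scans (0, 100)
p.after_change(40)
p.after_change(60)
p.after_change(50)
p.ensure(80)           # one scan for all three edits: (40, 80)
```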

And I'm not sure where the proof of the syntax-propertize mechanism
being helpful is.  Has anybody but its originator positively chosen
to use it, whilst being aware of the alternatives?

The alternatives being reinventing the relevant logic from zero in each
major mode? And writing syntax caching logic each time?

Or writing and using a better framework.

Well, um. You didn't write a generic framework that would help all major mode writers yourself. It's a bit too much to expect that from others either.

The question remains: has anybody other than Stefan M. freely chosen to
use syntax-ppss and syntax-propertize-function, whilst being aware of
their disadvantages and of alternatives?

I'm not him.

Remember, that for an extended period of time syntax-ppss didn't work
properly, and even now it doesn't do the right thing in narrowed buffers,
at least for a programming mode such as CC Mode.  The syntax-propertize
mechanism erases s-t p's in a manner not under the control of the major
mode, which means the major mode needs to implement workarounds (which
are liable to be slow).

Narrowing is a tricky thing. I'd wager that the vast majority of our users don't use it, or use it only in very specific ways. So any related bugs might go unnoticed, or take quite some time to be uncovered.


