bug#38104: 27.0.50; elixir-mode fontification is very slow

From:	Dmitry Gutov
Subject:	bug#38104: 27.0.50; elixir-mode fontification is very slow
Date:	Wed, 27 Nov 2019 23:58:46 +0200
User-agent:	Mozilla/5.0 (X11; Linux x86_64; rv:60.0) Gecko/20100101 Thunderbird/60.9.0

Hi Mattias,

On 26.11.2019 21:32, Mattias Engdegård wrote:

As it turned out, rx is fine (now); elixir-mode, not quite. In elixir-mode.el, 
we have

       (identifiers . ,(rx (one-or-more (any "A-Z" "a-z" "_"))
                           (zero-or-more (any "A-Z" "a-z" "0-9" "_"))
                           (optional (or "?" "!"))))

First, this regex is suboptimal: the first character of an identifier should 
occur exactly once, or you get bad backtracking behaviour. Just remove the 
one-or-more construct:

       (identifiers . ,(rx (any "A-Z" "a-z" "_")
                           (zero-or-more (any "A-Z" "a-z" "0-9" "_"))
                           (optional (or "?" "!"))))

This definition is then used in several places, but two in particular are of 
interest to us:

     ;; Module attributes
     (,(elixir-rx (and "@" (1+ identifiers)))

The construct (1+ identifiers) was perhaps meant to match multiple identifiers, 
but it doesn't (no separator); it just matches an identifier in several ways, 
which again leads to bad backtracking behaviour.
The same problem here:

     ;; Map keys
     (,(elixir-rx (group (and (one-or-more identifiers) ":")) space)

Remove the 1+ and one-or-more and it's fast again.
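
To make "matches an identifier in several ways" concrete, here is a rough sketch in Python (not Emacs Lisp, and not the Emacs regexp engine) that counts how many distinct ways the patterns above can consume the same text. The helper names ways_single and count_parses are made up for this illustration, and the character classes are a hand translation of the rx forms, so treat it as an approximation of the idea rather than a model of what Emacs does internally.

```python
import re

HEAD = "[A-Za-z_]"      # first-character class of an identifier
TAIL = "[A-Za-z0-9_]"   # class for the remaining characters


def ways_single(segment: str, head_repeats: bool) -> int:
    """Count the ways one `identifiers` pattern can match all of `segment`.

    head_repeats=True models the original (one-or-more HEAD) definition,
    head_repeats=False models the fixed definition with exactly one HEAD.
    """
    count = 0
    max_head = len(segment) if head_repeats else 1
    for head_len in range(1, max_head + 1):
        # Pin the head to an exact length; the rest is then unambiguous.
        pattern = f"{HEAD}{{{head_len}}}{TAIL}*[?!]?"
        if re.fullmatch(pattern, segment):
            count += 1
    return count


def count_parses(text: str, head_repeats: bool, outer_repeats: bool) -> int:
    """Count the ways `text` can be matched in full.

    outer_repeats=True models (1+ identifiers): the text may be split into
    any sequence of adjacent identifiers. outer_repeats=False models a
    single `identifiers`.
    """
    if not outer_repeats:
        return ways_single(text, head_repeats)
    n = len(text)
    ways = [0] * (n + 1)
    ways[0] = 1  # the empty prefix: nothing consumed yet
    for end in range(1, n + 1):
        for start in range(end):
            if ways[start]:
                ways[end] += ways[start] * ways_single(text[start:end], head_repeats)
    return ways[n]


print(count_parses("abcd", head_repeats=True, outer_repeats=True))    # original
print(count_parses("abcd", head_repeats=False, outer_repeats=True))   # head fixed only
print(count_parses("abcd", head_repeats=False, outer_repeats=False))  # both fixed
```

For the four-letter text "abcd" this prints 21, 8 and 1. The count grows exponentially with the length of the identifier in the first two cases, and a backtracking matcher may end up trying all of those alternatives when whatever follows (the ":" of a map key, say) fails to match.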

That makes a lot of sense. I removed these one-or-more's and 1+ (and a few others), and it became fast again.


I'll send a patch upstream. Thanks for your help!

(Looking at the tracker, they have a minor version of this change submitted already).

Why did this "work" with the old rx implementation? Because that code had a 
nasty bug: it does not bracket definitions in rx-constituents properly. Example:

(let ((rx-constituents (cons '(hello . "HELLO") rx-constituents)))
   (rx-to-string '(1+ hello) t))
=> "HELLO+"

The new rx implementation does not suffer from this bug.

The result in your case is that the old rx, when translating (1+ identifiers), only 
tacked the "+" onto whatever regexp 'identifiers' produced, resulting in

"[A-Z_a-z]+[0-9A-Z_a-z]*[!?]?+"

which is a lot faster, since only the final [!?] is repeated twice (and it 
probably doesn't match very often).
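
The bracketing bug is just a matter of how the repetition operator gets attached to the translated string. A minimal sketch of the two behaviours, written in Python as plain string manipulation (the strings are Emacs regexp syntax and are never compiled here; the function names are hypothetical and do not correspond to anything in rx.el):

```python
def repeat_unbracketed(regexp: str) -> str:
    """Mimic the old behaviour described above: append "+" to the raw string."""
    return regexp + "+"


def repeat_bracketed(regexp: str) -> str:
    """Wrap the operand in an Emacs shy group so "+" applies to all of it."""
    return r"\(?:" + regexp + r"\)+"


# The identifiers regexp as the original elixir-mode definition would translate.
identifiers = "[A-Z_a-z]+[0-9A-Z_a-z]*[!?]?"

print(repeat_unbracketed("HELLO"))      # only the final "O" is repeated
print(repeat_bracketed("HELLO"))        # the whole word is repeated
print(repeat_unbracketed(identifiers))  # the string quoted above
print(repeat_bracketed(identifiers))    # the slow, correctly bracketed form
```

The unbracketed form is accidentally cheap because the trailing "+" binds only to the last element, while the bracketed form is what (1+ identifiers) actually asks for.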

It's funny to think how someone probably beat the current code into submission by trial and error.
