bug-gnu-emacs

bug#16800: 24.3; flyspell works slow on very short words at the end of b


From: Aleksey Cherepanov
Subject: bug#16800: 24.3; flyspell works slow on very short words at the end of big file
Date: Mon, 24 Feb 2014 03:02:51 +0400
User-agent: Mutt/1.5.21 (2010-09-15)

I've performed some tests against my .org file (not in emacs -Q):

(insert
 (mapconcat (lambda (re)
              ;; Count the matches of RE from point back to the beginning
              ;; of the buffer and time the whole search.
              (save-excursion
                (let ((time (current-time))
                      (count 0))
                  (while (re-search-backward re nil t)
                    (setq count (1+ count)))
                  ;; One output line per regexp: MATCHES: ELAPSED :: REGEXP
                  (format "%d: %S :: %s"
                          count (time-subtract (current-time) time) re))))
            '("\\<[[:alpha:]]"
              "\\b[[:alpha:]]"
              "\\([^[:alpha:]]\\|\\b\\)[[:alpha:]]"
              "\\([^[:alpha:]]\\|\\`\\)[[:alpha:]]"
              "\\(?:[^[:alpha:]]\\|\\`\\)[[:alpha:]]"
              "\\(?:[^[:alpha:]]\\)[[:alpha:]]"
              "[^[:alpha:]][[:alpha:]]"
              "\\(?:\\b\\|'\\)[[:alpha:]]"
              "\\(?:[^[:alpha:]]\\|\\`\\)\\([[:alpha:]]+\\)"
              "\\([^[:alpha:]]\\|\\`\\)\\(?:[[:alpha:]]+\\)"
              "\\([^[:alpha:]]\\|\\`\\)[[:alpha:]]+")
            "\n"))

Matches: Time (high low usec psec) :: Regexp tried
299158: (0 2 841190 614000) :: \<[[:alpha:]]
299158: (0 2 876846 547000) :: \b[[:alpha:]]
307919: (0 3 321676 163000) :: \([^[:alpha:]]\|\b\)[[:alpha:]]
307899: (0 3 291931 838000) :: \([^[:alpha:]]\|\`\)[[:alpha:]]
307899: (0 2 821347 257000) :: \(?:[^[:alpha:]]\|\`\)[[:alpha:]]
307899: (0 2 760125 839000) :: \(?:[^[:alpha:]]\)[[:alpha:]]
307899: (0 2 765410 758000) :: [^[:alpha:]][[:alpha:]]
299518: (0 2 998895 976000) :: \(?:\b\|'\)[[:alpha:]]
307899: (0 3 174172 939000) :: \(?:[^[:alpha:]]\|\`\)\([[:alpha:]]+\)
307899: (0 3 250515 907000) :: \([^[:alpha:]]\|\`\)\(?:[[:alpha:]]+\)
307899: (0 3 218270 136000) :: \([^[:alpha:]]\|\`\)[[:alpha:]]+
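
The time column is the raw time list printed with %S. As a small sketch
(the helper name is made up for illustration), float-time converts such
a list into seconds, which makes the rows easier to compare:

```elisp
;; Hypothetical helper: convert a (HIGH LOW USEC PSEC) time list, as
;; printed in the table above, into seconds as a floating-point number.
(defun my-time-list-to-seconds (time)
  (float-time time))

;; First and third rows of the table, in seconds:
(mapcar #'my-time-list-to-seconds
        '((0 2 841190 614000)     ; \<[[:alpha:]], about 2.84 s
          (0 3 321676 163000)))   ; \([^[:alpha:]]\|\b\)[[:alpha:]], about 3.32 s
```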

I have to admit that searching by word boundaries breaks things even
for a setup with [[:alpha:]]: "a0a" is one word for Emacs but two words
for flyspell. I missed this because Russian behaves differently (there
is a word boundary between digits and Russian letters). My bad.
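
The "a0a" case can be reproduced with a small sketch. The helper name is
hypothetical, and the example assumes a buffer with the standard syntax
table, where digits have word syntax:

```elisp
;; Hypothetical helper: count matches of RE in TEXT, searching backward
;; from the end of a temporary buffer, as in the test above.
(defun my-count-backward (re text)
  (with-temp-buffer            ; fundamental-mode, standard syntax table
    (insert text)              ; point is now at the end of the buffer
    (let ((count 0))
      (while (re-search-backward re nil t)
        (setq count (1+ count)))
      count)))

;; \b sees no boundary inside "a0a", so only the first letter matches.
(my-count-backward "\\b[[:alpha:]]" "a0a")                          ; => 1
;; The [[:alpha:]]-based group also matches the "a" after the digit.
(my-count-backward "\\(?:[^[:alpha:]]\\|\\`\\)[[:alpha:]]" "a0a")   ; => 2
```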

307899: (0 2 760125 839000) :: \(?:[^[:alpha:]]\)[[:alpha:]]
307899: (0 2 765410 758000) :: [^[:alpha:]][[:alpha:]]
These two suggest that we might gain some speed by not checking for the
beginning of the buffer in the regexp and checking it separately
instead. But I doubt it is worth it.
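
For illustration only, checking the beginning of the buffer separately
could look roughly like the sketch below. The function name is made up,
and this is not what flyspell does today:

```elisp
;; Hypothetical sketch: count word starts in TEXT using the plain regexp
;; "[^[:alpha:]][[:alpha:]]" and handling the beginning of the buffer
;; outside the regexp, instead of with a \` alternative.
(defun my-count-word-starts-split (text)
  (with-temp-buffer
    (insert text)
    (let ((count 0))
      ;; Beginning-of-buffer case, checked once.
      (goto-char (point-min))
      (when (looking-at "[[:alpha:]]")
        (setq count 1))
      ;; Everything else: a letter preceded by a non-letter.
      (goto-char (point-max))
      (while (re-search-backward "[^[:alpha:]][[:alpha:]]" nil t)
        (setq count (1+ count)))
      count)))

(my-count-word-starts-split "a0a b")   ; => 3
```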

On Sun, Feb 23, 2014 at 11:56:59PM +0400, Aleksey Cherepanov wrote:
> Also a non-capturing group ("\\(?:") could be used because we do not
> need the match data of the first group. It should work faster but I
> don't really know.

307899: (0 3 291931 838000) :: \([^[:alpha:]]\|\`\)[[:alpha:]]
307899: (0 2 821347 257000) :: \(?:[^[:alpha:]]\|\`\)[[:alpha:]]
The test shows that the non-capturing group is faster.
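
A comparison like this can also be timed with benchmark-run from
benchmark.el, which reports garbage collection separately. This is only
a sketch, with a made-up function name, to be run in the buffer under
test:

```elisp
(require 'benchmark)

;; Hypothetical helper: search RE backward through the whole current
;; buffer and return (COUNT . SECONDS).
(defun my-time-backward-search (re)
  (save-excursion
    (goto-char (point-max))
    (let* ((count 0)
           ;; benchmark-run returns (ELAPSED GC-COUNT GC-ELAPSED).
           (result (benchmark-run 1
                     (while (re-search-backward re nil t)
                       (setq count (1+ count))))))
      (cons count (car result)))))

;; Capturing vs. non-capturing first group:
(list (my-time-backward-search "\\([^[:alpha:]]\\|\\`\\)[[:alpha:]]")
      (my-time-backward-search "\\(?:[^[:alpha:]]\\|\\`\\)[[:alpha:]]"))
```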

> Maybe it would be faster to capture not the word but one char or
> nothing, but I doubt the difference would be noticeable.

307899: (0 3 174172 939000) :: \(?:[^[:alpha:]]\|\`\)\([[:alpha:]]+\)
307899: (0 3 250515 907000) :: \([^[:alpha:]]\|\`\)\(?:[[:alpha:]]+\)
307899: (0 3 218270 136000) :: \([^[:alpha:]]\|\`\)[[:alpha:]]+
Unexpectedly, capturing the word is a bit faster. Maybe what matters is
not the word but the fact that it is the second group, and it would
behave differently for a forward search. Or using [[:alpha:]]+ instead
of a fixed word caused it. Anyway, the difference is very small.

Capturing the word lets us write a function that wraps a word into a
regexp, the way word-search-regexp wraps a word for the
word-search-forward/-backward functions.
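
A minimal sketch of such a wrapper follows. The name is hypothetical,
and it hardcodes [^[:alpha:]] where flyspell would have to use the
dictionary's not-casechars. It only anchors the start of the word, so
the caller still has to check where the word ends:

```elisp
;; Hypothetical wrapper: build a regexp that matches WORD only when it
;; starts right after a non-letter or at the beginning of the buffer.
;; Group 1 captures WORD itself, so (match-beginning 1) is its start.
(defun my-word-start-regexp (word)
  (concat "\\(?:[^[:alpha:]]\\|\\`\\)"
          "\\(" (regexp-quote word) "\\)"))

;; The "foo" inside "xfoo" is skipped; the standalone "foo" is found.
(with-temp-buffer
  (insert "xfoo foo")
  (goto-char (point-min))
  (when (re-search-forward (my-word-start-regexp "foo") nil t)
    (match-beginning 1)))   ; => 6
```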

> I guess that \b would work faster than the group so we could have 'if'
> statement around the whole loop that has one implementation with \b
> for case when casechars are "[[:alpha:]]" and not-casechars are
> "[^[:alpha:]]" and another implementation as above for other cases.
> But it seems cumbersome.

My guess was wrong: \b is slower than the non-capturing group. It is
also not appropriate here at all.

Thanks!

-- 
Regards,
Aleksey Cherepanov




