bug-gnu-emacs
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

bug#16800: 24.3; flyspell works slow on very short words at the end of b


From: Agustin Martin
Subject: bug#16800: 24.3; flyspell works slow on very short words at the end of big file
Date: Sun, 9 Mar 2014 18:25:46 +0100

2014-03-01 0:11 GMT+01:00 Aleksey Cherepanov <aleksey.4erepanov@gmail.com>:
On Fri, Feb 28, 2014 at 12:45:45PM +0100, Agustin Martin wrote:
>
> Please find attached my first candidate for commit. Is similar to what I
> sent before, but needed to add an explicit check for word at eob in
> `flyspell-word-search-forward'.
>
> Will try to have more testing before committing. Seems to work well with the
> file generated by your one-liner, even with corner cases like new
> misspellings added at bob or eob, but the wider the testing the better.

I've wrote a small fuzzer. It is in attach. To run it:
$ LANG=C emacs -Q --eval '(load-file "t2.el")'
Then C-j to start. It modifies buffer you are in.

Your -forward function gets stuck. (kbd "nd SPC and C-a") could repeat
it. my-test-forward-agustin-fixed contains fix. It incorporates
simplified word-end logic: we slip forward using flyspell-get-word,
then we check eobp. Though I did not understand why -backward does not
need a similar fix and I got the answer: my mistake with (length word)
did not allow one word to be marked as duplicate.

Hi,

Sorry for the delay, I am rather busy lately.

The problem is that final condition can cause an endless loop. I'd prefer to use a flag to make sure that test is passed only once as much, see below
 
(if condition nil ...) could be replaced with (unless condition ...)
but I do not know what one is more readable.

While there was `nil' in the version I attached, the actual code in my personal debug code was not exactly `nil' but something like

(progn (message "word %s is at eob" word) nil)

Together with being more readable, that is why the structure was left there. I have to thing about this, but with the full word-re test we may even get rid of that test and all the flyspell.-get-word call, but I need some time to test all that. If still needed (unless ...) may be used in the final version.
 
(kbd "nd SPC and SPC nd C-b") fails to highlight the second "nd" as
duplicate. It is a problem with bound equal to (length word) in
-backward function. I did not check it when I wrote it.
> +                      (search-forward word (length word) t))))
                         (search-forward word (1+ (length word)) t))))

One "nd" is colored as duplicate due to -backward function after that
fix. I did not touch it yet because it is a time for a break for me.

FIxed in my copy. Later noticed that backward function has some other glitches, like flyspell-get-word looking at wrong word. Fixed in my copy, but I will look at it later.

> Hope no one will generate files with words containing something in
> OTHERCHARS.

Why?

Sorry for not being clear, I was thinking about a really rare corner case current code may not handle well

Write 'nd-et' some thousand times and then write 'et' in a language where 'nd' and 'et' are misspellings and '-' is part of otherchars. I'd expect to have a lot of false positives in search.
 
BTW quite interesting flyspell behaviour could be observed with
"met'met'and": if you jump back and forth over this word then met'met
is highlighted when you are at the beginning and met'and is
highlighted when you are at the end.

Also "met'met'and met'and" highlights both met'and as mis-spelled (the
second met'and is not marked as duplicate).

Funny :-)

I do not expect to have time to check everything shortly. Also, note that Emacs 24.4 is on the way, so trunk is frozen for anything potentially problematic. Unless I consider the code absolutely rock solid, I'd prefer to wait until 24.4 is released.

Sorry I could not yet play with your fuzzer, will try next week. Thanks for all the work you are putting here.

This is what I am using to break the loop in the forward function (see `keep'). min limit in last condition should also be adjusted to bound if not unlimited.

;; --------------------------- 8< ----------------------------------------------------------
(defun flyspell-word-search-forward (word bound)
  (save-excursion
    (let* ((r '())
       (inhibit-point-motion-hooks t)
       (word-end (nth 2 (flyspell-get-word)))
       (flyspell-not-casechars (flyspell-get-not-casechars))
       (word-re (concat flyspell-not-casechars
                (regexp-quote word)
                flyspell-not-casechars))
       (keep t) ;; Control flag to exit loop below after check word at eob.
       p)
      (while
      (and (not r)
           keep
           (setq p (if (= word-end (point-max))
               nil ;; Current word is at e-o-b. No forward search
             (if (re-search-forward word-re bound t)
                 ;; word-re match ends one char after word
                 (progn (backward-char) (point) )
               ;; Check above does not match similar word at e-o-b
               (goto-char (point-max))
               (setq keep nil) ;; Ensure this is last iteration
               (search-backward word (- (point-max)
                            (length word)) t)))))
    (let ((lw (flyspell-get-word)))
      (if (and (consp lw) (string-equal (car lw) word))
          (setq r p)
        (goto-char (1+ p)))))
      r)))
;; --------------------------- 8< ----------------------------------------------------------

Regards,

--
Agustin


reply via email to

[Prev in Thread] Current Thread [Next in Thread]