help-gnu-emacs

Re: Most used words in current buffer


From: tomas
Subject: Re: Most used words in current buffer
Date: Sun, 22 Jul 2018 11:02:42 +0200
User-agent: Mutt/1.5.21 (2010-09-15)

On Sat, Jul 21, 2018 at 02:22:44PM -0400, Stefan Monnier wrote:
> > (defun buffer-most-used-words-2 (n)
> >   "Make a list of the N most used words in buffer."
> >   (let ((counts (avl-tree-create (lambda (wc1 wc2)
> >                                (string< (first wc1) (first wc2)))))
> >     (words (split-string (buffer-string)))
> 
> If you want to go fast, don't use split-string+buffer-string.
> Scan through the buffer and extract each word with buffer-substring directly.
> 
> >       (let ((element (avl-tree-member counts (list (downcase word) 0))))
> 
> I'd use a hash-table (implemented in C) rather than an avl-tree
> (implemented in Elisp).

Plus, a (well-implemented) hash table will generally be faster for inserts
and random lookups than a balanced (AVL, red-black) binary tree: expected
O(1) per operation versus O(log n). What the tree affords you is ordered
access (find the first key greater than X, output in key order).

You pay for that :-)
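
For illustration, here is a minimal sketch that combines both of Stefan's
suggestions: it scans the buffer in place, pulls each word out with
buffer-substring-no-properties, and counts in a hash table. The function
name my-buffer-most-used-words, the "\\w+" definition of a word, and the
(WORD . COUNT) return format are my own assumptions, not code from this
thread. Note the explicit sort at the end: a hash table keeps no order, so
ranking by count has to be done as a separate step.

```elisp
(require 'seq)

;; Hypothetical helper, not from the original thread.
(defun my-buffer-most-used-words (n)
  "Return a list of the N most used words in the current buffer.
Each element is a cons cell (WORD . COUNT), most frequent first."
  (let ((counts (make-hash-table :test #'equal))
        (pairs nil))
    ;; Scan the buffer in place instead of copying it with `buffer-string'.
    (save-excursion
      (goto-char (point-min))
      ;; Assumption: a "word" is a run of word-syntax characters.
      (while (re-search-forward "\\w+" nil t)
        (let ((word (downcase (buffer-substring-no-properties
                               (match-beginning 0) (match-end 0)))))
          ;; Missing keys default to 0, so the first sighting stores 1.
          (puthash word (1+ (gethash word counts 0)) counts))))
    ;; Collect the (WORD . COUNT) pairs; hash tables are unordered.
    (maphash (lambda (word count) (push (cons word count) pairs)) counts)
    ;; Sort by descending count, then keep the first N entries.
    (seq-take (sort pairs (lambda (a b) (> (cdr a) (cdr b)))) n)))

;; Usage on a scratch buffer:
(with-temp-buffer
  (insert "the cat and the dog and the bird")
  (my-buffer-most-used-words 2))
;; => (("the" . 3) ("and" . 2))
```

Words with equal counts come back in no particular order, since the sort
only compares the counts.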

Cheers
-- t



reply via email to

[Prev in Thread] Current Thread [Next in Thread]