
Re: [aspell-devel] Affix compression


From: Kevin Atkinson
Subject: Re: [aspell-devel] Affix compression
Date: Sat, 23 Jul 2005 18:24:57 -0600 (MDT)

On Sat, 23 Jul 2005, Gregory Maxwell wrote:

> I was wondering if anyone has looked at using libjudy
> (http://judy.sourceforge.net/) for storing words with aspell?

No, I have not.  And it is unlikely I will unless someone with a deep
understanding of how Aspell works internally (hint: study readonly_ws.cpp
and suggest.cpp) convinces me it will improve on the existing
implementation.

> Libjudy provides a number of sparse array data structures which
> provide very fast lookups, because they are cache aware, and
> reasonable memory efficiency. There is a function in the libjudy
> package that provides a string indexed array which is quite space
> efficient because it is prefix compressed.
> 
> I don't have a standalone metaphone encoder handy, but just passing
> /usr/dict/words on my Pentium M laptop into JudySL gives 0.304
> µs/word lookups using only 10 MB of core, which is only 2x the size
> of the file.

These numbers are meaningless to me.  I need a comparison with Aspell's
existing implementation.
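
For illustration only, a like-for-like comparison of the kind requested here could be set up with a small timing harness that runs the same query list through each lookup structure. This is a minimal sketch and not part of Aspell or Judy: `usec_per_lookup` is a hypothetical helper, and the `lookup` callable stands in for whatever wraps the structure under test.

```cpp
#include <chrono>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: average wall-clock microseconds per call of
// lookup(word), taken over `rounds` passes through `queries`.
// `hits` receives the number of lookups that returned true, which also
// keeps the calls from being optimised away.
template <typename Lookup>
double usec_per_lookup(const std::vector<std::string>& queries,
                       Lookup lookup,
                       std::size_t rounds,
                       std::size_t& hits) {
    using clock = std::chrono::steady_clock;
    hits = 0;
    if (queries.empty() || rounds == 0) return 0.0;

    const auto start = clock::now();
    for (std::size_t r = 0; r < rounds; ++r) {
        for (const std::string& q : queries) {
            if (lookup(q)) ++hits;
        }
    }
    const std::chrono::duration<double, std::micro> elapsed =
        clock::now() - start;

    return elapsed.count() / static_cast<double>(queries.size() * rounds);
}
```

Running the same `queries` and `rounds` through one callable per implementation, on the same machine, gives figures that can be set side by side. Memory use would have to be measured separately.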

> It would be easy to provide code with this data structure which quickly
> found the longest match and all other entries of the same match
> length, perhaps something which would be useful in Aspell as well...

Maybe.  Since you thought of the idea, the burden of proof is on you.  I
need concrete examples of how using Judy will improve on Aspell's
existing behavior.
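
As a point of reference for what "longest match and all other entries of the same match length" might mean, here is a minimal sketch over a plain sorted `std::vector<std::string>`. It uses neither Judy nor Aspell's word-list code. The helper names are hypothetical, and it reads "match" as longest common prefix with the query, which is an assumption about the proposal.

```cpp
#include <algorithm>
#include <cstddef>
#include <iterator>
#include <string>
#include <vector>

// Length of the common prefix of a and b.
static std::size_t common_prefix(const std::string& a, const std::string& b) {
    const std::size_t n = std::min(a.size(), b.size());
    std::size_t i = 0;
    while (i < n && a[i] == b[i]) ++i;
    return i;
}

// Hypothetical helper. `words` must be sorted and free of duplicates.
// Returns every entry whose common prefix with `query` has the maximum
// length found in the list, or nothing if no entry shares a prefix.
std::vector<std::string> longest_matches(const std::vector<std::string>& words,
                                         const std::string& query) {
    std::vector<std::string> out;
    if (words.empty()) return out;

    // In sorted order the best candidates sit next to the insertion point.
    const auto pos = std::lower_bound(words.begin(), words.end(), query);
    std::size_t best = 0;
    if (pos != words.end())
        best = std::max(best, common_prefix(*pos, query));
    if (pos != words.begin())
        best = std::max(best, common_prefix(*std::prev(pos), query));
    if (best == 0) return out;

    // The shared prefix can only shrink when moving away from `pos`,
    // so widen the range in both directions until it does.
    auto lo = pos;
    while (lo != words.begin() && common_prefix(*std::prev(lo), query) == best)
        --lo;
    auto hi = pos;
    while (hi != words.end() && common_prefix(*hi, query) == best)
        ++hi;

    out.assign(lo, hi);
    return out;
}
```

With the list `car, card, care, cat, dog`, the query `cares` yields only `care`, while `cab` yields `car, card, care, cat`. A prefix-compressed structure could do the same walk without the repeated string comparisons, and that difference is what would need to be measured.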

-- 
http://kevin.atkinson.dhs.org