Re: [aspell-devel] remove from word lists


From: Jose Da Silva
Subject: Re: [aspell-devel] remove from word lists
Date: Sat, 19 Feb 2005 12:58:19 -0800
User-agent: KMail/1.6.1

On Wednesday 16 February 2005 02:48 pm, Kevin Atkinson wrote:
> On Wed, 16 Feb 2005, Anton Leuski wrote:
> > I guess it's not possible to remove words from the personal and session
> > word lists, right? When I try the remove method on a personal word
> > list from a Speller instance it comes back with
> >
> > The method "remove" is unimplemented in "WritableDict"
> >
> > Or am I missing something? Can you give me any advice on how to implement
> > the remove method? Or (even better :-)) when is it going to be implemented
> > in the main code base?
>
> The problem is that when Aspell "saves" a personal word list it doesn't
> really "save" it. Instead it merges the in-memory word list with the one
> saved to disk.  That is, before saving it rereads the on-disk word list and
> then adds any new words found to the in-memory word list.  I do this to
> avoid the problem of multiple Aspell processes, running at the same
> time, clobbering each other's changes.  This means that deleting a word
> from an in-memory word list will have no effect if the word is also in the
> on-disk word list.  A truly correct solution to this problem will be
> rather complicated.  I am willing to accept a simpler, yet not 100%
> correct, solution but I have not got around to implementing it.
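
To make the merge-on-save behaviour described above concrete, here is a
minimal sketch in Python. It is only a model, not Aspell's code: word lists
are plain sets, and the names save_personal and save_with_removals are made
up for illustration. The second function shows one possible "simpler, not
100% correct" approach (remembering removed words and subtracting them at
save time); that idea is an assumption of mine, not something proposed in
the thread.

```python
def save_personal(in_memory, on_disk):
    """Merge-style save: reread the on-disk list and union it with memory.

    Hypothetical model of the behaviour described in the quoted text.
    """
    return set(in_memory) | set(on_disk)


def save_with_removals(in_memory, on_disk, removed):
    """Assumed workaround: subtract explicitly removed words after merging.

    Still not fully correct, since another process could add the same
    word back on its next save.
    """
    return (set(in_memory) | set(on_disk)) - set(removed)


on_disk = {"aspell", "kmail"}

# Start from the saved list, then "remove" one word and add another.
in_memory = set(on_disk)
in_memory.discard("kmail")
in_memory.add("leuski")

saved = save_personal(in_memory, on_disk)
saved_with_removals = save_with_removals(in_memory, on_disk, removed={"kmail"})
```

With the plain merge, "kmail" is back in the saved list because the on-disk
copy still contains it, which is why an in-memory delete has no effect.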

After reading through this, what's happening within Aspell makes more sense, 
but the code does look like one huge ball of tangled yarn, so it is fairly 
difficult to find a place to start without unravelling a bunch of other 
items.
Word hashing seems to have advantages, such as fewer words to search through 
and probably a smaller memory footprint, but it would seem worthwhile to 
have word "ownership" thrown into the hash so you know which word came from 
where. For example, a family of users all using the speller at the same time 
but with different requirements (e.g. German, English, French, etc.), or 
perhaps a library type of setting with multi-head terminals where one user 
decides to use some foreign language versus the other X users. Or perhaps one 
user using multiple applications, say a word processor, while at the same 
time spell-checking some stuff in another language. Or, as Anton Leuski asks, 
how can you delete a word?

Just throwing some thoughts around.... so if there are better suggestions, 
you're welcome to reply.

Perhaps adding ownership to each word would help track which word came from 
where, so you could track languages or be able to delete words, but adding 
flags for each word would probably increase the memory footprint by a 
substantial amount versus having just 1 flag to mark 1 file.
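
As a rough illustration of the per-word ownership idea, here is a small
Python sketch. Everything in it is hypothetical (the Owner flags, the
OwnedWordList class and its methods are not part of Aspell); it just shows
a hash table whose values record which list each word came from, so that a
word can be removed for one owner without affecting the others.

```python
from enum import IntFlag


class Owner(IntFlag):
    """Hypothetical ownership flags, one bit per word list."""
    MAIN = 1
    PERSONAL = 2
    SESSION = 4


class OwnedWordList:
    """Hash of word -> owner flags (illustrative only)."""

    def __init__(self):
        self._owners = {}

    def add(self, word, owner):
        # OR the new owner into whatever flags the word already has.
        self._owners[word] = self._owners.get(word, Owner(0)) | owner

    def remove(self, word, owner):
        # Clear only this owner's bit; drop the word once nobody owns it.
        remaining = self._owners.get(word, Owner(0)) & ~owner
        if remaining:
            self._owners[word] = remaining
        else:
            self._owners.pop(word, None)

    def contains(self, word):
        return word in self._owners

    def owners(self, word):
        return self._owners.get(word, Owner(0))


words = OwnedWordList()
words.add("colour", Owner.MAIN)
words.add("colour", Owner.PERSONAL)
words.remove("colour", Owner.PERSONAL)  # still owned by MAIN
```

The cost mentioned above is visible here: every entry carries its own flags
value, instead of one marker per file.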

Suppose there is a "primary" dictionary kept somewhat intact... at least you 
know who owns that. Then you hash a 2nd personal dictionary against the 
primary but keep it in its own thread (so you know "word ownership" there 
too). This way you keep the memory footprint down and reduce the word count 
too. Multiple languages could probably follow something along the same 
lines.... at least you wouldn't have to reset Aspell as "harshly" if you are 
switching languages or running several languages at the same time.
...basically, the ideas here are thinking in terms of introducing 
multi-threading.
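
A minimal Python sketch of the primary-plus-personal layering, reading
"thread" loosely as a separate layer (that reading, and the LayeredSpeller
class with its method names, are assumptions for illustration, not Aspell's
design). Ownership is implied by which layer holds the word, so no per-word
flag is needed, and the personal layer only stores words the primary lacks.

```python
class LayeredSpeller:
    """Read-only primary dictionary plus a small writable personal layer."""

    def __init__(self, primary_words):
        # The primary layer is frozen, so its ownership is never in doubt.
        self.primary = frozenset(primary_words)
        self.personal = set()

    def add_personal(self, word):
        # Hash against the primary first: skip words it already covers.
        if word not in self.primary:
            self.personal.add(word)

    def remove_personal(self, word):
        # Only the personal layer can lose words.
        self.personal.discard(word)

    def check(self, word):
        return word in self.primary or word in self.personal


speller = LayeredSpeller(["hello", "world"])
speller.add_personal("hello")   # already in primary, not stored again
speller.add_personal("aspell")  # new word, stored in the personal layer
```

Removing "aspell" later only touches the personal layer; the primary
dictionary stays intact either way.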



