[Ifile-discuss] Re: Saving ifile database source files
From: Joe Kelsey
Subject: [Ifile-discuss] Re: Saving ifile database source files
Date: Sun, 31 Aug 2003 11:06:37 -0700
On Sat, 2003-08-30 at 05:54, Clemens Fischer wrote:
> * Joe Kelsey:
>
> > Currently, I plan to delete old database files to keep the directory
> > sizes under control.
>
> you don't have to do this: ifile keeps only so many words in its
> database. for this it has a stoplist and throws out rarely used
> words. back when i used ifile for spam/non-spam classification, my
> database never grew beyond a few hundred kilobytes and i never had to
> trim it.
AFAIK, ifile keeps a single .ifile database. When I say "database
source file" I mean what you call a spam corpus. Why do I need a
complete spam corpus? I think a week or two of spam suffices to
provide 99% classification accuracy, as long as I include facilities for
moving misclassified objects back and forth.
You still have not addressed the need for a so-called "spam corpus". I
believe that these caches of spam emails serve no useful purpose that
cannot be achieved by a one- or two-week cache.
/Joe