ifile-discuss


From: Jack Bertram
Subject: Re: [Ifile-discuss] Effect of widely differing volumes on ifile classification
Date: Fri, 21 Mar 2003 14:31:53 +0000
User-agent: Mutt/1.4i

* Brett Nemeroff <address@hidden> [030320 18:43]:
> I know this is slightly off topic, but I was wondering if you had custom
> scripts you could share that would report accuracy as you have in this
> post?

I have modified Martin's scripts ifile.inject-learn.header and
ifile.relearn.message so that each one creates a new empty temporary
file, in its own directory, every time it is called.

So, for example, in ifile.inject-learn.header, the last line is

mktemp /home/jack/.ifile.stats/learned/lXXXXXX >/dev/null 2>/dev/null

If I didn't do anything else, this would give me two directories: one
containing a file for every message, and one containing a file for
every incorrectly classified message.  Running ls | wc -l on each would
therefore be enough to generate the list below on a total basis.
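
As a rough sketch of that total-basis calculation, the snippet below
counts the entries in each directory and turns the two counts into a
success percentage.  The helper names count_files and success_rate are
made up for illustration, and the directory layout is assumed to be the
one used by the script attached further down.

```shell
#!/bin/bash
# Illustrative sketch only; count_files and success_rate are hypothetical helpers.

# count_files DIR: print how many entries DIR holds (0 if it is empty or missing)
count_files() {
        local n=0 f
        for f in "$1"/*; do
                if [ -e "$f" ]; then
                        n=$((n+1))
                fi
        done
        echo "$n"
}

# success_rate LEARNED RELEARNED: print the share classified correctly,
# to two decimal places, using integer arithmetic only
success_rate() {
        local learned=$1 relearned=$2 wrong ok
        if [ "$learned" -le 0 ]; then
                echo "n/a"      # nothing received yet, so there is no rate to report
                return
        fi
        wrong=$(( relearned * 10000 / learned ))   # hundredths of a percent
        ok=$(( 10000 - wrong ))
        printf '%d.%02d%%\n' $(( ok / 100 )) $(( ok % 100 ))
}

STATS_DIR="${STATS_DIR:-$HOME/.ifile.stats}"       # assumed location
LEARNED=$(count_files "$STATS_DIR/learned")
RELEARNED=$(count_files "$STATS_DIR/relearned")
echo "Received: $LEARNED  Relearned: $RELEARNED  Success: $(success_rate "$LEARNED" "$RELEARNED")"
```

Counting with a glob rather than ls | wc -l is only a matter of taste
here, since mktemp never produces awkward file names.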

What I actually do is use a script, ifile.stats.count, which is run
nightly after refile.learn.  It counts up the messages from the
previous day and, for each directory, stores the result in a couple of
files: one keeps the current total, and one keeps a record of historic
totals.  The script emails me a short summary of success rates over the
last day and since records began.  I don't keep track of statistics for
an individual folder, although I could by counting the number of
X-Ifile-Learned-To headers and subtracting from the number of emails.
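
A per-folder count along those lines might look like the sketch below.
It rests on two assumptions that are not spelled out above: that the
folder is a directory holding one message per file (MH style), and that
a relearned message carries an X-Ifile-Learned-To header.  The function
name folder_stats is hypothetical.

```shell
#!/bin/bash
# Illustrative sketch only; folder_stats is a hypothetical helper.

# folder_stats DIR: print "total relearned correct" for a one-file-per-message folder
folder_stats() {
        local dir=$1 total=0 relearned=0 f
        for f in "$dir"/*; do
                [ -f "$f" ] || continue
                total=$((total+1))
                # Inspect the header block only (everything up to the first blank line),
                # so that a header quoted in a message body is not counted
                if sed '/^$/q' "$f" | grep -qi '^X-Ifile-Learned-To:'; then
                        relearned=$((relearned+1))
                fi
        done
        echo "$total $relearned $((total-relearned))"
}

# Example (hypothetical path): folder_stats "$HOME/Mail/work"
```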

I attach the script below for your reference.

The output in my previous email came from a Perl one-liner which took
the last 30 days of entries from each of the historic record files and
processed them.

jack

---- historic record file: example for "relearned" mail ----

Wed Mar 12 06:00:01 GMT 2003 : 837
Thu Mar 13 06:00:01 GMT 2003 : 841
Fri Mar 14 06:00:00 GMT 2003 : 841
Sat Mar 15 06:00:00 GMT 2003 : 848
Sun Mar 16 06:00:00 GMT 2003 : 850
Mon Mar 17 06:00:01 GMT 2003 : 851
Tue Mar 18 06:00:00 GMT 2003 : 855
Wed Mar 19 06:00:01 GMT 2003 : 858
Thu Mar 20 06:00:00 GMT 2003 : 862
Fri Mar 21 06:00:00 GMT 2003 : 863
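
The Perl one-liner itself is not reproduced in this message.  As a
rough shell equivalent, assuming both historic files use the
"date : total" format shown above, the success rate over a recent
window can be derived from how much each running total grew.  The
function names period_delta and period_success are made up for
illustration.

```shell
#!/bin/bash
# Illustrative sketch only; period_delta and period_success are hypothetical helpers.

# period_delta FILE N: growth of the running total across the last N records
period_delta() {
        tail -n "$2" "$1" |
                awk 'NR==1 {first=$NF} {last=$NF} END {print last-first}'
}

# period_success L_ALL R_ALL N: success percentage over the last N records
period_success() {
        local learned relearned
        learned=$(period_delta "$1" "$3")
        relearned=$(period_delta "$2" "$3")
        if [ "$learned" -gt 0 ]; then
                awk -v l="$learned" -v r="$relearned" \
                        'BEGIN {printf "%.2f%%\n", 100-(100*r/l)}'
        else
                echo "n/a"      # no mail in the window
        fi
}

# Example: period_success "$HOME/.ifile.stats/l_all" "$HOME/.ifile.stats/r_all" 30
```

Note that N records span N-1 nightly intervals, because the first
record in the window only serves as the baseline.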


---- ifile.stats.count ----

#!/bin/bash

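# Timestamp used in the history files and in the report
DATE=`date`

# One marker directory and two counter files for each kind of event
STATS_DIR="$HOME/.ifile.stats"
LEARN_DIR="$STATS_DIR/learned"          # one file per message received
RELEARN_DIR="$STATS_DIR/relearned"      # one file per misclassified message
L_CURRENT_FILE="$STATS_DIR/l_current"   # running totals
R_CURRENT_FILE="$STATS_DIR/r_current"
L_ALL_FILE="$STATS_DIR/l_all"           # dated history of the totals
R_ALL_FILE="$STATS_DIR/r_all"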


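# Running totals so far; fall back to 0 if a file does not exist yet
L_CURRENT=`cat "$L_CURRENT_FILE" 2>/dev/null || echo 0`
R_CURRENT=`cat "$R_CURRENT_FILE" 2>/dev/null || echo 0`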

LEARN=0
RELEARN=0

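# Count the marker files created since the last run, removing each one as it is counted
for file in "$LEARN_DIR"/*
do
        [ -e "$file" ] || continue      # directory is empty
        LEARN=$(($LEARN+1))
        rm "$file"
done

for file in "$RELEARN_DIR"/*
do
        [ -e "$file" ] || continue      # directory is empty
        RELEARN=$(($RELEARN+1))
        rm "$file"
done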

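# Guard against division by zero on a day with no mail
if [ "$LEARN" -gt 0 ]; then
        SUCCESS_TODAY=`echo "scale=2; 100-(100*$RELEARN/$LEARN)" | bc -l`
else
        SUCCESS_TODAY="n/a"
fi

L_CURRENT=$(($L_CURRENT+$LEARN))
R_CURRENT=$(($R_CURRENT+$RELEARN))
if [ "$L_CURRENT" -gt 0 ]; then
        SUCCESS_CURRENT=`echo "scale=2; 100-(100*$R_CURRENT/$L_CURRENT)" | bc -l`
else
        SUCCESS_CURRENT="n/a"
fi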


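# Pass values as arguments rather than as the printf format string
printf '%s' "$L_CURRENT" > "$L_CURRENT_FILE"
printf '%s' "$R_CURRENT" > "$R_CURRENT_FILE"

printf '%s : %s\n' "$DATE" "$L_CURRENT" >> "$L_ALL_FILE"
printf '%s : %s\n' "$DATE" "$R_CURRENT" >> "$R_ALL_FILE"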

printf "Ifile statistics: $DATE\n"
printf "~~~~~~~~~~~~~~~~~\n"
printf "\n"
printf "Last day:"
printf          "     Received:   $LEARN\n"
printf "              Relearned:  $RELEARN\n"
printf "              Success:    $SUCCESS_TODAY%%\n"
printf "\n"
printf "Total:"
printf       "        Received:   $L_CURRENT\n"
printf "              Relearned:  $R_CURRENT\n"
printf "              Success:    $SUCCESS_CURRENT%%\n"





