ifile-discuss

Re: [Ifile-discuss] Adding a "plugin" parser to ifile


From: Booker Bense
Subject: Re: [Ifile-discuss] Adding a "plugin" parser to ifile
Date: Thu, 06 Mar 2003 15:08:44 -0800 (PST)

On Thu, 6 Mar 2003, Booker Bense wrote:

> On Thu, 6 Mar 2003, Jason Rennie wrote:
>
> >
> > address@hidden said:
> > > - I've been messing about with various ifile options "-h -w -k"[1] and
> > > I'm coming to the conclusion that it would be quite useful to
> > > customize the parser for ifile. While you can do this in front of
> > > ifile, I think as a first pass it would be convenient to provide a
> > > "subprocess parser".
> >
> > Let me propose something here.  How about we break ifile into two parts:
> > the parsing/lexing part and the classifier.  To run ifile, you'd give the
> > message to the first program and pipe the output to the second program.
> > The first program could output one token per line.  Then the lexing part
> > of the classifier would be exceedingly simple.  Breaking it apart like
> > this would make it easier for people to experiment with different parsing/
> > lexing styles; it would even make it possible to write prototype parsers/
> > lexers in perl or some language that's easier to write.
> >
> > Thoughts?
> >
>
> - I think that would be great! After looking at the lexing code
> a bit I realize it's not quite as simple as I thought to plug
> in the kind of lexer/parser I'm thinking about. If there's
> anything I can do to help let me know.
>
> - It seems to me that you could turn the current ifile into the
> second program quite trivially, and the first program would
> essentially be ifile_email_parser ? Perhaps a good first step
> in this direction would be just adding a "raw" parse mode to
> the current ifile.
>
> ifile -r
>
> ifile expects the tokens to appear one per line and does no
> special processing. Should be used with some preprocessor.
>
> Actually, this could already be done with the -w option. Hmm,
> this has given me some food for thought.
>

- I've been thinking some more and it seems clearer to me that
you probably want to index everything, but you probably don't
want to query on everything. Take the example of email from
my wife. I keep it in a folder with other email from non-work
friends, so clearly I want to index all the words in those
messages in case some joke comes in from somebody I haven't
heard from in a while. However, when a message comes in from my
wife, the only text I want to query against is my wife's
address. In fact I can get the exact behaviour I want by putting
a preprocessor in front of the current ifile.

echo "address@hidden" | ifile -w -q

does exactly what I want. In fact it can easily duplicate all the
current prefiltering I do before I resort to ifile.


This leads me to think of something like the following:

Run ifile -w -q on the From address.

Run ifile -w -q on the header.

Run ifile -w -q on the body.
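
As a rough sketch of those three steps, a small shell preprocessor
could split a message file and pipe each piece into ifile in turn.
The helper names (from_address, header_text, body_text, query_parts)
are made up for illustration, the From: parsing is deliberately naive
(single-line header, optional <...> around the address), and the only
ifile flags used are the -w -q already shown above.

```shell
#!/bin/sh
# Hypothetical helpers; none of these names come from ifile itself.

# Print the address from the first From: header line.
# Assumption: the header is not folded across several lines.
from_address() {
    awk '/^$/ { exit }
         tolower($0) ~ /^from:/ {
             sub(/^[^:]*:[ \t]*/, "")
             if (match($0, /<[^>]*>/))
                 $0 = substr($0, RSTART + 1, RLENGTH - 2)
             print
             exit
         }' "$1"
}

# Print every header line, i.e. everything before the first blank line.
header_text() {
    awk '/^$/ { exit } { print }' "$1"
}

# Print the body, i.e. everything after the first blank line.
body_text() {
    awk 'found { print } /^$/ { found = 1 }' "$1"
}

# Run one query per piece, most specific first.
query_parts() {
    from_address "$1" | ifile -w -q
    header_text  "$1" | ifile -w -q
    body_text    "$1" | ifile -w -q
}
```

Calling query_parts on a saved message file would give three
separate answers. How to combine them (for example, trusting the
From address result first and falling back to the others) is left
open here.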

Maybe this is obvious to everybody, but until today it hadn't
sunk in that I could feed any old text I wanted to ifile. It
doesn't have to look anything like an email message. Ifile
is a general purpose text sorting tool with a few tweaks for
email.

- Booker C. Bense



