bug-gnu-utils
Re: grep as a library?


From: Bruno Haible
Subject: Re: grep as a library?
Date: Wed, 15 Jan 2003 15:14:17 +0100 (CET)

Stepan Kasal writes:

> I gather you read the same file multiple times, perhaps with various
> regexps.

Not exactly. I need to run 200 small pieces of text, each available as
a single string in memory or as a list of lines in memory, through
grep, each time with the same command-line options (in the worst case,
multiple basic regexps, multiple extended regexps, and multiple fixed
strings).
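
To make the use case concrete, here is a minimal sketch of "compile the
patterns once, then match many in-memory lines". It assumes POSIX
<regex.h> as a stand-in for grep's own matcher; the matcher_* names and
the MAX_PATTERNS limit are hypothetical, and fixed strings are not
handled here (strstr() would be one way to cover them).

```c
#include <assert.h>
#include <regex.h>
#include <stddef.h>

/* Hypothetical limit, chosen only to keep the sketch free of malloc(). */
enum { MAX_PATTERNS = 8 };

/* Hypothetical container: patterns compiled once and reused for every text. */
struct matcher {
    regex_t compiled[MAX_PATTERNS];
    size_t count;
};

/* Release every pattern that was compiled successfully. */
void matcher_free(struct matcher *m)
{
    for (size_t i = 0; i < m->count; i++)
        regfree(&m->compiled[i]);
    m->count = 0;
}

/* Compile all patterns up front.  cflags is 0 for basic regexps or
   REG_EXTENDED for extended ones.  Returns 0 on success, -1 on failure. */
int matcher_init(struct matcher *m, const char *const *patterns, size_t n,
                 int cflags)
{
    assert(m != NULL && patterns != NULL);
    m->count = 0;
    if (n > MAX_PATTERNS)
        return -1;
    for (size_t i = 0; i < n; i++) {
        if (regcomp(&m->compiled[i], patterns[i], cflags | REG_NOSUB) != 0) {
            matcher_free(m);
            return -1;
        }
        m->count++;
    }
    return 0;
}

/* Return 1 if any of the compiled patterns matches the line, else 0. */
int matcher_line_matches(const struct matcher *m, const char *line)
{
    for (size_t i = 0; i < m->count; i++)
        if (regexec(&m->compiled[i], line, 0, NULL, 0) == 0)
            return 1;
    return 0;
}

/* Count the matching lines in an in-memory list of lines. */
size_t matcher_count(const struct matcher *m, const char *const *lines,
                     size_t n)
{
    size_t hits = 0;
    for (size_t i = 0; i < n; i++)
        hits += (size_t) matcher_line_matches(m, lines[i]);
    return hits;
}
```

The point of the split is that matcher_init() runs once and
matcher_count() runs 200 times, so the pattern compilation cost is not
paid again for each piece of text and no process is spawned.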

> Couldn't a neat sed script do this best?

Then I would have to translate the 'grep' arguments to a sed script -
probably doable - and call 'sed' 200 times - which will not be much
faster than calling 'grep' 200 times.

> Back to grep as shared lib:
> well, what does grep consist of?
> 
> 1) open file (perhaps go recursively through a directory tree)
> 2) get the regexp(s)
> 3) optimize the search if there are suitable substrings
>       (refer to grep/src/kwset.c)
> 4) search the file
> 5) present output (possibly with context lines)
> 
> What do you want to reuse?

I want to reuse 3) and 4), using input from memory instead of a file.
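
A rough illustration of how steps 3) and 4) fit together on in-memory
input: reject a line cheaply when a required literal substring is
absent, and run the regexp only on the survivors. This is a sketch
under assumptions, not grep's code: strstr() is swapped in for the
kwset search, POSIX <regex.h> for the regex/dfa matcher, the caller
supplies the required literal by hand, and the prefiltered_* names are
hypothetical.

```c
#include <assert.h>
#include <regex.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical pairing of a compiled regexp with a literal substring
   that every matching line must contain (NULL if there is none). */
struct prefiltered {
    regex_t re;
    const char *must;
};

/* Compile the pattern once and remember the caller-supplied literal.
   Returns 0 on success, -1 if the pattern does not compile. */
int prefiltered_init(struct prefiltered *p, const char *pattern,
                     const char *must, int cflags)
{
    assert(p != NULL && pattern != NULL);
    p->must = must;
    return regcomp(&p->re, pattern, cflags | REG_NOSUB) == 0 ? 0 : -1;
}

/* Step 3: cheap substring test.  Step 4: full regexp search, only for
   lines that passed the substring test.  Returns 1 on a match, else 0. */
int prefiltered_match(const struct prefiltered *p, const char *line)
{
    if (p->must != NULL && strstr(line, p->must) == NULL)
        return 0;
    return regexec(&p->re, line, 0, NULL, 0) == 0;
}

/* Release the compiled regexp. */
void prefiltered_free(struct prefiltered *p)
{
    regfree(&p->re);
}
```

The literal passed as must has to be a genuine necessary substring of
every match; otherwise the shortcut in prefiltered_match() would reject
lines that the regexp alone would accept.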

> ad 3): is the kwset algorithm appropriate here?  If yes, wouldn't it be
> better to make it part of the regex library (the regex.c from glibc
> or pcre)?

I mean to have the whole regex + dfa + kwset thing in one library.

Bruno



