Re: grep as a library?
From: Bruno Haible
Subject: Re: grep as a library?
Date: Wed, 15 Jan 2003 15:14:17 +0100 (CET)
Stepan Kasal writes:
> I gather you read the same file multiple times, perhaps with various
> regexps.
Not exactly. I need to run 200 small pieces of text through grep. Each
piece is available as a single string in memory or as a list of lines
in memory, and each run uses the same command-line options (in the
worst case, multiple basic regexps, multiple extended regexps, and
multiple fixed strings).
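To make the workload concrete, here is a minimal sketch of "compile the
patterns once, then match many in-memory strings". It is an assumption
for illustration only: it uses the POSIX <regex.h> calls (regcomp,
regexec, regfree) plus strstr for fixed strings, not grep's own
dfa/kwset code, and the names pat_kind, struct pattern, patterns_compile,
patterns_match and patterns_free are hypothetical.

```c
#include <regex.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical pattern kinds, loosely mirroring grep's -G, -E and -F. */
enum pat_kind { PAT_BASIC, PAT_EXTENDED, PAT_FIXED };

struct pattern {
    enum pat_kind kind;
    const char *text;  /* pattern source; used directly for PAT_FIXED */
    regex_t re;        /* compiled regex; valid only if 'compiled' is true */
    bool compiled;
};

/* Compile every regex pattern once.
   Returns 0 on success, otherwise the error code from regcomp. */
int patterns_compile(struct pattern *pats, size_t n)
{
    for (size_t i = 0; i < n; i++)
        pats[i].compiled = false;

    for (size_t i = 0; i < n; i++) {
        if (pats[i].kind == PAT_FIXED)
            continue;  /* fixed strings need no compilation here */
        /* REG_NEWLINE makes ^ and $ match at line boundaries, so a
           multi-line string is treated line by line. */
        int flags = REG_NOSUB | REG_NEWLINE;
        if (pats[i].kind == PAT_EXTENDED)
            flags |= REG_EXTENDED;
        int rc = regcomp(&pats[i].re, pats[i].text, flags);
        if (rc != 0)
            return rc;
        pats[i].compiled = true;
    }
    return 0;
}

/* True if any pattern matches somewhere in the NUL-terminated text. */
bool patterns_match(const struct pattern *pats, size_t n, const char *text)
{
    for (size_t i = 0; i < n; i++) {
        if (pats[i].kind == PAT_FIXED) {
            if (strstr(text, pats[i].text) != NULL)
                return true;
        } else if (pats[i].compiled &&
                   regexec(&pats[i].re, text, 0, NULL, 0) == 0) {
            return true;
        }
    }
    return false;
}

/* Release whatever patterns_compile managed to compile. */
void patterns_free(struct pattern *pats, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        if (pats[i].compiled) {
            regfree(&pats[i].re);
            pats[i].compiled = false;
        }
    }
}
```

With this shape, patterns_compile runs once and patterns_match runs 200
times, so the compilation cost is not paid again for each piece of text.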
> Couldn't a neat sed script do this best?
Then I would have to translate the 'grep' arguments into a sed script
(probably doable) and call 'sed' 200 times, which would not be much
faster than calling 'grep' 200 times.
> Back to grep as shared lib:
> well, what does grep consist of?
>
> 1) open file (perhaps go recursively through a directory tree)
> 2) get the regexp(s)
> 3) optimize the search if there are suitable substrings
> (refer to grep/src/kwset.c)
> 4) search the file
> 5) present output (possibly with context lines)
>
> What do you want to reuse?
I want to reuse 3) and 4), using input from memory instead of a file.
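As a rough illustration of step 4) working on memory instead of a file,
the sketch below searches a list of lines that the caller already holds
and reports the indices of the matching ones. It is only an assumption
of what such an entry point could look like: it relies on POSIX regexec
rather than grep's internal search code, and the function name
grep_lines is hypothetical.

```c
#include <regex.h>
#include <stddef.h>

/* Search an in-memory list of NUL-terminated lines with an already
   compiled regex. The index of each matching line is stored in 'out',
   which must have room for 'nlines' entries.
   Returns the number of matching lines. */
size_t grep_lines(const regex_t *re, const char *const *lines,
                  size_t nlines, size_t *out)
{
    size_t found = 0;
    for (size_t i = 0; i < nlines; i++) {
        /* No sub-match positions are requested, only match / no match. */
        if (regexec(re, lines[i], 0, NULL, 0) == 0)
            out[found++] = i;
    }
    return found;
}
```

The caller compiles the regex once with regcomp, calls grep_lines for
each of the pieces of text, and decides for itself what to do with the
matching indices, so steps 1) and 5) stay outside the library.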
> ad 3): is the kwset algorithm appropriate here? If yes, wouldn't it be
> better to make it part of the regex library (the regex.c from glibc
> or pcre)?
I mean having the whole regex + dfa + kwset machinery in one library.
Bruno