bug-gnu-utils

Re: grep as a library?


From: Stepan Kasal
Subject: Re: grep as a library?
Date: Wed, 15 Jan 2003 07:42:50 +0100
User-agent: Mutt/1.2.5.1i

Hello,

On Mon, Jan 13, 2003 at 01:48:20PM +0100, Bruno Haible wrote:
> It would be useful to have the main functionality of the "grep" program
> in a GPLed library. The GNU gettext program "msggrep" needs to call grep
> hundreds of times for a given 100 KB input file, and users complain that
> it is slow. It would be faster if msggrep could use a library. It is

I gather you read the same file multiple times, perhaps with various
regexps.  Couldn't a neat sed script do this best?

Or, more generally, couldn't you prepare a self-contained example (I'm
not familiar with gettext and making translations) containing a shell
prototype of msggrep, put it somewhere, and then post a call for help:
"Who can improve this, either by pure sh methods or by re-using some
code from grep and/or the regexp libs?"

Back to grep as a shared lib:
well, what does grep consist of?

1) open file (perhaps go recursively through a directory tree)
2) get the regexp(s)
3) optimize the search if there are suitable substrings
        (refer to grep/src/kwset.c)
4) search the file
5) present output (possibly with context lines)

What do you want to reuse?  I gather you read the same file multiple
times, so there is no need to re-open it.
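
To make the "no re-open" point concrete, here is a minimal sketch of
steps 2 and 4 only, assuming the file has already been read into one
writable, NUL-terminated buffer.  It uses the POSIX regcomp()/regexec()
interface; the function names count_matching_lines() and search_all()
are hypothetical and do not come from grep or gettext.

```c
#include <assert.h>
#include <regex.h>
#include <stddef.h>
#include <string.h>

/* Count the lines of a writable, NUL-terminated buffer that match a
   precompiled regex.  The buffer is restored before returning. */
static size_t count_matching_lines(const regex_t *re, char *buf)
{
    assert(re != NULL && buf != NULL);
    size_t hits = 0;
    char *line = buf;

    while (*line != '\0') {
        char *eol = strchr(line, '\n');
        if (eol != NULL)
            *eol = '\0';        /* temporarily terminate this line */
        if (regexec(re, line, 0, NULL, 0) == 0)
            hits++;
        if (eol == NULL)
            break;              /* last line had no trailing newline */
        *eol = '\n';            /* put the newline back */
        line = eol + 1;
    }
    return hits;
}

/* Compile each pattern once and run it over the same in-memory buffer.
   Returns 0 on success, -1 if a pattern fails to compile. */
static int search_all(const char *const *patterns, size_t n,
                      char *buf, size_t *counts)
{
    for (size_t i = 0; i < n; i++) {
        regex_t re;
        if (regcomp(&re, patterns[i], REG_EXTENDED | REG_NOSUB) != 0)
            return -1;
        counts[i] = count_matching_lines(&re, buf);
        regfree(&re);
    }
    return 0;
}
```

The file is read once and only the regexps change between passes, so
the cost of hundreds of fork/exec calls disappears.  Whether that is
where msggrep actually spends its time is something I have not measured.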

ad 3): is the kwset algorithm appropriate here?  If yes, wouldn't it be
better to make it part of the regex library (the regex.c from glibc
or pcre)?  A minimalist version would require the caller to ask for
kwset explicitly.  But it may be possible to define heuristics that
decide well enough, in most cases, whether kwset should be used.
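
One very crude heuristic of that kind, as a sketch only: if the pattern
contains no regex metacharacters, treat it as a fixed string and skip
the regex engine entirely.  Note that strstr() stands in for kwset
here; this is not the kwset algorithm, and is_literal_pattern() and
text_matches() are hypothetical names, not part of grep or of any regex
library.

```c
#include <assert.h>
#include <regex.h>
#include <string.h>

/* Return 1 if the pattern contains no extended-regex metacharacters,
   so a plain substring search gives the same answer as the regex. */
static int is_literal_pattern(const char *pat)
{
    assert(pat != NULL);
    return pat[0] != '\0' && strpbrk(pat, ".[]()*+?{}|^$\\") == NULL;
}

/* Return 1 on match, 0 on no match, -1 if the pattern does not compile.
   Literal patterns take the fast path (strstr as a stand-in for kwset). */
static int text_matches(const char *pat, const char *text)
{
    if (is_literal_pattern(pat))
        return strstr(text, pat) != NULL;

    regex_t re;
    if (regcomp(&re, pat, REG_EXTENDED | REG_NOSUB) != 0)
        return -1;
    int rc = regexec(&re, text, 0, NULL, 0);
    regfree(&re);
    return rc == 0;
}
```

A real heuristic would also have to extract required substrings from
patterns that do contain metacharacters, which is the harder part.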

BTW: I know that the current regex.c is slow, but I'm still not sure it
cannot be improved.  And it seems that its semantics are very good,
which is more important, isn't it?
(Or you can use Tom Lord's hackerlib, if you want.)

ad 5): it's not clear whether you want to present the results yourself
or whether you want the grep lib to print them to stdout.
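
One way to leave that choice to the caller, sketched under the
assumption of a purely hypothetical interface (grep_buffer() and the
match_cb type are invented for illustration; no such grep library
exists): the library hands each matching line to a callback, and the
caller decides whether to print it, count it, or do something else.

```c
#include <assert.h>
#include <regex.h>
#include <stdio.h>
#include <string.h>

/* Called once per matching line; a nonzero return stops the search. */
typedef int (*match_cb)(const char *line, size_t len, void *ctx);

/* Run a precompiled regex over each line of a NUL-terminated buffer. */
static int grep_buffer(const regex_t *re, const char *buf,
                       match_cb cb, void *ctx)
{
    assert(re != NULL && buf != NULL && cb != NULL);
    char line[1024];            /* longer lines are truncated in this sketch */

    while (*buf != '\0') {
        size_t len = strcspn(buf, "\n");
        size_t n = len < sizeof line - 1 ? len : sizeof line - 1;
        memcpy(line, buf, n);
        line[n] = '\0';
        if (regexec(re, line, 0, NULL, 0) == 0) {
            int rc = cb(line, n, ctx);
            if (rc != 0)
                return rc;
        }
        buf += len;
        if (*buf == '\n')
            buf++;
    }
    return 0;
}

/* Caller that wants grep-like output: ctx is a FILE *. */
static int print_line(const char *line, size_t len, void *ctx)
{
    fprintf((FILE *)ctx, "%.*s\n", (int)len, line);
    return 0;
}

/* Caller that only wants a count: ctx is a size_t *. */
static int count_line(const char *line, size_t len, void *ctx)
{
    (void)line;
    (void)len;
    ++*(size_t *)ctx;
    return 0;
}
```

Passing print_line with stdout as ctx gives the "lib prints to stdout"
behaviour; passing count_line keeps the results inside the caller.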

Regards,
        Stepan Kasal



