Re: Stupid module and pregexp questions
From: Tom Lord
Subject: Re: Stupid module and pregexp questions
Date: Mon, 5 May 2003 13:19:55 -0700 (PDT)
> Thanks again for the very interesting comments.
Hope they're useful. Along those lines:
> With respect (ice-9 regex), I'm inclined to agree with you.
> If we can include a good POSIX implementation, then that
> should fix the problems I've been asking about.
State of the world, as I know it:
* GNU regex
Most versions are buggy. RMS did some work on one fork (perhaps just
the one in emacs) and I'm pretty sure he fixed all the Posix bugs.
So, ask in that direction.
Advantages: simple, small, LGPL
Disadvantages: slow on expressions that cause backtracking
Unknowns: Posix compliant? (guess bias towards "yes" for the latest
from RMS)
* Henry Spencer's
I haven't seen any more recent release than the one included in
Tcl.
Advantages: fast, Berkeley-ish license (GPL compatible), Unicode support
Disadvantages: big and complicated. Some Posix bugs (at least
ca. mid-2002)
Unknowns: still maintained?
* Isamu Hasegawa's (Latest glibc?)
Advantages: smart implementor, DFAish, in glibc (so perhaps gets
beaten upon), LGPL
Disadvantages: odd space requirements, big and complicated
Unknowns: Posix conformance status (guess bias: "good") and
performance (guess bias: "good for short strings")
* Tom Lord's (latest arch, src/hackerlab)
Advantages: DFAish, fast, good correctness tests, Unicode in
low-level engine (but not (yet) via Posix entry points), good growth path,
a basis for "what should a regexp srfi do".
Disadvantages: big and complicated, GPL (probably flexible on that),
regcomp is mildly slow compared to GNU regex (but regexec is fast), can
be a fickle beast to tune (but conversely: flexibly tunable).
Unknowns: maintained? (cf. my so-called life :-)
* Others
Don't bother, imho.
It's mostly the "big and complicated" on all but one of those that
makes me suggest bundling a good fork of GNU regex, if you can get
one.
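For a quick Posix sanity check on whichever library gets bundled, here is
a minimal sketch using only the standard <regex.h> entry points (regcomp,
regexec, regfree). The helper name posix_match_span is made up for
illustration and is not part of any of the libraries above.

```c
#include <regex.h>

/* Hypothetical helper (an illustration, not from any library listed
   above): compile `pattern` as a Posix extended regexp, run it against
   `text`, and store the offsets of the overall match in *start and *end.
   Returns 0 on a match, REG_NOMATCH if nothing matched, or the nonzero
   regcomp error code if the pattern did not compile. */
int posix_match_span(const char *pattern, const char *text,
                     int *start, int *end)
{
    regex_t re;
    regmatch_t m[1];

    int rc = regcomp(&re, pattern, REG_EXTENDED);
    if (rc != 0)
        return rc;              /* compile failed; nothing to free */

    rc = regexec(&re, text, 1, m, 0);
    if (rc == 0) {
        *start = (int)m[0].rm_so;   /* offset of first matched char */
        *end   = (int)m[0].rm_eo;   /* offset just past the match   */
    }

    regfree(&re);
    return rc;
}
```

Posix asks for the leftmost-longest match, so for the pattern "a|ab"
against "xab" a conforming regexec should report offsets 1 and 3. A
Perl-style backtracking matcher would take the first alternative and
stop at offset 2, which makes this a cheap first test of a candidate.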
-t