[PATCH] regexp documentation additional examples (scheme-data.texi)
From: Ian Sheldon
Subject: [PATCH] regexp documentation additional examples (scheme-data.texi)
Date: Sun, 22 Sep 2002 22:09:10 +0100
User-agent: Gnus/5.090005 (Oort Gnus v0.05) XEmacs/21.4 (Common Lisp, i386-mandrake-linux)
Below is a patch that adds a few examples of using the regular
expression functions. I hope it is useful.
(Guile contribution papers are already signed.)
Ian.
* slib.texi: Fixed double `the' in sentence.
* scheme-data.texi: Added some examples for regular expression usage.
Index: slib.texi
===================================================================
RCS file: /cvsroot/guile/guile/guile-core/doc/ref/slib.texi,v
retrieving revision 1.3
diff -u -r1.3 slib.texi
--- slib.texi 8 Jan 2002 08:29:00 -0000 1.3
+++ slib.texi 22 Sep 2002 20:55:07 -0000
@@ -2,7 +2,7 @@
@node SLIB
@chapter SLIB
-Before the the SLIB facilities can be used, the following Scheme
+Before the SLIB facilities can be used, the following Scheme
expression must be executed:
@smalllisp
Index: scheme-data.texi
===================================================================
RCS file: /cvsroot/guile/guile/guile-core/doc/ref/scheme-data.texi,v
retrieving revision 1.22
diff -u -r1.22 scheme-data.texi
--- scheme-data.texi 16 Sep 2002 20:01:34 -0000 1.22
+++ scheme-data.texi 22 Sep 2002 20:55:22 -0000
@@ -1901,6 +1901,11 @@
@deffnx {C Function} scm_string_append (args)
Return a newly allocated string whose characters form the
concatenation of the given strings, @var{args}.
+@lisp
+(define h "hello ")
+(string-append h "world")
+@result{} "hello world"
+@end lisp
@end deffn
@@ -1947,6 +1952,13 @@
implemented by SCSH, the Scheme Shell. It is intended to be
upwardly compatible with SCSH regular expressions.
+Before the regular expression facilities can be used in a script,
+the following expression must be executed:
+
+@lisp
+(use-modules (ice-9 regex))
+@end lisp
+
@c begin (scm-doc-string "regex.scm" "string-match")
@deffn {Scheme Procedure} string-match pattern str [start]
Compile the string @var{pattern} into a regular expression and compare
@@ -1959,6 +1971,18 @@
@var{pattern} at all, @code{string-match} returns @code{#f}.
@end deffn
+Two examples of a match are given below.
+The first example matches the four digits in the string,
+and the second example matches nothing.
+
+@lisp
+(string-match "[0-9][0-9][0-9][0-9]" "blah2002")
+@result{} #("blah2002" (4 . 8))
+
+(string-match "[A-Za-z]" "123456")
+@result{} #f
+@end lisp
+
Each time @code{string-match} is called, it must compile its
@var{pattern} argument into a regular expression structure. This
operation is expensive, which makes @code{string-match} inefficient if
@@ -2030,6 +2054,23 @@
@end table
@end deffn
+@lisp
+;; Regexp to match uppercase letters
+(define r (make-regexp "[A-Z]*"))
+
+;; Regexp to match letters, ignoring case
+(define ri (make-regexp "[A-Z]*" regexp/icase))
+
+;; Search for bob using regexp r
+(match:substring (regexp-exec r "bob"))
+@result{} "" (empty match: no uppercase letters)
+
+;; Search for Bob using regexp ri
+(match:substring (regexp-exec ri "Bob"))
+@result{} "Bob" (matched case-insensitively)
+@end lisp
+
+
@deffn {Scheme Procedure} regexp? obj
@deffnx {C Function} scm_regexp_p (obj)
Return @code{#t} if @var{obj} is a compiled regular expression,
@@ -2061,9 +2102,25 @@
the regexp match is written.
@end itemize
-@var{port} may be @code{#f}, in which case nothing is written; instead,
+The @var{port} argument may be @code{#f}, in which case
+nothing is written; instead,
@code{regexp-substitute} constructs a string from the specified
@var{item}s and returns that.
+
+The following example takes a regular expression
+that matches a standard YYYYMMDD date such as 20020828.
+The call to @code{regexp-substitute} then builds a string from the
+match structure, reordering the date fields as MM-DD-YYYY and
+appending the original date in parentheses.
+
+@lisp
+(define datere "([0-9][0-9][0-9][0-9])([0-9][0-9])([0-9][0-9])")
+(define s "Date 20020429 12am.")
+(define sm (string-match datere s))
+(regexp-substitute #f sm 'pre 2 "-" 3 "-" 1 'post " (" 0 ")")
+@result{} "Date 04-29-2002 12am. (20020429)"
+@end lisp
+
@end deffn
@c begin (scm-doc-string "regex.scm" "regexp-substitute")
@@ -2090,6 +2147,17 @@
present among the @var{item}s, then @code{regexp-substitute/global} will
return after processing a single match.
@end itemize
+
+The example above for @code{regexp-substitute} could therefore be
+rewritten without the separate @code{string-match} step.
+
+@lisp
+(define datere "([0-9][0-9][0-9][0-9])([0-9][0-9])([0-9][0-9])")
+(define s "Date 20020429 12am.")
+(regexp-substitute/global #f datere s
+ 'pre 2 "-" 3 "-" 1 'post " (" 0 ")")
+@result{} "Date 04-29-2002 12am. (20020429)"
+@end lisp
@end deffn
@node Match Structures
@@ -2124,26 +2192,68 @@
@var{n}. Submatch 0 (the default) represents the entire regexp match.
If the regular expression as a whole matched, but the subexpression
number @var{n} did not match, return @code{#f}.
+
+@lisp
+(define s (string-match "[0-9][0-9][0-9][0-9]" "blah2002foo"))
+(match:substring s)
+@result{} "2002"
+
+;; match starting at offset 6 in the string
+(match:substring
+ (string-match "[0-9][0-9][0-9][0-9]" "blah987654" 6))
+@result{} "7654"
+@end lisp
+
@end deffn
@c begin (scm-doc-string "regex.scm" "match:start")
@deffn {Scheme Procedure} match:start match [n]
Return the starting position of submatch number @var{n}.
+
+In the following example, the result is four since the
+match started at character index four.
+
+@lisp
+(define s (string-match "[0-9][0-9][0-9][0-9]" "blah2002foo"))
+(match:start s)
+@result{} 4
+@end lisp
@end deffn
@c begin (scm-doc-string "regex.scm" "match:end")
@deffn {Scheme Procedure} match:end match [n]
Return the ending position of submatch number @var{n}.
+
+In the following example, the result is eight since the match
+runs from index four up to, but not including, index eight (the 2002).
+
+@lisp
+(define s (string-match "[0-9][0-9][0-9][0-9]" "blah2002foo"))
+(match:end s)
+@result{} 8
+@end lisp
@end deffn
@c begin (scm-doc-string "regex.scm" "match:prefix")
@deffn {Scheme Procedure} match:prefix match
Return the unmatched portion of @var{target} preceding the regexp match.
+
+@lisp
+(define s (string-match "[0-9][0-9][0-9][0-9]" "blah2002foo"))
+(match:prefix s)
+@result{} "blah"
+@end lisp
@end deffn
@c begin (scm-doc-string "regex.scm" "match:suffix")
@deffn {Scheme Procedure} match:suffix match
Return the unmatched portion of @var{target} following the regexp match.
+
+@lisp
+(define s (string-match "[0-9][0-9][0-9][0-9]" "blah2002foo"))
+(match:suffix s)
+@result{} "foo"
+@end lisp
@end deffn
@c begin (scm-doc-string "regex.scm" "match:count")
@@ -2156,6 +2266,12 @@
@c begin (scm-doc-string "regex.scm" "match:string")
@deffn {Scheme Procedure} match:string match
Return the original @var{target} string.
+
+@lisp
+(define s (string-match "[0-9][0-9][0-9][0-9]" "blah2002foo"))
+(match:string s)
+@result{} "blah2002foo"
+@end lisp
@end deffn
@node Backslash Escapes
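
The examples added by the patch above can be checked together in one
short script. This is only a minimal sketch, assuming a Guile in which
the (ice-9 regex) module is available; the `check' helper and the
variable names `m' and `datere' are chosen here for illustration, and
`check' is not part of Guile.

```scheme
;; Sketch: run the documented examples and compare against the
;; values shown in the patch.  `check' is a hypothetical helper.
(use-modules (ice-9 regex))

(define (check label actual expected)
  ;; Signal an error if an example does not give the documented value.
  (if (not (equal? actual expected))
      (error "unexpected result for" label actual))
  (display label)
  (display " ok")
  (newline))

(define datere "([0-9][0-9][0-9][0-9])([0-9][0-9])([0-9][0-9])")
(define m (string-match "[0-9][0-9][0-9][0-9]" "blah2002foo"))

;; Accessors on the match structure.
(check "match:substring" (match:substring m) "2002")
(check "match:start" (match:start m) 4)
(check "match:end" (match:end m) 8)
(check "match:prefix" (match:prefix m) "blah")
(check "match:suffix" (match:suffix m) "foo")
(check "match:string" (match:string m) "blah2002foo")

;; A failed match returns #f rather than a match structure.
(check "no match" (string-match "[A-Za-z]" "123456") #f)

;; Case-insensitive matching with a precompiled regexp.
(check "regexp/icase"
       (match:substring
        (regexp-exec (make-regexp "[A-Z]*" regexp/icase) "Bob"))
       "Bob")

;; Both substitution forms should produce the same string.
(check "regexp-substitute"
       (regexp-substitute #f (string-match datere "Date 20020429 12am.")
                          'pre 2 "-" 3 "-" 1 'post " (" 0 ")")
       "Date 04-29-2002 12am. (20020429)")
(check "regexp-substitute/global"
       (regexp-substitute/global #f datere "Date 20020429 12am."
                                 'pre 2 "-" 3 "-" 1 'post " (" 0 ")")
       "Date 04-29-2002 12am. (20020429)")
```

Saved under a hypothetical name such as regex-check.scm, it could be
run with `guile -s regex-check.scm'; each line of output names one
example that behaved as documented.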