dotgnu-pnet-commits
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

[Dotgnu-pnet-commits] CVS: pnetlib/System/Text/RegularExpressions arch.


From: Gopal.V <address@hidden>
Subject: [Dotgnu-pnet-commits] CVS: pnetlib/System/Text/RegularExpressions arch.cs,NONE,1.1 cache.cs,NONE,1.1 category.cs,NONE,1.1 collections.cs,NONE,1.1 compiler.cs,NONE,1.1 debug.cs,NONE,1.1 interpreter.cs,NONE,1.1 interval.cs,NONE,1.1 match.cs,NONE,1.1 parser.cs,NONE,1.1 quicksearch.cs,NONE,1.1 regex.cs,NONE,1.1 RegexRunner.cs,NONE,1.1 RegexRunnerFactory.cs,NONE,1.1 replace.cs,NONE,1.1 syntax.cs,NONE,1.1 ChangeLog,NONE,1.1 notes.txt,NONE,1.1README,NONE,1.1
Date: Tue, 17 Dec 2002 10:50:14 -0500

Update of /cvsroot/dotgnu-pnet/pnetlib/System/Text/RegularExpressions
In directory subversions:/tmp/cvs-serv21851/System/Text/RegularExpressions

Added Files:
        arch.cs cache.cs category.cs collections.cs compiler.cs 
        debug.cs interpreter.cs interval.cs match.cs parser.cs 
        quicksearch.cs regex.cs RegexRunner.cs RegexRunnerFactory.cs 
        replace.cs syntax.cs ChangeLog notes.txt README 
Log Message:
Import the System.Text.RegularExpressions namespace from Mono libraries
into pnetlib and fulfil general dependencies.


--- NEW FILE ---
//
// assembly:    System
// namespace:   System.Text.RegularExpressions
// file:        arch.cs
//
// author:      Dan Lewis (address@hidden)
//              (c) 2002

using System;
using System.Collections;

namespace System.Text.RegularExpressions {

        enum OpCode : ushort {
                False           = 0,    // always fails
                True,                   // always succeeds

                // matching

                Position,               // zero-width position assertion
                String,                 // match string literal
                Reference,              // back reference

                // character matching

                Character,              // match character exactly
                Category,               // match character from category
                Range,                  // match character from range
                Set,                    // match character from set
                In,                     // match character from group of tests

                // capturing

                Open,                   // open group
                Close,                  // close group
                Balance,                // balance groups

                // control flow

                IfDefined,              // conditional on capture
                Sub,                    // non-backtracking subexpression
                Test,                   // non-backtracking lookahead/behind
                Branch,                 // alternative expression
                Jump,                   // unconditional goto
                Repeat,                 // new repeat context
                Until,                  // repeat subexpression within context
                FastRepeat,             // repeat simple subexpression
                Anchor,                 // anchoring expression

                // miscellaneous
                
                Info                    // pattern information
        }

        [Flags]
        enum OpFlags : ushort {
                None            = 0x000,
                Negate          = 0x100,        // succeed on mismatch
                IgnoreCase      = 0x200,        // case insensitive matching
                RightToLeft     = 0x400,        // right-to-left matching
                Lazy            = 0x800         // minimizing repeat
        }

        enum Position : ushort {
                Any,                    // anywhere
                Start,                  // start of string                      
\A
                StartOfString,          // start of string                      
\A
                StartOfLine,            // start of line                        
^
                StartOfScan,            // start of scan                        
\G
                End,                    // end or before newline at end         
\Z
                EndOfString,            // end of string                        
\z
                EndOfLine,              // end of line                          
$
                Boundary,               // word boundary                        
\b
                NonBoundary             // not word boundary                    
\B
        };
        
        // see category.cs for Category enum

        interface IMachine {
                Match Scan (Regex regex, string text, int start, int end);
        }

        interface IMachineFactory {
                IMachine NewInstance ();
                IDictionary Mapping { get; set; }
                int GroupCount { get; }
        }

        // Anchor SKIP OFFSET
        //
        // Flags:       [RightToLeft] ??
        // SKIP:        relative address of tail expression
        // OFFSET:      offset of anchor from start of pattern
        //
        // Usage:
        //
        //      Anchor :1 OFFSET
        //              <expr>
        //              True
        // 1:   <tail>
        //
        // Notes:
        //
        // In practice, the anchoring expression is only going to be
        // Position (StartOfString, StartOfLine, StartOfScan) or String.
        // This is because the optimizer looks for position anchors at the
        // start of the expression, and if that fails it looks for the
        // longest substring. If an expression has neither a position
        // anchor or a longest substring anchor, then the anchoring expression
        // is left empty. Since an empty expression will anchor at any
        // position in any string, the entire input string will be scanned.

        // String LEN STR...
        //
        // Flags:       [RightToLeft, IgnoreCase]
        // LEN:         length of string
        // STR:         string characters

        // Branch SKIP
        //
        // SKIP:        relative address of next branch
        //
        //      Branch :1
        //              <alt expr 1>
        //              Jump :4
        // 1:   Branch :2
        //              <alt expr 2>
        //              Jump :4
        // 2:   Branch :3
        //              <alt expr 3>
        //              Jump :4
        // 3:   False
        // 4:   <tail>

        // Repeat SKIP MIN MAX
        //
        // Flags:       [Lazy]
        // SKIP:        relative address of Until instruction
        // MIN:         minimum iterations
        // MAX:         maximum iterations (0xffff is infinity)
        //
        //      Repeat :1 MIN MAX
        //              <expr>
        //              Until
        // 1:   <tail>

        // FastRepeat SKIP MIN MAX
        //
        // Flags:       [Lazy]
        // SKIP:        relative address of tail expression
        // MIN:         minimum iterations
        // MAX:         maximum iterations (0xffff is infinity)
        //
        //      FastRepeat :1 MIN MAX
        //              <expr>
        //              True
        // 1:   <tail>
        //
        // Notes:
        //
        // The subexpression of a FastRepeat construct must not contain any
        // complex operators. These include: Open, Close, Balance, Repeat,
        // FastRepeat, Sub, Test. In addition, the subexpression must have
        // been determined to have a fixed width.
        
        // Sub SKIP
        //
        // SKIP:        relative address of tail expression
        //
        //      Sub :1
        //              <expr>
        // 1:   <tail>
        //
        // Notes:
        //
        // The Sub operator invokes an independent subexpression. This means
        // that the subexpression will match only once and so will not
        // participate in any backtracking.

        // Test TSKIP FSKIP
        //
        // TSKIP:       relative address of true expression
        // FSKIP:       relative address of false expression
        //
        // Usage:       (?(?=test)true|false)
        //
        //      Test :1 :2
        //              <test expr>
        // 1:           <true expr>
        //              Jump
        // 2:           <false epxr>
        //      <tail>
        //
        // Usage:       (?(?=test)true)
        //
        //      Test :1 :2
        //              <test expr>
        // 1:           <true expr>
        // 2:   <tail>
        //
        // Usage:       (?=test)
        //
        //      Test :1 :2
        //              <test expr>
        // 1:           <true expr>
        //              Jump 3:
        // 2:           False
        // 3:           <tail>
        //
        // Notes:
        //
        // For negative lookaheads, just swap the values of TSKIP and
        // FSKIP. For lookbehinds, the test expression must be compiled
        // in reverse. The test expression is always executed as an
        // independent subexpression, so its behaviour is non-backtracking
        // (like a Sub clause.)

        // IfDefined SKIP GID
        //
        // SKIP:        relative address of else expression
        // GID:         number of group to check
        //
        // Usage:       (?(gid)true)
        //
        //      IfDefined :1
        //              <true expr>
        // 1:   <tail>
        //
        // Usage:       (?(gid)true|false)
        //
        //      IfDefined :1
        //              <true expr>
        //              Jump :2
        // 1:           <false expr>
        // 2:   <tail>

        // Jump SKIP
        //
        // SKIP:        relative address of target expression
        //
        //      Jump :1
        //      ...
        // :1   <target expr>

        // Character CHAR
        //
        // Flags:       [Negate, IgnoreCase, RightToLeft]
        // CHAR:        exact character to match

        // Category CAT
        //
        // Flags:       [Negate, RightToLeft]
        // CAT:         category to match (see Category enum)

        // Range LO HI
        //
        // Flags:       [Negate, IgnoreCase, RightToLeft]
        // LO:          lowest character in range
        // HI:          higest character in range

        // Set LO LEN SET...
        //
        // Flags:       [Negate, IgnoreCase, RightToLeft]
        // LO:          lowest character in set
        // LEN:         number of words in set
        // SET:         bit array representing characters in set
        //
        // Notes:
        //
        // Each word in the set represents 16 characters, so the first word
        // defines membership for characters LO to LO + 15, the second for
        // LO + 16 to LO + 31, and so on up to LO + (LEN * 16 - 1). It is
        // up to the compiler to provide a compact representation for sparse
        // unicode sets. The simple way is to use Set 0 4096. Other methods
        // involve paritioning the set and placing the components into an
        // In block.

        // In SKIP
        //
        // SKIP:        relative address of tail expression
        //
        // Usage:       [expr]
        //
        //      In :1
        //              <expr>
        //              True
        // :1   <tail>
        //
        // Usage:       [^expr]
        //
        //      In :1
        //              <expr>
        //              False
        // :1   <tail>
        //
        // Notes:
        //
        // The In instruction consumes a single character, using the flags
        // of the first instruction in the subexpression to determine its
        // IgnoreCase and RightToLeft properties. The subexpression is then
        // applied to the single character as a disjunction. If any instruction
        // in the subexpression succeeds, the entire In construct succeeds
        // and matching continues with the tail.

        // Position POS
        //
        // POS:         position to match (see Position enum)

        // Open GID
        //
        // GID:         number of group to open

        // Close GID
        //
        // GID:         number of group to close
        
        // Balance GID BAL
        //
        // GID:         number of capturing group (0 if none)
        // BAL:         number of group to undefine

        // Info GROUPS MIN MAX
        //
        // GROUPS:      number of capturing groups
        // MIN:         minimum width of pattern
        // MAX:         maximum width of pattern (0xffff means undefined)

        // False

        // True

        // Reference GID
        //
        // Flags:       [IgnoreCase, RightToLeft]
        // GID:         number of group to reference
}

--- NEW FILE ---
//
// assembly:    System
// namespace:   System.Text.RegularExpressions
// file:        cache.cs
//
// author:      Dan Lewis (address@hidden)
//              (c) 2002

using System;
using System.Collections;

namespace System.Text.RegularExpressions {

        class FactoryCache {
                public FactoryCache (int capacity) {
                        this.capacity = capacity;
                        this.factories = new Hashtable (capacity);
                        this.mru_list = new MRUList ();
                }

                public void Add (string pattern, RegexOptions options, 
IMachineFactory factory) {
                        lock (this) {
                                Key k = new Key (pattern, options);

                                while (factories.Count >= capacity) {
                                        object victim = mru_list.Evict ();
                                        if (victim != null)
                                                factories.Remove ((Key)victim);
                                }
                                
                                factories[k] = factory;
                                mru_list.Use (k);
                        }
                }

                public IMachineFactory Lookup (string pattern, RegexOptions 
options) {
                        lock (this) {
                                Key k = new Key (pattern, options);
                                if (factories.Contains (k)) {
                                        mru_list.Use (k);
                                        return (IMachineFactory)factories[k];
                                }
                        }

                        return null;
                }

                private int capacity;
                private Hashtable factories;
                private MRUList mru_list;

                class Key {
                        public string pattern;
                        public RegexOptions options;

                        public Key (string pattern, RegexOptions options) {
                                this.pattern = pattern;
                                this.options = options;
                        }
                        
                        public override int GetHashCode () {
                                return pattern.GetHashCode () ^ (int)options;
                        }

                        public override bool Equals (object o) {
                                if (o == null || !(o is Key))
                                        return false;

                                Key k = (Key)o;
                                return options == k.options && pattern.Equals 
(k.pattern);
                        }

                        public override string ToString () {
                                return "('" + pattern + "', [" + options + "])";
                        }
                }
        }

        class MRUList {
                public MRUList () {
                        head = tail = null;
                }

                public void Use (object o) {
                        Node node;

                        if (head == null) {
                                node = new Node (o);
                                head = tail = node;
                                return;
                        }

                        node = head;
                        while (node != null && !o.Equals (node.value))
                                node = node.previous;

                        if (node == null)
                                node = new Node (o);
                        else {
                                if (node == head)
                                        return;

                                if (node == tail)
                                        tail = node.next;
                                else
                                        node.previous.next = node.next;

                                node.next.previous = node.previous;
                        }

                        head.next = node;
                        node.previous = head;
                        node.next = null;
                        head = node;
                }

                public object Evict () {
                        if (tail == null)
                                return null;

                        object o = tail.value;
                        tail = tail.next;

                        if (tail == null)
                                head = null;
                        else
                                tail.previous = null;

                        return o;
                }

                private Node head, tail;

                private class Node {
                        public object value;
                        public Node previous, next;

                        public Node (object value) {
                                this.value = value;
                        }
                }
        }
}

--- NEW FILE ---
//
// assembly:    System
// namespace:   System.Text.RegularExpressions
// file:        category.cs
//
// author:      Dan Lewis (address@hidden)
//              (c) 2002

using System;
using System.Globalization;

namespace System.Text.RegularExpressions {

        enum Category : ushort {
                None,

                // canonical classes
        
                Any,                    // any character except newline         
.
                AnySingleline,          // any character                        
. (s option)
                Word,                   // any word character                   
\w
                Digit,                  // any digit character                  
\d
                WhiteSpace,             // any whitespace character             
\s
                
                // ECMAScript classes


                EcmaAny,
                EcmaAnySingleline,
                EcmaWord,               // [a-zA-Z_0-9]
                EcmaDigit,              // [0-9]
                EcmaWhiteSpace,         // [ \f\n\r\t\v]

                // unicode categories
                
                UnicodeL,               // Letter
                UnicodeM,               // Mark
                UnicodeN,               // Number
                UnicodeZ,               // Separator
                UnicodeP,               // Punctuation
                UnicodeS,               // Symbol
                UnicodeC,               // Other

                UnicodeLu,              // UppercaseLetter
                UnicodeLl,              // LowercaseLetter
                UnicodeLt,              // TitlecaseLetter
                UnicodeLm,              // ModifierLetter
                UnicodeLo,              // OtherLetter
                UnicodeMn,              // NonspacingMark
                UnicodeMe,              // EnclosingMark
                UnicodeMc,              // SpacingMark
                UnicodeNd,              // DecimalNumber
                UnicodeNl,              // LetterNumber
                UnicodeNo,              // OtherNumber
                UnicodeZs,              // SpaceSeparator
                UnicodeZl,              // LineSeparator
                UnicodeZp,              // ParagraphSeparator
                UnicodePd,              // DashPunctuation
                UnicodePs,              // OpenPunctuation
                UnicodePi,              // InitialPunctuation
                UnicodePe,              // ClosePunctuation
                UnicodePf,              // FinalPunctuation
                UnicodePc,              // ConnectorPunctuation
                UnicodePo,              // OtherPunctuation
                UnicodeSm,              // MathSymbol
                UnicodeSc,              // CurrencySymbol
                UnicodeSk,              // ModifierSymbol
                UnicodeSo,              // OtherSymbol
                UnicodeCc,              // Control
                UnicodeCf,              // Format
                UnicodeCo,              // PrivateUse
                UnicodeCs,              // Surrogate
                UnicodeCn,              // Unassigned

                // unicode block ranges

                // notes: the categories marked with a star are valid unicode 
block ranges,
                // but don't seem to be accepted by the MS parser using the 
/p{...} format.
                // any ideas?

                UnicodeBasicLatin,
                UnicodeLatin1Supplement,                        // *
                UnicodeLatinExtendedA,                          // *
                UnicodeLatinExtendedB,                          // *
                UnicodeIPAExtensions,
                UnicodeSpacingModifierLetters,
                UnicodeCombiningDiacriticalMarks,
                UnicodeGreek,
                UnicodeCyrillic,
                UnicodeArmenian,
                UnicodeHebrew,
                UnicodeArabic,
                UnicodeSyriac,
                UnicodeThaana,
                UnicodeDevanagari,
                UnicodeBengali,
                UnicodeGurmukhi,
                UnicodeGujarati,
                UnicodeOriya,
                UnicodeTamil,
                UnicodeTelugu,
                UnicodeKannada,
                UnicodeMalayalam,
                UnicodeSinhala,
                UnicodeThai,
                UnicodeLao,
                UnicodeTibetan,
                UnicodeMyanmar,
                UnicodeGeorgian,
                UnicodeHangulJamo,
                UnicodeEthiopic,
                UnicodeCherokee,
                UnicodeUnifiedCanadianAboriginalSyllabics,
                UnicodeOgham,
                UnicodeRunic,
                UnicodeKhmer,
                UnicodeMongolian,
                UnicodeLatinExtendedAdditional,
                UnicodeGreekExtended,
                UnicodeGeneralPunctuation,
                UnicodeSuperscriptsandSubscripts,
                UnicodeCurrencySymbols,
                UnicodeCombiningMarksforSymbols,
                UnicodeLetterlikeSymbols,
                UnicodeNumberForms,
                UnicodeArrows,
                UnicodeMathematicalOperators,
                UnicodeMiscellaneousTechnical,
                UnicodeControlPictures,
                UnicodeOpticalCharacterRecognition,
                UnicodeEnclosedAlphanumerics,
                UnicodeBoxDrawing,
                UnicodeBlockElements,
                UnicodeGeometricShapes,
                UnicodeMiscellaneousSymbols,
                UnicodeDingbats,
                UnicodeBraillePatterns,
                UnicodeCJKRadicalsSupplement,
                UnicodeKangxiRadicals,
                UnicodeIdeographicDescriptionCharacters,
                UnicodeCJKSymbolsandPunctuation,
                UnicodeHiragana,
                UnicodeKatakana,
                UnicodeBopomofo,
                UnicodeHangulCompatibilityJamo,
                UnicodeKanbun,
                UnicodeBopomofoExtended,
                UnicodeEnclosedCJKLettersandMonths,
                UnicodeCJKCompatibility,
                UnicodeCJKUnifiedIdeographsExtensionA,
                UnicodeCJKUnifiedIdeographs,
                UnicodeYiSyllables,
                UnicodeYiRadicals,
                UnicodeHangulSyllables,
                UnicodeHighSurrogates,
                UnicodeHighPrivateUseSurrogates,
                UnicodeLowSurrogates,
                UnicodePrivateUse,
                UnicodeCJKCompatibilityIdeographs,
                UnicodeAlphabeticPresentationForms,
                UnicodeArabicPresentationFormsA,                // *
                UnicodeCombiningHalfMarks,
                UnicodeCJKCompatibilityForms,
                UnicodeSmallFormVariants,
                UnicodeArabicPresentationFormsB,                // *
                UnicodeSpecials,
                UnicodeHalfwidthandFullwidthForms,
                
                UnicodeOldItalic,
                UnicodeGothic,
                UnicodeDeseret,
                UnicodeByzantineMusicalSymbols,
                UnicodeMusicalSymbols,
                UnicodeMathematicalAlphanumericSymbols,
                UnicodeCJKUnifiedIdeographsExtensionB,
                UnicodeCJKCompatibilityIdeographsSupplement,
                UnicodeTags
        }

        class CategoryUtils {
                public static Category CategoryFromName (string name) {
                        try {
                                if (name.Substring (0, 2).Equals ("Is"))        
// remove prefix from block range
                                        name = name.Substring (2);

                                return (Category)Enum.Parse (typeof (Category), 
"Unicode" + name);
                        }
                        catch (ArgumentException) {
                                return Category.None;
                        }
                }
        
                public static bool IsCategory (Category cat, char c) {
                        switch (cat) {
                        case Category.None:
                                return false;
                        
                        case Category.Any:
                                return c != '\n';

                        case Category.AnySingleline:
                                return true;

                        case Category.Word:
                                return
                                        Char.IsLetterOrDigit (c) ||
                                        IsCategory 
(UnicodeCategory.ConnectorPunctuation, c);

                        case Category.Digit:
                                return Char.IsDigit (c);

                        case Category.WhiteSpace:
                                return Char.IsWhiteSpace (c);

                        // ECMA categories

                        case Category.EcmaAny:
                                return c != '\n';
                                
                        case Category.EcmaAnySingleline:
                                return true;

                        case Category.EcmaWord:
                                return
                                        'a' <= c && c <= 'z' ||
                                        'A' <= c && c <= 'Z' ||
                                        '0' <= c && c <= '9' ||
                                        '_' == c;

                        case Category.EcmaDigit:
                                return
                                        '0' <= c && c <= 9;
                        
                        case Category.EcmaWhiteSpace:
                                return
                                        c == ' '  ||
                                        c == '\f' ||
                                        c == '\n' ||
                                        c == '\r' ||
                                        c == '\t' ||
                                        c == '\v';

                        // Unicode categories...

                        // letter
                        
                        case Category.UnicodeLu: return IsCategory 
(UnicodeCategory.UppercaseLetter, c);
                        case Category.UnicodeLl: return IsCategory 
(UnicodeCategory.LowercaseLetter, c);
                        case Category.UnicodeLt: return IsCategory 
(UnicodeCategory.TitlecaseLetter, c);
                        case Category.UnicodeLm: return IsCategory 
(UnicodeCategory.ModifierLetter, c);
                        case Category.UnicodeLo: return IsCategory 
(UnicodeCategory.OtherLetter, c);

                        // mark

                        case Category.UnicodeMn: return IsCategory 
(UnicodeCategory.NonSpacingMark, c);
                        case Category.UnicodeMe: return IsCategory 
(UnicodeCategory.EnclosingMark, c);
                        case Category.UnicodeMc: return IsCategory 
(UnicodeCategory.SpacingCombiningMark, c);
                        case Category.UnicodeNd: return IsCategory 
(UnicodeCategory.DecimalDigitNumber, c);

                        // number

                        case Category.UnicodeNl: return IsCategory 
(UnicodeCategory.LetterNumber, c);
                        case Category.UnicodeNo: return IsCategory 
(UnicodeCategory.OtherNumber, c);

                        // separator

                        case Category.UnicodeZs: return IsCategory 
(UnicodeCategory.SpaceSeparator, c);
                        case Category.UnicodeZl: return IsCategory 
(UnicodeCategory.LineSeparator, c);
                        case Category.UnicodeZp: return IsCategory 
(UnicodeCategory.ParagraphSeparator, c);

                        // punctuation

                        case Category.UnicodePd: return IsCategory 
(UnicodeCategory.DashPunctuation, c);
                        case Category.UnicodePs: return IsCategory 
(UnicodeCategory.OpenPunctuation, c);
                        case Category.UnicodePi: return IsCategory 
(UnicodeCategory.InitialQuotePunctuation, c);
                        case Category.UnicodePe: return IsCategory 
(UnicodeCategory.ClosePunctuation, c);
                        case Category.UnicodePf: return IsCategory 
(UnicodeCategory.FinalQuotePunctuation, c);
                        case Category.UnicodePc: return IsCategory 
(UnicodeCategory.ConnectorPunctuation, c);
                        case Category.UnicodePo: return IsCategory 
(UnicodeCategory.OtherPunctuation, c);

                        // symbol

                        case Category.UnicodeSm: return IsCategory 
(UnicodeCategory.MathSymbol, c);
                        case Category.UnicodeSc: return IsCategory 
(UnicodeCategory.CurrencySymbol, c);
                        case Category.UnicodeSk: return IsCategory 
(UnicodeCategory.ModifierSymbol, c);
                        case Category.UnicodeSo: return IsCategory 
(UnicodeCategory.OtherSymbol, c);

                        // other

                        case Category.UnicodeCc: return IsCategory 
(UnicodeCategory.Control, c);
                        case Category.UnicodeCf: return IsCategory 
(UnicodeCategory.Format, c);
                        case Category.UnicodeCo: return IsCategory 
(UnicodeCategory.PrivateUse, c);
                        case Category.UnicodeCs: return IsCategory 
(UnicodeCategory.Surrogate, c);
                        case Category.UnicodeCn: return IsCategory 
(UnicodeCategory.OtherNotAssigned, c); 

                        case Category.UnicodeL: // letter
                                return
                                        IsCategory 
(UnicodeCategory.UppercaseLetter, c) ||
                                        IsCategory 
(UnicodeCategory.LowercaseLetter, c) ||
                                        IsCategory 
(UnicodeCategory.TitlecaseLetter, c) ||
                                        IsCategory 
(UnicodeCategory.ModifierLetter, c) ||
                                        IsCategory 
(UnicodeCategory.OtherLetter, c);
                        
                        case Category.UnicodeM: // mark
                                return
                                        IsCategory 
(UnicodeCategory.NonSpacingMark, c) ||
                                        IsCategory 
(UnicodeCategory.EnclosingMark, c) ||
                                        IsCategory 
(UnicodeCategory.SpacingCombiningMark, c);

                        case Category.UnicodeN: // number
                                return
                                        IsCategory 
(UnicodeCategory.DecimalDigitNumber, c) ||
                                        IsCategory 
(UnicodeCategory.LetterNumber, c) ||
                                        IsCategory 
(UnicodeCategory.OtherNumber, c);

                        case Category.UnicodeZ: // separator
                                return
                                        IsCategory 
(UnicodeCategory.SpaceSeparator, c) ||
                                        IsCategory 
(UnicodeCategory.LineSeparator, c) ||
                                        IsCategory 
(UnicodeCategory.ParagraphSeparator, c);
                                        
                        case Category.UnicodeP: // punctuation
                                return
                                        IsCategory 
(UnicodeCategory.DashPunctuation, c) ||
                                        IsCategory 
(UnicodeCategory.OpenPunctuation, c) ||
                                        IsCategory 
(UnicodeCategory.InitialQuotePunctuation, c) ||
                                        IsCategory 
(UnicodeCategory.ClosePunctuation, c) ||
                                        IsCategory 
(UnicodeCategory.FinalQuotePunctuation, c) ||
                                        IsCategory 
(UnicodeCategory.ConnectorPunctuation, c) ||
                                        IsCategory 
(UnicodeCategory.OtherPunctuation, c);
                        
                        case Category.UnicodeS: // symbol
                                return
                                        IsCategory (UnicodeCategory.MathSymbol, 
c) ||
                                        IsCategory 
(UnicodeCategory.CurrencySymbol, c) ||
                                        IsCategory 
(UnicodeCategory.ModifierSymbol, c) ||
                                        IsCategory 
(UnicodeCategory.OtherSymbol, c);
                        
                        case Category.UnicodeC: // other
                                return
                                        IsCategory (UnicodeCategory.Control, c) 
||
                                        IsCategory (UnicodeCategory.Format, c) 
||
                                        IsCategory (UnicodeCategory.PrivateUse, 
c) ||
                                        IsCategory (UnicodeCategory.Surrogate, 
c) ||
                                        IsCategory 
(UnicodeCategory.OtherNotAssigned, c);

                        // Unicode block ranges...

                        case Category.UnicodeBasicLatin:
                                return '\u0000' <= c && c <= '\u007F';

                        case Category.UnicodeLatin1Supplement:
                                return '\u0080' <= c && c <= '\u00FF';

                        case Category.UnicodeLatinExtendedA:
                                return '\u0100' <= c && c <= '\u017F';

                        case Category.UnicodeLatinExtendedB:
                                return '\u0180' <= c && c <= '\u024F';

                        case Category.UnicodeIPAExtensions:
                                return '\u0250' <= c && c <= '\u02AF';

                        case Category.UnicodeSpacingModifierLetters:
                                return '\u02B0' <= c && c <= '\u02FF';

                        case Category.UnicodeCombiningDiacriticalMarks:
                                return '\u0300' <= c && c <= '\u036F';

                        case Category.UnicodeGreek:
                                return '\u0370' <= c && c <= '\u03FF';

                        case Category.UnicodeCyrillic:
                                return '\u0400' <= c && c <= '\u04FF';

                        case Category.UnicodeArmenian:
                                return '\u0530' <= c && c <= '\u058F';

                        case Category.UnicodeHebrew:
                                return '\u0590' <= c && c <= '\u05FF';

                        case Category.UnicodeArabic:
                                return '\u0600' <= c && c <= '\u06FF';

                        case Category.UnicodeSyriac:
                                return '\u0700' <= c && c <= '\u074F';

                        case Category.UnicodeThaana:
                                return '\u0780' <= c && c <= '\u07BF';

                        case Category.UnicodeDevanagari:
                                return '\u0900' <= c && c <= '\u097F';

                        case Category.UnicodeBengali:
                                return '\u0980' <= c && c <= '\u09FF';

                        case Category.UnicodeGurmukhi:
                                return '\u0A00' <= c && c <= '\u0A7F';

                        case Category.UnicodeGujarati:
                                return '\u0A80' <= c && c <= '\u0AFF';

                        case Category.UnicodeOriya:
                                return '\u0B00' <= c && c <= '\u0B7F';

                        case Category.UnicodeTamil:
                                return '\u0B80' <= c && c <= '\u0BFF';

                        case Category.UnicodeTelugu:
                                return '\u0C00' <= c && c <= '\u0C7F';

                        case Category.UnicodeKannada:
                                return '\u0C80' <= c && c <= '\u0CFF';

                        case Category.UnicodeMalayalam:
                                return '\u0D00' <= c && c <= '\u0D7F';

                        case Category.UnicodeSinhala:
                                return '\u0D80' <= c && c <= '\u0DFF';

                        case Category.UnicodeThai:
                                return '\u0E00' <= c && c <= '\u0E7F';

                        case Category.UnicodeLao:
                                return '\u0E80' <= c && c <= '\u0EFF';

                        case Category.UnicodeTibetan:
                                return '\u0F00' <= c && c <= '\u0FFF';

                        case Category.UnicodeMyanmar:
                                return '\u1000' <= c && c <= '\u109F';

                        case Category.UnicodeGeorgian:
                                return '\u10A0' <= c && c <= '\u10FF';

                        case Category.UnicodeHangulJamo:
                                return '\u1100' <= c && c <= '\u11FF';

                        case Category.UnicodeEthiopic:
                                return '\u1200' <= c && c <= '\u137F';

                        case Category.UnicodeCherokee:
                                return '\u13A0' <= c && c <= '\u13FF';

                        case Category.UnicodeUnifiedCanadianAboriginalSyllabics:
                                return '\u1400' <= c && c <= '\u167F';

                        case Category.UnicodeOgham:
                                return '\u1680' <= c && c <= '\u169F';

                        case Category.UnicodeRunic:
                                return '\u16A0' <= c && c <= '\u16FF';

                        case Category.UnicodeKhmer:
                                return '\u1780' <= c && c <= '\u17FF';

                        case Category.UnicodeMongolian:
                                return '\u1800' <= c && c <= '\u18AF';

                        case Category.UnicodeLatinExtendedAdditional:
                                return '\u1E00' <= c && c <= '\u1EFF';

                        case Category.UnicodeGreekExtended:
                                return '\u1F00' <= c && c <= '\u1FFF';

                        case Category.UnicodeGeneralPunctuation:
                                return '\u2000' <= c && c <= '\u206F';

                        case Category.UnicodeSuperscriptsandSubscripts:
                                return '\u2070' <= c && c <= '\u209F';

                        case Category.UnicodeCurrencySymbols:
                                return '\u20A0' <= c && c <= '\u20CF';

                        case Category.UnicodeCombiningMarksforSymbols:
                                return '\u20D0' <= c && c <= '\u20FF';

                        case Category.UnicodeLetterlikeSymbols:
                                return '\u2100' <= c && c <= '\u214F';

                        case Category.UnicodeNumberForms:
                                return '\u2150' <= c && c <= '\u218F';

                        case Category.UnicodeArrows:
                                return '\u2190' <= c && c <= '\u21FF';

                        case Category.UnicodeMathematicalOperators:
                                return '\u2200' <= c && c <= '\u22FF';

                        case Category.UnicodeMiscellaneousTechnical:
                                return '\u2300' <= c && c <= '\u23FF';

                        case Category.UnicodeControlPictures:
                                return '\u2400' <= c && c <= '\u243F';

                        case Category.UnicodeOpticalCharacterRecognition:
                                return '\u2440' <= c && c <= '\u245F';

                        case Category.UnicodeEnclosedAlphanumerics:
                                return '\u2460' <= c && c <= '\u24FF';

                        case Category.UnicodeBoxDrawing:
                                return '\u2500' <= c && c <= '\u257F';

                        case Category.UnicodeBlockElements:
                                return '\u2580' <= c && c <= '\u259F';

                        case Category.UnicodeGeometricShapes:
                                return '\u25A0' <= c && c <= '\u25FF';

                        case Category.UnicodeMiscellaneousSymbols:
                                return '\u2600' <= c && c <= '\u26FF';

                        case Category.UnicodeDingbats:
                                return '\u2700' <= c && c <= '\u27BF';

                        case Category.UnicodeBraillePatterns:
                                return '\u2800' <= c && c <= '\u28FF';

                        case Category.UnicodeCJKRadicalsSupplement:
                                return '\u2E80' <= c && c <= '\u2EFF';

                        case Category.UnicodeKangxiRadicals:
                                return '\u2F00' <= c && c <= '\u2FDF';

                        case Category.UnicodeIdeographicDescriptionCharacters:
                                return '\u2FF0' <= c && c <= '\u2FFF';

                        case Category.UnicodeCJKSymbolsandPunctuation:
                                return '\u3000' <= c && c <= '\u303F';

                        case Category.UnicodeHiragana:
                                return '\u3040' <= c && c <= '\u309F';

                        case Category.UnicodeKatakana:
                                return '\u30A0' <= c && c <= '\u30FF';

                        case Category.UnicodeBopomofo:
                                return '\u3100' <= c && c <= '\u312F';

                        case Category.UnicodeHangulCompatibilityJamo:
                                return '\u3130' <= c && c <= '\u318F';

                        case Category.UnicodeKanbun:
                                return '\u3190' <= c && c <= '\u319F';

                        case Category.UnicodeBopomofoExtended:
                                return '\u31A0' <= c && c <= '\u31BF';

                        case Category.UnicodeEnclosedCJKLettersandMonths:
                                return '\u3200' <= c && c <= '\u32FF';

                        case Category.UnicodeCJKCompatibility:
                                return '\u3300' <= c && c <= '\u33FF';

                        case Category.UnicodeCJKUnifiedIdeographsExtensionA:
                                return '\u3400' <= c && c <= '\u4DB5';

                        case Category.UnicodeCJKUnifiedIdeographs:
                                return '\u4E00' <= c && c <= '\u9FFF';

                        case Category.UnicodeYiSyllables:
                                return '\uA000' <= c && c <= '\uA48F';

                        case Category.UnicodeYiRadicals:
                                return '\uA490' <= c && c <= '\uA4CF';

                        case Category.UnicodeHangulSyllables:
                                return '\uAC00' <= c && c <= '\uD7A3';

                        case Category.UnicodeHighSurrogates:
                                return '\uD800' <= c && c <= '\uDB7F';

                        case Category.UnicodeHighPrivateUseSurrogates:
                                return '\uDB80' <= c && c <= '\uDBFF';

                        case Category.UnicodeLowSurrogates:
                                return '\uDC00' <= c && c <= '\uDFFF';

                        case Category.UnicodePrivateUse:
                                return '\uE000' <= c && c <= '\uF8FF';

                        case Category.UnicodeCJKCompatibilityIdeographs:
                                return '\uF900' <= c && c <= '\uFAFF';

                        case Category.UnicodeAlphabeticPresentationForms:
                                return '\uFB00' <= c && c <= '\uFB4F';

                        case Category.UnicodeArabicPresentationFormsA:
                                return '\uFB50' <= c && c <= '\uFDFF';

                        case Category.UnicodeCombiningHalfMarks:
                                return '\uFE20' <= c && c <= '\uFE2F';

                        case Category.UnicodeCJKCompatibilityForms:
                                return '\uFE30' <= c && c <= '\uFE4F';

                        case Category.UnicodeSmallFormVariants:
                                return '\uFE50' <= c && c <= '\uFE6F';

                        case Category.UnicodeArabicPresentationFormsB:
                                return '\uFE70' <= c && c <= '\uFEFE';

                        case Category.UnicodeHalfwidthandFullwidthForms:
                                return '\uFF00' <= c && c <= '\uFFEF';

                        case Category.UnicodeSpecials:
                                return
                                        '\uFEFF' <= c && c <= '\uFEFF' ||
                                        '\uFFF0' <= c && c <= '\uFFFD';

                        // these block ranges begin above 0x10000

                        case Category.UnicodeOldItalic:
                        case Category.UnicodeGothic:
                        case Category.UnicodeDeseret:
                        case Category.UnicodeByzantineMusicalSymbols:
                        case Category.UnicodeMusicalSymbols:
                        case Category.UnicodeMathematicalAlphanumericSymbols:
                        case Category.UnicodeCJKUnifiedIdeographsExtensionB:
                        case 
Category.UnicodeCJKCompatibilityIdeographsSupplement:
                        case Category.UnicodeTags:
                                return false;

                        default:
                                return false;
                        }
                }

                private static bool IsCategory (UnicodeCategory uc, char c) {
                        if (Char.GetUnicodeCategory (c) == uc)
                                return true;

                        return false;
                }
        }
}

--- NEW FILE ---
//
// assembly:    System
// namespace:   System.Text.RegularExpressions
// file:        collections.cs
//
// author:      Dan Lewis (address@hidden)
//              (c) 2002

using System;
using System.Collections;

namespace System.Text.RegularExpressions {
        public abstract class RegexCollectionBase : ICollection, IEnumerable {
                public int Count {
                        get { return list.Count; }
                }

                public bool IsReadOnly {
                        get { return true; }    // FIXME
                }

                public bool IsSynchronized {
                        get { return false; }   // FIXME
                }

                public object SyncRoot {
                        get { return list; }    // FIXME
                }

                public void CopyTo (Array array, int index) {
                        foreach (Object o in list) {
                                if (index > array.Length)
                                        break;
                                
                                array.SetValue (o, index ++);
                        }
                }

                public IEnumerator GetEnumerator () {
                        return new Enumerator (list);
                }

                // internal methods

                internal RegexCollectionBase () {
                        list = new ArrayList ();
                }

                internal void Add (Object o) {
                        list.Add (o);
                }

                internal void Reverse () {
                        list.Reverse ();
                }

                // IEnumerator implementation

                private class Enumerator : IEnumerator {
                        public Enumerator (IList list) {
                                this.list = list;
                                Reset ();
                        }

                        public object Current {
                                get {
                                        if (ptr >= list.Count)
                                                throw new 
InvalidOperationException ();

                                        return list[ptr];
                                }
                        }

                        public bool MoveNext () {
                                if (ptr > list.Count)
                                        throw new InvalidOperationException ();
                                
                                return ++ ptr < list.Count;
                        }

                        public void Reset () {
                                ptr = -1;
                        }

                        private IList list;
                        private int ptr;
                }

                // protected fields

                protected ArrayList list;
        }

        [Serializable]
        public class CaptureCollection : RegexCollectionBase, ICollection, 
IEnumerable {
                public Capture this[int i] {
                        get { return (Capture)list[i]; }
                }

                internal CaptureCollection () {
                }
        }

        [Serializable]
        public class GroupCollection : RegexCollectionBase, ICollection, 
IEnumerable {
                public Group this[int i] {
                        get { return (Group)list[i]; }
                }
                
                internal GroupCollection () {
                }
        }

        [Serializable]
        public class MatchCollection : RegexCollectionBase, ICollection, 
IEnumerable {
                public virtual Match this[int i] {
                        get { return (Match)list[i]; }
                }

                internal MatchCollection () {
                }
        }
}

--- NEW FILE ---
//
// assembly:    System
// namespace:   System.Text.RegularExpressions
// file:        compiler.cs
//
// author:      Dan Lewis (address@hidden)
//              (c) 2002

using System;
using System.Collections;

namespace System.Text.RegularExpressions {
        abstract class LinkRef {
                // empty
        }
                
        interface ICompiler {
                void Reset ();
                IMachineFactory GetMachineFactory ();

                // instruction emission

                void EmitFalse ();
                void EmitTrue ();

                // character matching

                void EmitCharacter (char c, bool negate, bool ignore, bool 
reverse);
                void EmitCategory (Category cat, bool negate, bool reverse);
                void EmitRange (char lo, char hi, bool negate, bool ignore, 
bool reverse);
                void EmitSet (char lo, BitArray set, bool negate, bool ignore, 
bool reverse);

                // other operators

                void EmitString (string str, bool ignore, bool reverse);
                void EmitPosition (Position pos);
                void EmitOpen (int gid);
                void EmitClose (int gid);
                void EmitBalance (int gid, int balance);
                void EmitReference (int gid, bool ignore, bool reverse);

                // constructs

                void EmitIfDefined (int gid, LinkRef tail);
                void EmitSub (LinkRef tail);
                void EmitTest (LinkRef yes, LinkRef tail);
                void EmitBranch (LinkRef next);
                void EmitJump (LinkRef target);
                void EmitRepeat (int min, int max, bool lazy, LinkRef until);
                void EmitUntil (LinkRef repeat);
                void EmitIn (LinkRef tail);
                void EmitInfo (int count, int min, int max);
                void EmitFastRepeat (int min, int max, bool lazy, LinkRef tail);
                void EmitAnchor (int offset, LinkRef tail);

                LinkRef NewLink ();
                void ResolveLink (LinkRef link);
        }

        class InterpreterFactory : IMachineFactory {
                public InterpreterFactory (ushort[] pattern) {
                        this.pattern = pattern;
                }
                
                public IMachine NewInstance () {
                        return new Interpreter (pattern);
                }

                public int GroupCount {
                        get { return pattern[0]; }
                }

                public IDictionary Mapping {
                        get { return mapping; }
                        set { mapping = value; }
                }

                private IDictionary mapping;
                private ushort[] pattern;
        }

        class PatternCompiler : ICompiler {
                public static ushort EncodeOp (OpCode op, OpFlags flags) {
                        return (ushort)((int)op | ((int)flags & 0xff00));
                }

                public static void DecodeOp (ushort word, out OpCode op, out 
OpFlags flags) {
                        op = (OpCode)(word & 0x00ff);
                        flags = (OpFlags)(word & 0xff00);
                }

                public PatternCompiler () {
                        pgm = new ArrayList ();
                }

                // ICompiler implementation

                public void Reset () {
                        pgm.Clear ();
                }

                public IMachineFactory GetMachineFactory () {
                        ushort[] image = new ushort[pgm.Count];
                        pgm.CopyTo (image);

                        return new InterpreterFactory (image);
                }

                public void EmitFalse () {
                        Emit (OpCode.False);
                }

                public void EmitTrue () {
                        Emit (OpCode.True);
                }

                public void EmitCharacter (char c, bool negate, bool ignore, 
bool reverse) {
                        Emit (OpCode.Character, MakeFlags (negate, ignore, 
reverse, false));

                        if (ignore)
                                c = Char.ToLower (c);

                        Emit ((ushort)c);
                }

                public void EmitCategory (Category cat, bool negate, bool 
reverse) {
                        Emit (OpCode.Category, MakeFlags (negate, false, 
reverse, false));
                        Emit ((ushort)cat);
                }

                public void EmitRange (char lo, char hi, bool negate, bool 
ignore, bool reverse) {
                        Emit (OpCode.Range, MakeFlags (negate, ignore, reverse, 
false));
                        Emit ((ushort)lo);
                        Emit ((ushort)hi);
                }

                public void EmitSet (char lo, BitArray set, bool negate, bool 
ignore, bool reverse) {
                        Emit (OpCode.Set, MakeFlags (negate, ignore, reverse, 
false));
                        Emit ((ushort)lo);

                        int len = (set.Length + 0xf) >> 4;
                        Emit ((ushort)len);

                        int b = 0;
                        while (len -- != 0) {
                                ushort word = 0;
                                for (int i = 0; i < 16; ++ i) {
                                        if (b >= set.Length)
                                                break;
                                
                                        if (set[b ++])
                                                word |= (ushort)(1 << i);
                                }

                                Emit (word);
                        }
                }

                public void EmitString (string str, bool ignore, bool reverse) {
                        Emit (OpCode.String, MakeFlags (false, ignore, reverse, 
false));
                        int len = str.Length;
                        Emit ((ushort)len);

                        if (ignore)
                                str = str.ToLower ();
                        
                        for (int i = 0; i < len; ++ i)
                                Emit ((ushort)str[i]);
                }

                public void EmitPosition (Position pos) {
                        Emit (OpCode.Position, 0);
                        Emit ((ushort)pos);
                }

                public void EmitOpen (int gid) {
                        Emit (OpCode.Open);
                        Emit ((ushort)gid);
                }

                public void EmitClose (int gid) {
                        Emit (OpCode.Close);
                        Emit ((ushort)gid);
                }

                public void EmitBalance (int gid, int balance) {
                        Emit (OpCode.Balance);
                        Emit ((ushort)gid);
                        Emit ((ushort)balance);
                }

                public void EmitReference (int gid, bool ignore, bool reverse) {
                        Emit (OpCode.Reference, MakeFlags (false, ignore, 
reverse, false));
                        Emit ((ushort)gid);
                }

                public void EmitIfDefined (int gid, LinkRef tail) {
                        BeginLink (tail);
                        Emit (OpCode.IfDefined);
                        EmitLink (tail);
                        Emit ((ushort)gid);
                }

                public void EmitSub (LinkRef tail) {
                        BeginLink (tail);
                        Emit (OpCode.Sub);
                        EmitLink (tail);
                }

                public void EmitTest (LinkRef yes, LinkRef tail) {
                        BeginLink (yes);
                        BeginLink (tail);
                        Emit (OpCode.Test);
                        EmitLink (yes);
                        EmitLink (tail);
                }

                public void EmitBranch (LinkRef next) {
                        BeginLink (next);
                        Emit (OpCode.Branch, 0);
                        EmitLink (next);
                }

                public void EmitJump (LinkRef target) {
                        BeginLink (target);
                        Emit (OpCode.Jump, 0);
                        EmitLink (target);
                }

                public void EmitRepeat (int min, int max, bool lazy, LinkRef 
until) {
                        BeginLink (until);
                        Emit (OpCode.Repeat, MakeFlags (false, false, false, 
lazy));
                        EmitLink (until);
                        Emit ((ushort)min);
                        Emit ((ushort)max);
                }

                public void EmitUntil (LinkRef repeat) {
                        ResolveLink (repeat);
                        Emit (OpCode.Until);
                }

                public void EmitFastRepeat (int min, int max, bool lazy, 
LinkRef tail) {
                        BeginLink (tail);
                        Emit (OpCode.FastRepeat, MakeFlags (false, false, 
false, lazy));
                        EmitLink (tail);
                        Emit ((ushort)min);
                        Emit ((ushort)max);
                }

                public void EmitIn (LinkRef tail) {
                        BeginLink (tail);
                        Emit (OpCode.In);
                        EmitLink (tail);
                }

                public void EmitAnchor (int offset, LinkRef tail) {
                        BeginLink (tail);
                        Emit (OpCode.Anchor);
                        EmitLink (tail);
                        Emit ((ushort)offset);
                }

                public void EmitInfo (int count, int min, int max) {
                        Emit (OpCode.Info);
                        Emit ((ushort)count);
                        Emit ((ushort)min);
                        Emit ((ushort)max);
                }

                public LinkRef NewLink () {
                        return new PatternLinkStack ();
                }
                
                public void ResolveLink (LinkRef lref) {
                        PatternLinkStack stack = (PatternLinkStack)lref;
                
                        while (stack.Pop ())
                                pgm[stack.OffsetAddress] = 
(ushort)stack.GetOffset (CurrentAddress);
                }

                // private members

                private static OpFlags MakeFlags (bool negate, bool ignore, 
bool reverse, bool lazy) {
                        OpFlags flags = 0;
                        if (negate) flags |= OpFlags.Negate;
                        if (ignore) flags |= OpFlags.IgnoreCase;
                        if (reverse) flags |= OpFlags.RightToLeft;
                        if (lazy) flags |= OpFlags.Lazy;

                        return flags;
                }
                
                private void Emit (OpCode op) {
                        Emit (op, (OpFlags)0);
                }

                private void Emit (OpCode op, OpFlags flags) {
                        Emit (EncodeOp (op, flags));
                }

                private void Emit (ushort word) {
                        pgm.Add (word);
                }

                private int CurrentAddress {
                        get { return pgm.Count; }
                }

                private void BeginLink (LinkRef lref) {
                        PatternLinkStack stack = (PatternLinkStack)lref;
                        stack.BaseAddress = CurrentAddress;
                }

                private void EmitLink (LinkRef lref) {
                        PatternLinkStack stack = (PatternLinkStack)lref;
                        stack.OffsetAddress = CurrentAddress;
                        Emit ((ushort)0);       // placeholder
                        stack.Push ();
                }

                private class PatternLinkStack : LinkStack {
                        public PatternLinkStack () {
                        }
                
                        public int BaseAddress {
                                set { link.base_addr = value; }
                        }

                        public int OffsetAddress {
                                get { return link.offset_addr; }
                                set { link.offset_addr = value; }
                        }

                        public int GetOffset (int target_addr) {
                                return target_addr - link.base_addr;
                        }

                        // LinkStack implementation

                        protected override object GetCurrent () { return link; }
                        protected override void SetCurrent (object l) { link = 
(Link)l; }

                        private struct Link {
                                public int base_addr;
                                public int offset_addr;
                        }

                        Link link;
                }

                private ArrayList pgm;
        }

        abstract class LinkStack : LinkRef {
                public LinkStack () {
                        stack = new Stack ();
                }

                public void Push () {
                        stack.Push (GetCurrent ());
                }

                public bool Pop () {
                        if (stack.Count > 0) {
                                SetCurrent (stack.Pop ());
                                return true;
                        }

                        return false;
                }

                protected abstract object GetCurrent ();
                protected abstract void SetCurrent (object l);

                private Stack stack;
        }
}

--- NEW FILE ---
//
// assembly:    System
// namespace:   System.Text.RegularExpressions
// file:        debug.cs
//
// author:      Dan Lewis (address@hidden)
//              (c) 2002

using System;
using System.Collections;

namespace System.Text.RegularExpressions {

        class Disassembler {
                public static void DisassemblePattern (ushort[] image) {
                        DisassembleBlock (image, 0, 0);
                }
        
                public static void DisassembleBlock (ushort[] image, int pc, 
int depth) {
                        OpCode op;
                        OpFlags flags;

                        for (;;) {
                                if (pc >= image.Length)
                                        return;
                        
                                PatternCompiler.DecodeOp (image[pc], out op, 
out flags);
                                Console.Write (FormatAddress (pc) + ": ");      
        // address
                                Console.Write (new string (' ', depth * 2));    
        // indent
                                Console.Write (DisassembleOp (image, pc));      
        // instruction
                                Console.WriteLine ();

                                int skip;
                                switch (op) {
                                case OpCode.False: case OpCode.True: case 
OpCode.Until:
                                        skip = 1;
                                        break;

                                case OpCode.Character: case OpCode.Category: 
case OpCode.Position:
                                case OpCode.Open: case OpCode.Close: case 
OpCode.Reference:
                                case OpCode.Sub: case OpCode.Branch: case 
OpCode.Jump: case OpCode.In:
                                        skip = 2;
                                        break;

                                case OpCode.Balance: case OpCode.IfDefined: 
case OpCode.Range:
                                case OpCode.Test: case OpCode.Anchor:
                                        skip = 3;
                                        break;

                                case OpCode.Repeat: case OpCode.FastRepeat: 
case OpCode.Info:
                                        skip = 4;
                                        break;

                                case OpCode.String: skip = image[pc + 1] + 2; 
break;
                                case OpCode.Set: skip = image[pc + 2] + 3; 
break;

                                default:
                                        skip = 1;
                                        break;
                                }

                                pc += skip;
                        }
                }

                public static string DisassembleOp (ushort[] image, int pc) {
                        OpCode op;
                        OpFlags flags;

                        PatternCompiler.DecodeOp (image[pc], out op, out flags);
                        string str = op.ToString ();
                        if (flags != 0)
                                str += "[" + flags.ToString ("f") + "]";

                        switch (op) {
                        case OpCode.False: case OpCode.True: case OpCode.Until:
                        default:
                                break;

                        case OpCode.Info:
                                str += " " + image[pc + 1];
                                str += " (" + image[pc + 2] + ", " + image[pc + 
3] + ")";
                                break;
                        
                        case OpCode.Character:
                                str += " '" + FormatChar ((char)image[pc + 1]) 
+ "'";
                                break;

                        case OpCode.Category:
                                str += " /" + (Category)image[pc + 1];
                                break;
                        
                        case OpCode.Range:
                                str += " '" + FormatChar ((char)image[pc + 1]) 
+ "', ";
                                str += " '" + FormatChar ((char)image[pc + 2]) 
+ "'";
                                break;

                        case OpCode.Set:
                                str += " " + FormatSet (image, pc + 1);
                                break;

                        case OpCode.String:
                                str += " '" + ReadString (image, pc + 1) + "'";
                                break;

                        case OpCode.Position:
                                str += " /" + (Position)image[pc + 1];
                                break;

                        case OpCode.Open: case OpCode.Close: case 
OpCode.Reference:
                                str += " " + image[pc + 1];
                                break;

                        case OpCode.Balance:
                                str += " " + image[pc + 1] + " " + image[pc + 
2];
                                break;

                        case OpCode.IfDefined: case OpCode.Anchor:
                                str += " :" + FormatAddress (pc + image[pc + 
1]);
                                str += " " + image[pc + 2];
                                break;
                        
                        case OpCode.Sub: case OpCode.Branch: case OpCode.Jump:
                        case OpCode.In:
                                str += " :" + FormatAddress (pc + image[pc + 
1]);
                                break;

                        case OpCode.Test:
                                str += " :" + FormatAddress (pc + image[pc + 
1]);
                                str += ", :" + FormatAddress (pc + image[pc + 
2]);
                                break;

                        case OpCode.Repeat: case OpCode.FastRepeat:
                                str += " :" + FormatAddress (pc + image[pc + 
1]);
                                str += " (" + image[pc + 2] + ", ";
                                if (image[pc + 3] == 0xffff)
                                        str += "Inf";
                                else
                                        str += image[pc + 3];
                                str += ")";
                                break;

                        }

                        return str;
                }

                // private static members
        
                private static string ReadString (ushort[] image, int pc) {
                        int len = image[pc];
                        char[] chars = new char[len];

                        for (int i = 0; i < len; ++ i)
                                chars[i] = (char)image[pc + i + 1];

                        return new string (chars);
                }

                private static string FormatAddress (int pc) {
                        return pc.ToString ("x4");
                }

                private static string FormatSet (ushort[] image, int pc) {
                        int lo = image[pc ++];
                        int hi = (image[pc ++] << 4) - 1;

                        string str = "[";

                        bool hot = false;
                        char a = (char)0, b;
                        for (int i = 0; i <= hi; ++ i) {
                                bool m = (image[pc + (i >> 4)] & (1 << (i & 
0xf))) != 0;

                                if (m & !hot) {                         // 
start of range
                                        a = (char)(lo + i);
                                        hot = true;
                                }
                                else if (hot & (!m || i == hi)) {       // end 
of range
                                        b = (char)(lo + i - 1);

                                        str += FormatChar (a);
                                        if (b != a)
                                                str += "-" + FormatChar (b);
                                        
                                        hot = false;
                                }
                        }

                        str += "]";
                        return str;
                }

                private static string FormatChar (char c) {
                        if (c == '-' || c == ']')
                                return "\\" + c;

                        if (Char.IsLetterOrDigit (c) || Char.IsSymbol (c))
                                return c.ToString ();
                        
                        if (Char.IsControl (c)) {
                                return "^" + (char)('@' + c);
                        }

                        return "\\u" + ((int)c).ToString ("x4");
                }
        }
}

--- NEW FILE ---
//
// assembly:    System
// namespace:   System.Text.RegularExpressions
// file:        interpreter.cs
//
// author:      Dan Lewis (address@hidden)
//              (c) 2002

using System;
using System.Collections;
using System.Diagnostics;
using System.Globalization;

namespace System.Text.RegularExpressions {

        class Interpreter : IMachine {
                public Interpreter (ushort[] program) {
                        this.program = program;
                        this.qs = null;

                        // process info block

                        Debug.Assert ((OpCode)program[0] == OpCode.Info, 
"Regex", "Cant' find info block");

                        this.group_count = program[1] + 1;
                        this.match_min = program[2];
                        this.match_max = program[3];

                        // setup

                        this.program_start = 4;
                        this.groups = new int [group_count];
                }

                // IMachine implementation

                public Match Scan (Regex regex, string text, int start, int 
end) {
                        this.text = text;
                        this.text_end = end;
                        this.scan_ptr = start;

                        if (Eval (Mode.Match, ref scan_ptr, program_start))
                                return GenerateMatch (regex);

                        return Match.Empty;
                }

                // private methods

                private void Reset () {
                        ResetGroups ();
                        fast = repeat = null;
                }

                private bool Eval (Mode mode, ref int ref_ptr, int pc) {
                        int ptr = ref_ptr;
                Begin:
                        for (;;) {
                                ushort word = program[pc];
                                OpCode op = (OpCode)(word & 0x00ff);
                                OpFlags flags = (OpFlags)(word & 0xff00);

                                switch (op) {
                                case OpCode.Anchor: {
                                        int skip = program[pc + 1];

                                        int anch_offset = program[pc + 2];
                                        int anch_ptr = ptr + anch_offset;
                                        int anch_end = text_end - match_min + 
anch_offset;      // maximum anchor position

                                        // the general case for an anchoring 
expression is at the bottom, however we
                                        // do some checks for the common cases 
before to save processing time. the current
                                        // optimizer only outputs three types 
of anchoring expressions: fixed position,
                                        // fixed substring, and no anchor.

                                        OpCode anch_op = (OpCode)(program[pc + 
3] & 0x00ff);
                                        if (anch_op == OpCode.Position && skip 
== 6) {                          // position anchor
                                                // Anchor
                                                //      Position
                                                //      True

                                                switch ((Position)program[pc + 
4]) {
                                                case Position.StartOfString:
                                                        if (anch_ptr == 0) {
                                                                ptr = 0;
                                                                if (TryMatch 
(ref ptr, pc + skip))
                                                                        goto 
Pass;
                                                        }
                                                        break;
                                                
                                                case Position.StartOfLine:
                                                        if (anch_ptr == 0) {
                                                                ptr = 0;
                                                                if (TryMatch 
(ref ptr, pc + skip))
                                                                        goto 
Pass;

                                                                ++ anch_ptr;
                                                        }

                                                        while (anch_ptr <= 
anch_end) {
                                                                if 
(text[anch_ptr - 1] == '\n') {
                                                                        ptr = 
anch_ptr - anch_offset;
                                                                        if 
(TryMatch (ref ptr, pc + skip))
                                                                                
goto Pass;
                                                                }

                                                                ++ anch_ptr;
                                                        }
                                                        break;
                                                
                                                case Position.StartOfScan:
                                                        if (anch_ptr == 
scan_ptr) {
                                                                ptr = scan_ptr 
- anch_offset;
                                                                if (TryMatch 
(ref ptr, pc + skip))
                                                                        goto 
Pass;
                                                        }
                                                        break;

                                                default:
                                                        // FIXME
                                                        break;
                                                }
                                        }
                                        else if (qs != null ||
                                                (anch_op == OpCode.String && 
skip == 6 + program[pc + 4])) {    // substring anchor
                                                // Anchor
                                                //      String
                                                //      True

                                                if (qs == null) {
                                                        bool ignore = 
((OpFlags)program[pc + 3] & OpFlags.IgnoreCase) != 0;
                                                        string substring = 
GetString (pc + 3);

                                                        qs = new QuickSearch 
(substring, ignore);
                                                }

                                                while (anch_ptr <= anch_end) {
                                                        anch_ptr = qs.Search 
(text, anch_ptr, anch_end);
                                                        if (anch_ptr < 0)
                                                                break;

                                                        ptr = anch_ptr - 
anch_offset;
                                                        if (TryMatch (ref ptr, 
pc + skip))
                                                                goto Pass;

                                                        ++ anch_ptr;
                                                }
                                        }
                                        else if (anch_op == OpCode.True) {      
                                // no anchor
                                                // Anchor
                                                //      True

                                                while (anch_ptr <= anch_end) {
                                                        ptr = anch_ptr;
                                                        if (TryMatch (ref ptr, 
pc + skip))
                                                                goto Pass;

                                                        ++ anch_ptr;
                                                }
                                        }
                                        else {                                  
                                // general case
                                                // Anchor
                                                //      <expr>
                                                //      True

                                                while (anch_ptr <= anch_end) {
                                                        ptr = anch_ptr;
                                                        if (Eval (Mode.Match, 
ref ptr, pc + 3)) {
                                                                // anchor 
expression passed: try real expression at the correct offset

                                                                ptr = anch_ptr 
- anch_offset;
                                                                if (TryMatch 
(ref ptr, pc + skip))
                                                                        goto 
Pass;
                                                        }

                                                        ++ anch_ptr;
                                                }
                                        }

                                        goto Fail;
                                }
                                
                                case OpCode.False: {
                                        goto Fail;
                                }

                                case OpCode.True: {
                                        goto Pass;
                                }

                                case OpCode.Position: {
                                        if (!IsPosition ((Position)program[pc + 
1], ptr))
                                                goto Fail;
                                        pc += 2;
                                        break;
                                }

                                case OpCode.String: {
                                        bool reverse = (flags & 
OpFlags.RightToLeft) != 0;
                                        bool ignore = (flags & 
OpFlags.IgnoreCase) != 0;
                                        int len = program[pc + 1];

                                        if (reverse) {
                                                ptr -= len;
                                                if (ptr < 0)
                                                        goto Fail;
                                        }
                                        else if (ptr + len > text_end)
                                                goto Fail;

                                        pc += 2;
                                        for (int i = 0; i < len; ++ i) {
                                                char c = text[ptr + i];
                                                if (ignore)
                                                        c = Char.ToLower (c);

                                                if (c != (char)program[pc ++])
                                                        goto Fail;
                                        }

                                        if (!reverse)
                                                ptr += len;
                                        break;
                                }

                                case OpCode.Reference: {
                                        bool reverse = (flags & 
OpFlags.RightToLeft) != 0;
                                        bool ignore = (flags & 
OpFlags.IgnoreCase) != 0;
                                        int m = GetLastDefined (program [pc + 
1]);
                                        if (m < 0)
                                                goto Fail;

                                        int str = marks [m].Index;
                                        int len = marks [m].Length;

                                        if (reverse) {
                                                ptr -= len;
                                                if (ptr < 0)
                                                        goto Fail;
                                        }
                                        else if (ptr + len > text_end)
                                                goto Fail;

                                        pc += 2;
                                        for (int i = 0; i < len; ++ i) {
                                                if (ignore) {
                                                        if (Char.ToLower 
(text[ptr + i]) != Char.ToLower (text[str + i]))
                                                                goto Fail;
                                                }
                                                else {
                                                        if (text[ptr + i] != 
text[str + i])
                                                                goto Fail;
                                                }
                                        }

                                        if (!reverse)
                                                ptr += len;
                                        break;
                                }

                                case OpCode.Character: case OpCode.Category:
                                case OpCode.Range: case OpCode.Set: {
                                        if (!EvalChar (mode, ref ptr, ref pc, 
false))
                                                goto Fail;
                                        break;
                                }

                                case OpCode.In: {
                                        int target = pc + program[pc + 1];
                                        pc += 2;
                                        if (!EvalChar (mode, ref ptr, ref pc, 
true))
                                                goto Fail;

                                        pc = target;
                                        break;
                                }

                                case OpCode.Open: {
                                        Open (program[pc + 1], ptr);
                                        pc += 2;
                                        break;
                                }

                                case OpCode.Close: {
                                        Close (program[pc + 1], ptr);
                                        pc += 2;
                                        break;
                                }

                                case OpCode.Balance: {
                                        Balance (program[pc + 1], program[pc + 
2], ptr);
                                        break;
                                }

                                case OpCode.IfDefined: {
                                        int m = GetLastDefined (program [pc + 
2]);
                                        if (m < 0)
                                                pc += program[pc + 1];
                                        else
                                                pc += 3;
                                        break;
                                }

                                case OpCode.Sub: {
                                        if (!Eval (Mode.Match, ref ptr, pc + 2))
                                                goto Fail;

                                        pc += program[pc + 1];
                                        break;
                                }

                                case OpCode.Test: {
                                        int cp = Checkpoint ();
                                        int test_ptr = ptr;
                                        if (Eval (Mode.Match, ref test_ptr, pc 
+ 3))
                                                pc += program[pc + 1];
                                        else {
                                                Backtrack (cp);
                                                pc += program[pc + 2];
                                        }
                                        break;
                                }

                                case OpCode.Branch: {
                                        OpCode branch_op;
                                        do {
                                                int cp = Checkpoint ();
                                                if (Eval (Mode.Match, ref ptr, 
pc + 2))
                                                        goto Pass;
                                                
                                                Backtrack (cp);
                                                
                                                pc += program[pc + 1];
                                                branch_op = 
(OpCode)(program[pc] & 0xff);
                                        } while (branch_op != OpCode.False);

                                        goto Fail;
                                }

                                case OpCode.Jump: {
                                        pc += program[pc + 1];
                                        break;
                                }

                                case OpCode.Repeat: {
                                        this.repeat = new RepeatContext (
                                                this.repeat,                    
// previous context
                                                program[pc + 2],                
// minimum
                                                program[pc + 3],                
// maximum
                                                (flags & OpFlags.Lazy) != 0,    
// lazy
                                                pc + 4                          
// subexpression
                                        );

                                        if (Eval (Mode.Match, ref ptr, pc + 
program[pc + 1]))
                                                goto Pass;
                                        else {
                                                this.repeat = 
this.repeat.Previous;
                                                goto Fail;
                                        }
                                }

                                case OpCode.Until: {
                                        RepeatContext current = this.repeat;
                                        int start = current.Start;

                                        if (!current.IsMinimum) {
                                                ++ current.Count;
                                                current.Start = ptr;
                                                if (Eval (Mode.Match, ref ptr, 
repeat.Expression))
                                                        goto Pass;

                                                current.Start = start;
                                                -- current.Count;
                                                goto Fail;
                                        }

                                        if (ptr == current.Start) {
                                                // degenerate match ... match 
tail or fail

                                                this.repeat = current.Previous;
                                                if (Eval (Mode.Match, ref ptr, 
pc + 1))
                                                        goto Pass;
                                        
                                                goto Fail;
                                        }

                                        if (current.IsLazy) {
                                                // match tail first ...

                                                this.repeat = current.Previous;
                                                int cp = Checkpoint ();
                                                if (Eval (Mode.Match, ref ptr, 
pc + 1))
                                                        goto Pass;

                                                Backtrack (cp);

                                                // ... then match more

                                                this.repeat = current;
                                                if (!current.IsMaximum) {
                                                        ++ current.Count;
                                                        current.Start = ptr;
                                                        if (Eval (Mode.Match, 
ref ptr, current.Expression))
                                                                goto Pass;

                                                        current.Start = start;
                                                        -- current.Count;
                                                        goto Fail;
                                                }

                                                return false;
                                        }
                                        else {
                                                // match more first ...

                                                if (!current.IsMaximum) {
                                                        int cp = Checkpoint ();
                                                        ++ current.Count;
                                                        current.Start = ptr;
                                                        if (Eval (Mode.Match, 
ref ptr, current.Expression))
                                                                goto Pass;

                                                        current.Start = start;
                                                        -- current.Count;
                                                        Backtrack (cp);
                                                }

                                                // ... then match tail

                                                this.repeat = current.Previous;
                                                if (Eval (Mode.Match, ref ptr, 
pc + 1))
                                                        goto Pass;

                                                this.repeat = current;
                                                goto Fail;
                                        }
                                }

                                case OpCode.FastRepeat: {
                                        this.fast = new RepeatContext (
                                                fast,
                                                program[pc + 2],                
// minimum
                                                program[pc + 3],                
// maximum
                                                (flags & OpFlags.Lazy) != 0,    
// lazy
                                                pc + 4                          
// subexpression
                                        );
                                        fast.Start = ptr;

                                        int cp = Checkpoint ();

                                        pc += program[pc + 1];          // tail 
expression
                                        ushort tail_word = program[pc];

                                        int c1, c2;                     // 
first character of tail operator
                                        int coff;                       // 0 or 
-1 depending on direction

                                        OpCode tail_op = (OpCode)(tail_word & 
0xff);
                                        if (tail_op == OpCode.Character || 
tail_op == OpCode.String) {
                                                OpFlags tail_flags = 
(OpFlags)(tail_word & 0xff00);

                                                if (tail_op == OpCode.String)
                                                        c1 = program[pc + 2];   
                        // first char of string
                                                else
                                                        c1 = program[pc + 1];   
                        // character
                                                
                                                if ((tail_flags & 
OpFlags.IgnoreCase) != 0)
                                                        c2 = Char.ToUpper 
((char)c1);                   // ignore case
                                                else
                                                        c2 = c1;

                                                if ((tail_flags & 
OpFlags.RightToLeft) != 0)
                                                        coff = -1;              
                        // reverse
                                                else
                                                        coff = 0;
                                        }
                                        else {
                                                c1 = c2 = -1;
                                                coff = 0;
                                        }

                                        if (fast.IsLazy) {
                                                if (!fast.IsMinimum && !Eval 
(Mode.Count, ref ptr, fast.Expression)) {
                                                        //Console.WriteLine 
("lazy fast: failed mininum.");
                                                        fast = fast.Previous;
                                                        goto Fail;
                                                }
                                                
                                                while (true) {
                                                        int p = ptr + coff;
                                                        if ((c1 < 0 || (p >= 0 
&& p < text_end && (c1 == text[p] || c2 == text[p]))) &&
                                                            Eval (Mode.Match, 
ref ptr, pc))
                                                                break;

                                                        if (fast.IsMaximum) {
                                                                
//Console.WriteLine ("lazy fast: failed with maximum.");
                                                                fast = 
fast.Previous;
                                                                goto Fail;
                                                        }

                                                        Backtrack (cp);
                                                        if (!Eval (Mode.Count, 
ref ptr, fast.Expression)) {
                                                                
//Console.WriteLine ("lazy fast: no more.");
                                                                fast = 
fast.Previous;
                                                                goto Fail;
                                                        }
                                                }
                                                fast = fast.Previous;
                                                goto Pass;
                                        }
                                        else {
                                                if (!Eval (Mode.Count, ref ptr, 
fast.Expression)) {
                                                        fast = fast.Previous;
                                                        goto Fail;
                                                }
                                        
                                                int width;
                                                if (fast.Count > 0)
                                                        width = (ptr - 
fast.Start) / fast.Count;
                                                else
                                                        width = 0;

                                                while (true) {
                                                        int p = ptr + coff;
                                                        if ((c1 < 0 || (p >= 0 
&& p < text_end && (c1 == text[p] || c2 == text[p]))) &&
                                                            Eval (Mode.Match, 
ref ptr, pc))
                                                                break;

                                                        -- fast.Count;
                                                        if (!fast.IsMinimum) {
                                                                fast = 
fast.Previous;
                                                                goto Fail;
                                                        }

                                                        ptr -= width;
                                                        Backtrack (cp);
                                                }
                                                fast = fast.Previous;
                                                goto Pass;
                                        }
                                }

                                case OpCode.Info: {
                                        Debug.Assert (false, "Regex", "Info 
block found in pattern");
                                        goto Fail;
                                }
                                }
                        }
                Pass:
                        ref_ptr = ptr;

                        switch (mode) {
                        case Mode.Match:
                                return true;

                        case Mode.Count: {
                                ++ fast.Count;
                                if (fast.IsMaximum || (fast.IsLazy && 
fast.IsMinimum))
                                        return true;

                                pc = fast.Expression;
                                goto Begin;
                        }
                        }

                Fail:
                        switch (mode) {
                        case Mode.Match:
                                return false;

                        case Mode.Count: {
                                if (!fast.IsLazy && fast.IsMinimum)
                                        return true;

                                ref_ptr = fast.Start;
                                return false;
                        }
                        }

                        return false;
                }

                private bool EvalChar (Mode mode, ref int ptr, ref int pc, bool 
multi) {
                        bool consumed = false;
                        char c = '\0';
                        bool negate;
                        bool ignore;
                        do {
                                ushort word = program[pc];
                                OpCode op = (OpCode)(word & 0x00ff);
                                OpFlags flags = (OpFlags)(word & 0xff00);

                                ++ pc;

                                ignore = (flags & OpFlags.IgnoreCase) != 0;
                                
                                // consume character: the direction of an In 
construct is
                                // determined by the direction of its first op

                                if (!consumed) {
                                        if ((flags & OpFlags.RightToLeft) != 0) 
{
                                                if (ptr <= 0)
                                                        return false;

                                                c = text[-- ptr];
                                        }
                                        else {
                                                if (ptr >= text_end)
                                                        return false;

                                                c = text[ptr ++];
                                        }

                                        if (ignore)
                                                c = Char.ToLower (c);

                                        consumed = true;
                                }

                                // negate flag

                                negate = (flags & OpFlags.Negate) != 0;

                                // execute op
                                
                                switch (op) {
                                case OpCode.True:
                                        return true;

                                case OpCode.False:
                                        return false;
                                
                                case OpCode.Character: {
                                        if (c == (char)program[pc ++])
                                                return !negate;
                                        break;
                                }

                                case OpCode.Category: {
                                        if (CategoryUtils.IsCategory 
((Category)program[pc ++], c))
                                                return !negate;

                                        break;
                                }
                                
                                case OpCode.Range: {
                                        int lo = (char)program[pc ++];
                                        int hi = (char)program[pc ++];
                                        if (lo <= c && c <= hi)
                                                return !negate;
                                        break;
                                }

                                case OpCode.Set: {
                                        int lo = (char)program[pc ++];
                                        int len = (char)program[pc ++];
                                        int bits = pc;
                                        pc += len;

                                        int i = (int)c - lo;
                                        if (i < 0 || i >= len << 4)
                                                break;

                                        if ((program[bits + (i >> 4)] & (1 << 
(i & 0xf))) != 0)
                                                return !negate;
                                        break;
                                }
                                }
                        } while (multi);

                        return negate;
                }

                private bool TryMatch (ref int ref_ptr, int pc) {
                        Reset ();
                        
                        int ptr = ref_ptr;
                        marks [groups [0]].Start = ptr;
                        if (Eval (Mode.Match, ref ptr, pc)) {
                                marks [groups [0]].End = ptr;
                                ref_ptr = ptr;
                                return true;
                        }

                        return false;
                }
                
                private bool IsPosition (Position pos, int ptr) {
                        switch (pos) {
                        case Position.Start: case Position.StartOfString:
                                return ptr == 0;

                        case Position.StartOfLine:
                                return ptr == 0 || text[ptr - 1] == '\n';
                                
                        case Position.StartOfScan:
                                return ptr == scan_ptr;
                        
                        case Position.End:
                                return ptr == text_end ||
                                        (ptr == text_end - 1 && text[ptr] == 
'\n');

                        case Position.EndOfLine:
                                return ptr == text_end || text[ptr] == '\n';
                                
                        case Position.EndOfString:
                                return ptr == text_end;
                                
                        case Position.Boundary:
                                if (text_end == 0)
                                        return false;

                                if (ptr == 0)
                                        return IsWordChar (text[ptr]);
                                else if (ptr == text_end)
                                        return IsWordChar (text[ptr - 1]);
                                else
                                        return IsWordChar (text[ptr]) != 
IsWordChar (text[ptr - 1]);

                        case Position.NonBoundary:
                                if (text_end == 0)
                                        return false;

                                if (ptr == 0)
                                        return !IsWordChar (text[ptr]);
                                else if (ptr == text_end)
                                        return !IsWordChar (text[ptr - 1]);
                                else
                                        return IsWordChar (text[ptr]) == 
IsWordChar (text[ptr - 1]);
                        
                        default:
                                return false;
                        }
                }

                private bool IsWordChar (char c) {
                        return CategoryUtils.IsCategory (Category.Word, c);
                }

                private string GetString (int pc) {
                        int len = program[pc + 1];
                        int str = pc + 2;

                        char[] cs = new char[len];
                        for (int i = 0; i < len; ++ i)
                                cs[i] = (char)program[str ++];

                        return new string (cs);
                }

                // capture management

                private void Open (int gid, int ptr) {
                        int m = groups [gid];
                        if (m < mark_start || marks [m].IsDefined) {
                                m = CreateMark (m);
                                groups [gid] = m;
                        }

                        marks [m].Start = ptr;
                }

                private void Close (int gid, int ptr) {
                        marks [groups [gid]].End = ptr;
                }

                private void Balance (int gid, int balance_gid, int ptr) {
                        int b = groups [balance_gid];
                        Debug.Assert (marks [b].IsDefined, "Regex", "Balancing 
group not closed");

                        if (gid > 0) {
                                Open (gid, marks [b].Index + marks [b].Length);
                                Close (gid, ptr);
                        }

                        groups [balance_gid] = marks [b].Previous;
                }

                private int Checkpoint () {
                        mark_start = mark_end;
                        return mark_start;
                }

                private void Backtrack (int cp) {
                        Debug.Assert (cp > mark_start, "Regex", "Attempt to 
backtrack forwards");

                        for (int i = 0; i < groups.Length; ++ i) {
                                int m = groups [i];
                                while (cp <= m)
                                        m = marks [m].Previous;

                                groups [i] = m;
                        }
                }

                private void ResetGroups () {
                        int n = groups.Length;
                        if (marks == null)
                                marks = new Mark [n * 10];

                        for (int i = 0; i < n; ++ i) {
                                groups [i] = i;

                                marks [i].Start = -1;
                                marks [i].End = -1;
                                marks [i].Previous = -1;
                        }
                        
                        mark_start = 0;
                        mark_end = n;
                }

                private int GetLastDefined (int gid) {
                        int m = groups [gid];
                        while (m >= 0 && !marks [m].IsDefined)
                                m = marks [m].Previous;

                        return m;
                }

                private int CreateMark (int previous) {
                        if (mark_end == marks.Length) {
                                Mark [] dest = new Mark [marks.Length * 2];
                                marks.CopyTo (dest, 0);
                                marks = dest;
                        }

                        int m = mark_end ++;
                        marks [m].Start = marks [m].End = -1;
                        marks [m].Previous = previous;

                        return m;
                }

                private Match GenerateMatch (Regex regex) {
                        int[][] grps = new int[groups.Length][];
                        ArrayList caps = new ArrayList ();

                        for (int gid = 0; gid < groups.Length; ++ gid) {
                                caps.Clear ();
                                for (int m = groups[gid]; m >= 0; m = 
marks[m].Previous) {
                                        if (!marks[m].IsDefined)
                                                continue;
                                        
                                        caps.Add (marks[m].Index);
                                        caps.Add (marks[m].Length);
                                }

                                grps[gid] = (int[])caps.ToArray (typeof (int));
                        }

                        return new Match (regex, this, text, text_end, grps);
                }

                // interpreter attributes

                private ushort[] program;               // regex program
                private int program_start;              // first instruction 
after info block
                private string text;                    // input text
                private int text_end;                   // end of input text 
(last character + 1)
                private int group_count;                // number of capturing 
groups
                private int match_min, match_max;       // match width 
information
                private QuickSearch qs;                 // fast substring 
matcher

                // match state
                
                private int scan_ptr;                   // start of scan

                private RepeatContext repeat;           // current repeat 
context
                private RepeatContext fast;             // fast repeat context

                private Mark[] marks = null;            // mark stack
                private int mark_start;                 // start of current 
checkpoint
                private int mark_end;                   // end of 
checkpoint/next free mark

                private int[] groups;                   // current group 
definitions

                // private classes

                private struct Mark {
                        public int Start, End;
                        public int Previous;

                        public bool IsDefined {
                                get { return Start >= 0 && End >= 0; }
                        }

                        public int Index {
                                get { return Start < End ? Start : End; }
                        }

                        public int Length {
                                get { return Start < End ? End - Start : Start 
- End; }
                        }
                }

                private class RepeatContext {
                        public RepeatContext (RepeatContext previous, int min, 
int max, bool lazy, int expr_pc) {
                                this.previous = previous;
                                this.min = min;
                                this.max = max;
                                this.lazy = lazy;
                                this.expr_pc = expr_pc;
                                
                                this.start = -1;
                                this.count = 0;
                        }

                        public int Count {
                                get { return count; }
                                set { count = value; }
                        }

                        public int Start {
                                get { return start; }
                                set { start = value; }
                        }

                        public bool IsMinimum {
                                get { return min <= count; }
                        }

                        public bool IsMaximum {
                                get { return max <= count; }
                        }

                        public bool IsLazy {
                                get { return lazy; }
                        }

                        public int Expression {
                                get { return expr_pc; }
                        }

                        public RepeatContext Previous {
                                get { return previous; }
                        }
                
                        private int start;
                        private int min, max;
                        private bool lazy;
                        private int expr_pc;
                        private RepeatContext previous;

                        private int count;
                }

                private enum Mode {
                        Search,
                        Match,
                        Count
                }
        }
}

--- NEW FILE ---
//
// assembly:    System
// namespace:   System.Text.RegularExpressions
// file:        interval.cs
//
// author:      Dan Lewis (address@hidden)
//              (c) 2002

using System;
using System.Collections;

namespace System.Text.RegularExpressions {

        struct Interval : IComparable {
                public int low;
                public int high;
                public bool contiguous;

                public static Interval Empty {
                        get {
                                Interval i;
                                i.low = 0;
                                i.high = i.low - 1;
                                i.contiguous = true;

                                return i;
                        }
                }

                public static Interval Entire {
                        get { return new Interval (Int32.MinValue, 
Int32.MaxValue); }
                }

                public Interval (int low, int high) {
                        if (low > high) {
                                int t = low;
                                low = high;
                                high = t;
                        }
                
                        this.low = low;
                        this.high = high;
                        this.contiguous = true;
                }

                public bool IsDiscontiguous {
                        get { return !contiguous; }
                }
                
                public bool IsSingleton {
                        get { return contiguous && low == high; }
                }

                public bool IsRange {
                        get { return !IsSingleton && !IsEmpty; }
                }

                public bool IsEmpty {
                        get { return low > high; }
                }

                public int Size {
                        get {
                                if (IsEmpty)
                                        return 0;
                                
                                return high - low + 1;
                        }
                }

                public bool IsDisjoint (Interval i) {
                        if (IsEmpty || i.IsEmpty)
                                return true;
                        
                        return !(low <= i.high && i.low <= high);
                }

                public bool IsAdjacent (Interval i) {
                        if (IsEmpty || i.IsEmpty)
                                return false;
                
                        return low == i.high + 1 || high == i.low - 1;
                }

                public bool Contains (Interval i) {
                        if (!IsEmpty && i.IsEmpty)
                                return true;
                        if (IsEmpty)
                                return false;
                
                        return low <= i.low && i.high <= high;
                }

                public bool Contains (int i) {
                        return low <= i && i <= high;
                }

                public void Merge (Interval i) {
                        if (i.IsEmpty)
                                return;
                        if (IsEmpty) {
                                this.low = i.low;
                                this.high = i.high;
                        }
                
                        if (i.low < low)
                                low = i.low;
                        if (i.high > high)
                                high = i.high;
                }

                public void Intersect (Interval i) {
                        if (IsDisjoint (i)) {
                                low = 0;
                                high = low - 1;
                                return;
                        }
                
                        if (i.low > low)
                                low = i.low;
                        if (i.high > high)
                                high = i.high;
                }

                public int CompareTo (object o) {
                        return low - ((Interval)o).low;
                }

                public override string ToString () {
                        if (IsEmpty)
                                return "(EMPTY)";
                        else if (!contiguous)
                                return "{" + low + ", " + high + "}";
                        else if (IsSingleton)
                                return "(" + low + ")";
                        else
                                return "(" + low + ", " + high + ")";
                }
        }

        class IntervalCollection : ICollection, IEnumerable {
                public IntervalCollection () {
                        intervals = new ArrayList ();
                }

                public Interval this[int i] {
                        get { return (Interval)intervals[i]; }
                        set { intervals[i] = value; }
                }

                public void Add (Interval i) {
                        intervals.Add (i);
                }
                        
                public void Clear () {
                        intervals.Clear ();
                }

                public void Sort () {
                        intervals.Sort ();
                }
                
                public void Normalize () {
                        intervals.Sort ();

                        int j = 0;
                        while (j < intervals.Count - 1) {
                                Interval a = (Interval)intervals[j];
                                Interval b = (Interval)intervals[j + 1];

                                if (!a.IsDisjoint (b) || a.IsAdjacent (b)) {
                                        a.Merge (b);
                                        intervals[j] = a;
                                        intervals.RemoveAt (j + 1);
                                }
                                else
                                        ++ j;
                        }

                }

                public delegate double CostDelegate (Interval i);

                public IntervalCollection GetMetaCollection (CostDelegate 
cost_del) {
                        IntervalCollection meta = new IntervalCollection ();
                
                        Normalize ();
                        Optimize (0, Count - 1, meta, cost_del);
                        meta.intervals.Sort ();

                        return meta;
                }

                private void Optimize (int begin, int end, IntervalCollection 
meta, CostDelegate cost_del) {
                        Interval set;
                        set.contiguous = false;
                
                        int best_set_begin = -1;
                        int best_set_end = -1;
                        double best_set_cost = 0;

                        for (int i = begin; i <= end; ++ i) {
                                set.low = this[i].low;

                                double cost = 0.0;
                                for (int j = i; j <= end; ++ j) {
                                        set.high = this[j].high;
                                        cost += cost_del (this[j]);
                                        
                                        double set_cost = cost_del (set);
                                        if (set_cost < cost && cost > 
best_set_cost) {
                                                best_set_begin = i;
                                                best_set_end = j;
                                                best_set_cost = cost;
                                        }
                                }
                        }

                        if (best_set_begin < 0) {
                                // didn't find an optimal set: add original 
members

                                for (int i = begin; i <= end; ++ i)
                                        meta.Add (this[i]);
                        }
                        else {
                                // found set: add it ...

                                set.low = this[best_set_begin].low;
                                set.high = this[best_set_end].high;
                                
                                meta.Add (set);

                                // ... and optimize to the left and right

                                if (best_set_begin > begin)
                                        Optimize (begin, best_set_begin - 1, 
meta, cost_del);
                                if (best_set_end < end)
                                        Optimize (best_set_end + 1, end, meta, 
cost_del);
                        }
                }

                // ICollection implementation

                public int Count {
                        get { return intervals.Count; }
                }

                public bool IsSynchronized {
                        get { return false; }
                }

                public object SyncRoot {
                        get { return intervals; }
                }

                public void CopyTo (Array array, int index) {
                        foreach (Interval i in intervals) {
                                if (index > array.Length)
                                        break;
                                
                                array.SetValue (i, index ++);
                        }
                }

                // IEnumerator implementation

                public IEnumerator GetEnumerator () {
                        return new Enumerator (intervals);
                }

                private class Enumerator : IEnumerator {
                        public Enumerator (IList list) {
                                this.list = list;
                                Reset ();
                        }

                        public object Current {
                                get {
                                        if (ptr >= list.Count)
                                                throw new 
InvalidOperationException ();

                                        return list[ptr];
                                }
                        }

                        public bool MoveNext () {
                                if (ptr > list.Count)
                                        throw new InvalidOperationException ();
                                
                                return ++ ptr < list.Count;
                        }

                        public void Reset () {
                                ptr = -1;
                        }

                        private IList list;
                        private int ptr;
                }

                // private fields

                private ArrayList intervals;
        }
}

--- NEW FILE ---
//
// assembly:    System
// namespace:   System.Text.RegularExpressions
// file:        match.cs
//
// author:      Dan Lewis (address@hidden)
//              (c) 2002

using System;

namespace System.Text.RegularExpressions {

        [Serializable]
        public class Capture {
                public int Index {
                        get { return index; }
                }

                public int Length {
                        get { return length; }
                }

                public string Value {
                        get { return text.Substring (index, length); }
                }

                public override string ToString () {
                        return Value;
                }

                // internal members

                internal Capture (string text) : this (text, 0, 0) { }

                internal Capture (string text, int index, int length) {
                        this.text = text;
                        this.index = index;
                        this.length = length;
                }
                
                internal string Text {
                        get { return text; }
                }

                // private

                internal int index, length;
                internal string text;
        }

        [Serializable]
        public class Group : Capture {
                public static Group Synchronized (Group inner) {
                        return inner;   // is this enough?
                }

                public CaptureCollection Captures {
                        get { return captures; }
                }

                public bool Success {
                        get { return success; }
                }

                // internal

                internal Group (string text, int[] caps) : base (text) {
                        this.captures = new CaptureCollection ();

                        if (caps == null || caps.Length == 0) {
                                this.success = false;
                                return;
                        }

                        this.success = true;
                        this.index = caps[0];
                        this.length = caps[1];
                        captures.Add (this);
                        for (int i = 2; i < caps.Length; i += 2)
                                captures.Add (new Capture (text, caps[i], 
caps[i + 1]));
                        captures.Reverse ();
                }

                private bool success;
                private CaptureCollection captures;
        }

        [Serializable]
        public class Match : Group {
                public static Match Empty {
                        get { return empty; }
                }
                
                public static Match Synchronized (Match inner) {
                        return inner;   // FIXME need to sync on machine access
                }
                
                public virtual GroupCollection Groups {
                        get { return groups; }
                }

                public Match NextMatch () {
                        if (this == Empty)
                                return Empty;

                        int scan_ptr = regex.RightToLeft ? Index : Index + 
Length;

                        // next match after an empty match: make sure scan ptr 
makes progress
                        
                        if (Length == 0)
                                scan_ptr += regex.RightToLeft ? -1 : +1;

                        return machine.Scan (regex, Text, scan_ptr, 
text_length);
                }

                public virtual string Result (string replacement) {
                        return ReplacementEvaluator.Evaluate (replacement, 
this);
                }

                // internal

                internal Match () : base (null, null) {
                        this.regex = null;
                        this.machine = null;
                        this.text_length = 0;
                        this.groups = new GroupCollection ();

                        groups.Add (this);
                }
                
                internal Match (Regex regex, IMachine machine, string text, int 
text_length, int[][] grps) :
                        base (text, grps[0])
                {
                        this.regex = regex;
                        this.machine = machine;
                        this.text_length = text_length;

                        this.groups = new GroupCollection ();
                        groups.Add (this);
                        for (int i = 1; i < grps.Length; ++ i)
                                groups.Add (new Group (text, grps[i]));
                }

                internal Regex Regex {
                        get { return regex; }
                }

                // private

                private Regex regex;
                private IMachine machine;
                private int text_length;
                private GroupCollection groups;

                private static Match empty = new Match ();
        }
}

--- NEW FILE ---
//
// assembly:    System
// namespace:   System.Text.RegularExpressions
// file:        parser.cs
//
// author:      Dan Lewis (address@hidden)
//              (c) 2002

using System;
using System.Collections;
using System.Globalization;

namespace System.Text.RegularExpressions.Syntax {

        class Parser {
                public static int ParseDecimal (string str, ref int ptr) {
                        return ParseNumber (str, ref ptr, 10, 1, 
Int32.MaxValue);
                }

[...1081 lines suppressed...]

                private static bool IsECMAScript (RegexOptions options) {
                        return (options & RegexOptions.ECMAScript) != 0;
                }

                // exception creation

                private ArgumentException NewParseException (string msg) {
                        msg = "parsing \"" + pattern + "\" - " + msg;
                        return new ArgumentException (msg, pattern);
                }

                private string pattern;
                private int ptr;

                private ArrayList caps;
                private Hashtable refs;
                private int num_groups;
        }
}

--- NEW FILE ---
//
// assembly:    System
// namespace:   System.Text.RegularExpressions
// file:        quicksearch.cs
//
// author:      Dan Lewis (address@hidden)
//              (c) 2002

using System;

namespace System.Text.RegularExpressions {

        // TODO use simple test for single character strings

        class QuickSearch {
                // simplified boyer-moore for fast substring matching
        
                public QuickSearch (string str, bool ignore) {
                        this.str = str;
                        this.len = str.Length;
                        this.ignore = ignore;
                
                        Setup ();
                }
                
                public string String {
                        get { return str; }
                }

                public int Length {
                        get { return len; }
                }

                public bool IgnoreCase {
                        get { return ignore; }
                }

                public int Search (string text, int start, int end) {
                        if (end > text.Length - len)
                                end = text.Length - len;
                
                        int ptr = start;
                        if (!ignore) {
                                while (ptr <= end) {
                                        int i = len - 1;
                                        while (str[i] == text[ptr + i]) {
                                                if (-- i < 0)
                                                        return ptr;
                                        }

                                        if (ptr < end)
                                                ptr += shift[text[ptr + len]];
                                        else
                                                break;
                                }
                        }
                        else {
                                // ignore case: same as above, but we convert 
text
                                // to lower case before doing the string compare
                        
                                while (ptr <= end) {
                                        int i = len - 1;
                                        while (str[i] == Char.ToLower (text[ptr 
+ i])) {
                                                if (-- i < 0)
                                                        return ptr;
                                        }

                                        if (ptr < end)
                                                ptr += shift[text[ptr + len]];
                                        else
                                                break;
                                }
                        }

                        return -1;
                }

                // private

                private void Setup () {
                        if (ignore)
                                str = str.ToLower ();

                        // this is a 64k entry shift table. that's 128kb per 
pattern!
                        // is it worth compressing this by only storing shifts 
within
                        // a (lo, hi) character range? for most substrings this 
would
                        // be around 50 bytes...

                        shift = new int[0x1000];
                        for (int i = 0; i < 0x1000; ++ i)
                                shift[i] = len + 1;

                        for (int i = 0; i < len; ++ i) {
                                char c = str[i];

                                shift[c] = len - i;
                                if (ignore)
                                        shift[Char.ToUpper (c)] = len - i;
                        }
                }

                private string str;
                private int len;
                private bool ignore;

                private int[] shift;
        }
}

--- NEW FILE ---
//
// assembly:    System
// namespace:   System.Text.RegularExpressions
// file:        regex.cs
//
// author:      Dan Lewis (address@hidden)
//              (c) 2002

using System;
using System.Text;
using System.Collections;
using System.Reflection;
using System.Reflection.Emit;
using System.Runtime.Serialization;

using RegularExpression = 
System.Text.RegularExpressions.Syntax.RegularExpression;
using Parser = System.Text.RegularExpressions.Syntax.Parser;

namespace System.Text.RegularExpressions {
        
        public delegate string MatchEvaluator (Match match);

        [Flags]
        public enum RegexOptions {
                None                            = 0x000,
                IgnoreCase                      = 0x001,
                Multiline                       = 0x002,
                ExplicitCapture                 = 0x004,
                Compiled                        = 0x008,
                Singleline                      = 0x010,
                IgnorePatternWhitespace         = 0x020,
                RightToLeft                     = 0x040,
                ECMAScript                      = 0x100
        }
        
        [Serializable]
        public class Regex : ISerializable {
                public static void CompileToAssembly
                        (RegexCompilationInfo[] regexes, AssemblyName aname)
                {
                        throw new Exception ("Not implemented.");
                }

                public static void CompileToAssembly
                        (RegexCompilationInfo[] regexes, AssemblyName aname,
                         CustomAttributeBuilder[] attribs)
                {
                        throw new Exception ("Not implemented.");
                }

                public static void CompileToAssembly
                        (RegexCompilationInfo[] regexes, AssemblyName aname,
                         CustomAttributeBuilder[] attribs, string resourceFile)
                {
                        throw new Exception ("Not implemented.");
                }
                
                public static string Escape (string str) {
                        return Parser.Escape (str);
                }

                public static string Unescape (string str) {
                        return Parser.Unescape (str);
                }

                public static bool IsMatch (string input, string pattern) {
                        return IsMatch (input, pattern, RegexOptions.None);
                }

                public static bool IsMatch (string input, string pattern, 
RegexOptions options) {
                        Regex re = new Regex (pattern, options);
                        return re.IsMatch (input);
                }

                public static Match Match (string input, string pattern) {
                        return Regex.Match (input, pattern, RegexOptions.None);
                }

                public static Match Match (string input, string pattern, 
RegexOptions options) {
                        Regex re = new Regex (pattern, options);
                        return re.Match (input);
                }

                public static MatchCollection Matches (string input, string 
pattern) {
                        return Matches (input, pattern, RegexOptions.None);
                }

                public static MatchCollection Matches (string input, string 
pattern, RegexOptions options) {
                        Regex re = new Regex (pattern, options);
                        return re.Matches (input);
                }

                public static string Replace
                        (string input, string pattern, MatchEvaluator evaluator)
                {
                        return Regex.Replace (input, pattern, evaluator, 
RegexOptions.None);
                }

                public static string Replace
                        (string input, string pattern, MatchEvaluator evaluator,
                         RegexOptions options)
                {
                        Regex re = new Regex (pattern, options);
                        return re.Replace (input, evaluator);
                }

                public static string Replace
                        (string input, string pattern, string replacement)
                {
                        return Regex.Replace (input, pattern, replacement, 
RegexOptions.None);
                }

                public static string Replace
                        (string input, string pattern, string replacement,
                         RegexOptions options)
                {
                        Regex re = new Regex (pattern, options);
                        return re.Replace (input, replacement);
                }

                public static string[] Split (string input, string pattern) {
                        return Regex.Split (input, pattern, RegexOptions.None);
                }

                public static string[] Split (string input, string pattern, 
RegexOptions options) {
                        Regex re = new Regex (input, options);
                        return re.Split (input);
                }

                // private

                private static FactoryCache cache = new FactoryCache (200);     
// TODO put some meaningful number here

                // constructors

                protected Regex () {
                        // XXX what's this constructor for?
                }

                public Regex (string pattern) : this (pattern, 
RegexOptions.None) {
                }

                public Regex (string pattern, RegexOptions options) {
                        this.pattern = pattern;
                        this.options = options;
                
                        this.factory = cache.Lookup (pattern, options);

                        if (this.factory == null) {
                                // parse and install group mapping

                                Parser psr = new Parser ();
                                RegularExpression re = 
psr.ParseRegularExpression (pattern, options);
                                this.group_count = re.GroupCount;
                                this.mapping = psr.GetMapping ();

                                // compile
                                
                                ICompiler cmp;
                                //if ((options & RegexOptions.Compiled) != 0)
                                //      throw new Exception ("Not 
implemented.");
                                        //cmp = new CILCompiler ();
                                //else
                                        cmp = new PatternCompiler ();

                                re.Compile (cmp, RightToLeft);

                                // install machine factory and add to pattern 
cache

                                this.factory = cmp.GetMachineFactory ();
                                this.factory.Mapping = mapping;
                                cache.Add (pattern, options, this.factory);
                        } else {
                                this.group_count = this.factory.GroupCount;
                                this.mapping = this.factory.Mapping;
                        }
                }

                protected Regex (SerializationInfo info, StreamingContext 
context) :
                        this (info.GetString ("pattern"), 
                              (RegexOptions) info.GetValue ("options", typeof 
(RegexOptions))) {                        
                }


                // public instance properties
                
                public RegexOptions Options {
                        get { return options; }
                }

                public bool RightToLeft {
                        get { return (options & RegexOptions.RightToLeft) != 0; 
}
                }

                // public instance methods
                
                public string[] GetGroupNames () {
                        string[] names = new string[mapping.Count];
                        mapping.Keys.CopyTo (names, 0);

                        return names;
                }

                public int[] GetGroupNumbers () {
                        int[] numbers = new int[mapping.Count];
                        mapping.Values.CopyTo (numbers, 0);

                        return numbers;
                }

                public string GroupNameFromNumber (int i) {
                        if (i >= group_count)
                                return "";
                
                        foreach (string name in mapping.Keys) {
                                if ((int)mapping[name] == i)
                                        return name;
                        }

                        return "";
                }

                public int GroupNumberFromName (string name) {
                        if (mapping.Contains (name))
                                return (int)mapping[name];

                        return -1;
                }

                // match methods
                
                public bool IsMatch (string input) {
                        return IsMatch (input, 0);
                }

                public bool IsMatch (string input, int startat) {
                        return Match (input, startat).Success;
                }

                public Match Match (string input) {
                        return Match (input, 0);
                }

                public Match Match (string input, int startat) {
                        return CreateMachine ().Scan (this, input, startat, 
input.Length);
                }

                public Match Match (string input, int startat, int length) {
                        return CreateMachine ().Scan (this, input, startat, 
startat + length);
                }

                public MatchCollection Matches (string input) {
                        return Matches (input, 0);
                }

                public MatchCollection Matches (string input, int startat) {
                        MatchCollection ms = new MatchCollection ();
                        Match m = Match (input, startat);
                        while (m.Success) {
                                ms.Add (m);
                                m = m.NextMatch ();
                        }

                        return ms;
                }

                // replace methods

                public string Replace (string input, MatchEvaluator evaluator) {
                        return Replace (input, evaluator, Int32.MaxValue, 0);
                }

                public string Replace (string input, MatchEvaluator evaluator, 
int count) {
                        return Replace (input, evaluator, count, 0);
                }

                public string Replace (string input, MatchEvaluator evaluator, 
int count, int startat)
                {
                        StringBuilder result = new StringBuilder ();
                        int ptr = startat;

                        Match m = Match (input, startat);
                        while (m.Success && count -- > 0) {
                                result.Append (input.Substring (ptr, m.Index - 
ptr));
                                result.Append (evaluator (m));

                                ptr = m.Index + m.Length;
                                m = m.NextMatch ();
                        }
                        result.Append (input.Substring (ptr));

                        return result.ToString ();
                }

                public string Replace (string input, string replacement) {
                        return Replace (input, replacement, Int32.MaxValue, 0);
                }

                public string Replace (string input, string replacement, int 
count) {
                        return Replace (input, replacement, count, 0);
                }

                public string Replace (string input, string replacement, int 
count, int startat) {
                        ReplacementEvaluator ev = new ReplacementEvaluator 
(this, replacement);
                        return Replace (input, new MatchEvaluator 
(ev.Evaluate), count, startat);
                }

                // split methods

                public string[] Split (string input) {
                        return Split (input, Int32.MaxValue, 0);
                }

                public string[] Split (string input, int count) {
                        return Split (input, count, 0);
                }

                public string[] Split (string input, int count, int startat) {
                        ArrayList splits = new ArrayList ();
                        if (count == 0)
                                count = Int32.MaxValue;

                        int ptr = startat;
                        while (count -- > 0) {
                                Match m = Match (input, ptr);
                                if (!m.Success)
                                        break;
                        
                                splits.Add (input.Substring (ptr, m.Index - 
ptr));
                                ptr = m.Index + m.Length;
                        }

                        if (count > 0)
                                splits.Add (input.Substring (ptr));

                        string[] result = new string[splits.Count];
                        splits.CopyTo (result);
                        return result;
                }

                // object methods
                
                public override string ToString () {
                        return pattern;
                }

                // ISerializable interface
                public virtual void GetObjectData (SerializationInfo info, 
StreamingContext context) {
                        info.AddValue ("pattern", this.ToString (), typeof 
(string));
                        info.AddValue ("options", this.Options, typeof 
(RegexOptions));
                }

                // internal

                internal int GroupCount {
                        get { return group_count; }
                }

                // private

                private IMachine CreateMachine () {
                        return factory.NewInstance ();
                }

                protected internal string pattern;
                private RegexOptions options;

                private IMachineFactory factory;
                private IDictionary mapping;
                private int group_count;
        }

        [Serializable]
        public class RegexCompilationInfo {
                public RegexCompilationInfo (string pattern, RegexOptions 
options, string name, string full_namespace, bool is_public) {
                        this.pattern = pattern;
                        this.options = options;
                        this.name = name;
                        this.full_namespace = full_namespace;
                        this.is_public = is_public;
                }

                public bool IsPublic {
                        get { return is_public; }
                        set { is_public = value; }
                }

                public string Name {
                        get { return name; }
                        set { name = value; }
                }

                public string Namespace {
                        get { return full_namespace; }
                        set { full_namespace = value; }
                }

                public RegexOptions Options {
                        get { return options; }
                        set { options = value; }
                }

                public string Pattern {
                        get { return pattern; }
                        set { pattern = value; }
                }

                // private

                private string pattern, name, full_namespace;
                private RegexOptions options;
                private bool is_public;
        }
}

--- NEW FILE ---
//
// assembly:    System
// namespace:   System.Text.RegularExpressions
// file:        RegexRunner.cs
//
// author:      Dan Lewis (address@hidden)
//              (c) 2002

using System;

namespace System.Text.RegularExpressions {
        
        public abstract class RegexRunner {
                // constructor
        
                protected internal RegexRunner () {
                        throw new NotImplementedException ("RegexRunner is not 
supported by this library");
                }

                // protected abstract

                protected abstract bool FindFirstChar ();

                protected abstract void Go ();

                protected abstract void InitTrackCount ();

                // protected methods

                protected void Capture (int capnum, int start, int end) {
                }

                protected static bool CharInSet (char ch, string set, string 
category) {
                        return false;
                }

                protected void Crawl (int i) {
                }

                protected int CrawlPos () {
                        return 0;
                }

                protected void DoubleCrawl () {
                }

                protected void DoubleStack () {
                }

                protected void DoubleTrack () {
                }

                protected void EnsureStorage () {
                }

                protected bool IsBoundary (int index, int startpos, int endpos) 
{
                        return false;
                }

                protected bool IsECMABoundary (int index, int startpos, int 
endpos) {
                        return false;
                }

                protected bool IsMatched (int cap) {
                        return false;
                }

                protected int MatchIndex (int cap) {
                        return 0;
                }

                protected int MatchLength (int cap) {
                        return 0;
                }

                protected int PopCrawl () {
                        return 0;
                }

                protected void TransferCapture (int capnum, int uncapnum, int 
start, int end) {
                }

                protected void Uncapture () {
                }

                // internal
                
                protected internal Match Scan (Regex regex, string text, int 
textbeg, int textend, int textstart, int prevlen, bool quick) {
                        return null;
                }
        }
}

--- NEW FILE ---
//
// assembly:    System
// namespace:   System.Text.RegularExpressions
// file:        RegexRunnerFactory.cs
//
// author:      Dan Lewis (address@hidden)
//              (c) 2002

using System;

namespace System.Text.RegularExpressions {
        
        public abstract class RegexRunnerFactory {
                protected RegexRunnerFactory () {
                        throw new NotImplementedException ("RegexRunnerFactory 
is not supported by this library.");
                }

                protected internal abstract RegexRunner CreateInstance ();
        }
}

--- NEW FILE ---
//
// assembly:    System
// namespace:   System.Text.RegularExpressions
// file:        replace.cs
//
// author:      Dan Lewis (address@hidden)
//              (c) 2002

using System;
using System.Text;
using System.Collections;

using Parser = System.Text.RegularExpressions.Syntax.Parser;

namespace System.Text.RegularExpressions {

        class ReplacementEvaluator {
                public static string Evaluate (string replacement, Match match) 
{
                        ReplacementEvaluator ev = new ReplacementEvaluator 
(match.Regex, replacement);
                        return ev.Evaluate (match);
                }

                public ReplacementEvaluator (Regex regex, string replacement) {
                        this.regex = regex;
                        terms = new ArrayList ();
                        Compile (replacement);
                }

                public string Evaluate (Match match) {
                        StringBuilder result = new StringBuilder ();
                        foreach (Term term in terms)
                                result.Append (term.GetResult (match));

                        return result.ToString ();
                }

                // private

                private void Compile (string replacement) {
                        replacement = Parser.Unescape (replacement);
                        string literal = "";

                        int ptr = 0;
                        char c;
                        Term term = null;
                        while (ptr < replacement.Length) {
                                c = replacement[ptr ++];

                                if (c == '$') {
                                        if (replacement[ptr] == '$') {
                                                ++ ptr;
                                                break;
                                        }

                                        term = CompileTerm (replacement, ref 
ptr);
                                }

                                if (term != null) {
                                        term.Literal = literal;
                                        terms.Add (term);

                                        term = null;
                                        literal = "";
                                }
                                else
                                        literal += c;
                        }

                        if (term == null && literal.Length > 0) {
                                terms.Add (new Term (literal));
                        }
                }

                private Term CompileTerm (string str, ref int ptr) {
                        char c = str[ptr];

                        if (Char.IsDigit (c)) {         // numbered group
                                int n = Parser.ParseDecimal (str, ref ptr);
                                if (n < 0 || n > regex.GroupCount)
                                        throw new ArgumentException ("Bad group 
number.");
                                
                                return new Term (TermOp.Match, n);
                        }
                        
                        ++ ptr;

                        switch (c) {
                        case '{': {                     // named group
                                string name = Parser.ParseName (str, ref ptr);
                                if (str[ptr ++] != '}' || name == null)
                                        throw new ArgumentException ("Bad group 
name.");
                                
                                int n = regex.GroupNumberFromName (name);
                                
                                if (n < 0)
                                        throw new ArgumentException ("Bad group 
name.");

                                return new Term (TermOp.Match, n);
                        }

                        case '&':                       // entire match
                                return new Term (TermOp.Match, 0);

                        case '`':                       // text before match
                                return new Term (TermOp.PreMatch, 0);

                        case '\'':                      // text after match
                                return new Term (TermOp.PostMatch, 0);

                        case '+':                       // last group
                                return new Term (TermOp.Match, regex.GroupCount 
- 1);

                        case '_':                       // entire text
                                return new Term (TermOp.All, 0);

                        default:
                                throw new ArgumentException ("Bad replacement 
pattern.");
                        }
                }

                private Regex regex;
                private ArrayList terms;

                private enum TermOp {
                        None,                           // no action
                        Match,                          // input within group
                        PreMatch,                       // input before group
                        PostMatch,                      // input after group
                        All                             // entire input
                }

                private class Term {
                        public Term (TermOp op, int arg) {
                                this.op = op;
                                this.arg = arg;
                                this.literal = "";
                        }

                        public Term (string literal) {
                                this.op = TermOp.None;
                                this.arg = 0;
                                this.literal = literal;
                        }

                        public string Literal {
                                set { literal = value; }
                        }

                        public string GetResult (Match match) {
                                Group group = match.Groups[arg];
                        
                                switch (op) {
                                case TermOp.None:
                                        return literal;

                                case TermOp.Match:
                                        return literal + group.Value;

                                case TermOp.PreMatch:
                                        return literal + group.Text.Substring 
(0, group.Index);

                                case TermOp.PostMatch:
                                        return literal + group.Text.Substring 
(group.Index + group.Length);

                                case TermOp.All:
                                        return literal + group.Text;
                                }

                                return "";
                        }
                
                        public TermOp op;               // term type
                        public int arg;                 // group argument
                        public string literal;          // literal to prepend

                        public override string ToString () {
                                return op.ToString () + "(" + arg + ") " + 
literal;
                        }
                }
        }
}

--- NEW FILE ---
//
// assembly:    System
// namespace:   System.Text.RegularExpressions
// file:        syntax.cs
//
// author:      Dan Lewis (address@hidden)
//              (c) 2002

using System;
using System.Collections;

namespace System.Text.RegularExpressions.Syntax {
        // collection classes
        
        class ExpressionCollection : CollectionBase {
                public void Add (Expression e) {
                        List.Add (e);
                }

                public Expression this[int i] {
                        get { return (Expression)List[i]; }
                        set { List[i] = value; }
                }

                protected override void OnValidate (object o) {
                        // allow null elements
                }
        }

        // abstract classes
        
        abstract class Expression {
                public abstract void Compile (ICompiler cmp, bool reverse);
                public abstract void GetWidth (out int min, out int max);

                public int GetFixedWidth () {
                        int min, max;
                        GetWidth (out min, out max);

                        if (min == max)
                                return min;

                        return -1;
                }

                public virtual AnchorInfo GetAnchorInfo () {
                        return new AnchorInfo (this, GetFixedWidth ());
                }

                public virtual bool IsComplex () {
                        return true;
                }
        }

        // composite expressions
        
        abstract class CompositeExpression : Expression {
                public CompositeExpression () {
                        expressions = new ExpressionCollection ();
                }

                protected ExpressionCollection Expressions {
                        get { return expressions; }
                }

                protected void GetWidth (out int min, out int max, int count) {
                        min = Int32.MaxValue;
                        max = 0;
                        bool empty = true;

                        for (int i = 0; i < count; ++ i) {
                                Expression e = Expressions[i];
                                if (e == null)
                                        continue;
                        
                                empty = false;
                                int a, b;
                                e.GetWidth (out a, out b);
                                if (a < min) min = a;
                                if (b > max) max = b;
                        }

                        if (empty)
                                min = max = 0;
                }

                private ExpressionCollection expressions;
        }

        // groups
        
        class Group : CompositeExpression {
                public Group () {
                }

                public Expression Expression {
                        get { return Expressions[0]; }
                        set { Expressions[0] = value; }
                }

                public void AppendExpression (Expression e) {
                        Expressions.Add (e);
                }

                public override void Compile (ICompiler cmp, bool reverse) {
                        int count = Expressions.Count;
                        for (int i = 0; i < count; ++ i) {
                                Expression e;
                                if (reverse)
                                        e = Expressions[count - i - 1];
                                else
                                        e = Expressions[i];

                                e.Compile (cmp, reverse);
                        }
                }

                public override void GetWidth (out int min, out int max) {
                        min = 0;
                        max = 0;

                        foreach (Expression e in Expressions) {
                                int a, b;
                                e.GetWidth (out a, out b);
                                min += a;
                                if (max == Int32.MaxValue || b == 
Int32.MaxValue)
                                        max = Int32.MaxValue;
                                else
                                        max += b;
                        }
                }

                public override AnchorInfo GetAnchorInfo () {
                        int ptr;
                        int width = GetFixedWidth ();

                        ArrayList infos = new ArrayList ();
                        IntervalCollection segments = new IntervalCollection ();

                        // accumulate segments

                        ptr = 0;
                        foreach (Expression e in Expressions) {
                                AnchorInfo info = e.GetAnchorInfo ();
                                infos.Add (info);

                                if (info.IsPosition)
                                        return new AnchorInfo (this, ptr + 
info.Offset, width, info.Position);

                                if (info.IsSubstring)
                                        segments.Add (info.GetInterval (ptr));

                                if (info.IsUnknownWidth)
                                        break;

                                ptr += info.Width;
                        }

                        // normalize and find the longest segment

                        segments.Normalize ();

                        Interval longest = Interval.Empty;
                        foreach (Interval segment in segments) {
                                if (segment.Size > longest.Size)
                                        longest = segment;
                        }

                        // now chain the substrings that made this segment 
together

                        if (!longest.IsEmpty) {
                                string str = "";
                                bool ignore = false;

                                ptr = 0;
                                foreach (AnchorInfo info in infos) {
                                        if (info.IsSubstring && 
longest.Contains (info.GetInterval (ptr))) {
                                                str += info.Substring;  // TODO 
mark subexpressions
                                                ignore |= info.IgnoreCase;
                                        }

                                        if (info.IsUnknownWidth)
                                                break;

                                        ptr += info.Width;
                                }

                                return new AnchorInfo (this, longest.low, 
width, str, ignore);
                        }

                        return new AnchorInfo (this, width);
                }

                public override bool IsComplex () {
                        bool comp = false;
                        foreach (Expression e in Expressions) {
                                comp |= e.IsComplex ();
                        }

                        return comp | GetFixedWidth () <= 0;
                }
        }

        class RegularExpression : Group {
                public RegularExpression () {
                        group_count = 0;
                }

                public int GroupCount {
                        get { return group_count; }
                        set { group_count = value; }
                }

                public override void Compile (ICompiler cmp, bool reverse) {
                        // info block

                        int min, max;
                        GetWidth (out min, out max);
                        cmp.EmitInfo (group_count, min, max);

                        // anchoring expression

                        AnchorInfo info = GetAnchorInfo ();
                        if (reverse)
                                info = new AnchorInfo (this, GetFixedWidth ()); 
// FIXME

                        LinkRef pattern = cmp.NewLink ();
                        cmp.EmitAnchor (info.Offset, pattern);

                        if (info.IsPosition)
                                cmp.EmitPosition (info.Position);
                        else if (info.IsSubstring)
                                cmp.EmitString (info.Substring, 
info.IgnoreCase, reverse);
                        
                        cmp.EmitTrue ();
                        
                        // pattern

                        cmp.ResolveLink (pattern);
                        base.Compile (cmp, reverse);
                        cmp.EmitTrue ();
                }

                private int group_count;
        }

        class CapturingGroup : Group {
                public CapturingGroup () {
                        this.gid = 0;
                        this.name = null;
                }

                public int Number {
                        get { return gid; }
                        set { gid = value; }
                }

                public string Name {
                        get { return name; }
                        set { name = value; }
                }

                public bool IsNamed {
                        get { return name != null; }
                }

                public override void Compile (ICompiler cmp, bool reverse) {
                        cmp.EmitOpen (gid);
                        base.Compile (cmp, reverse);
                        cmp.EmitClose (gid);
                }

                public override bool IsComplex () {
                        return true;
                }

                private int gid;
                private string name;
        }

        class BalancingGroup : CapturingGroup {
                public BalancingGroup () {
                        this.balance = null;
                }

                public CapturingGroup Balance {
                        get { return balance; }
                        set { balance = value; }
                }

                public override void Compile (ICompiler cmp, bool reverse) {
                        // can't invoke Group.Compile from here :(
                        // so I'll just repeat the code
                
                        int count = Expressions.Count;
                        for (int i = 0; i < count; ++ i) {
                                Expression e;
                                if (reverse)
                                        e = Expressions[count - i - 1];
                                else
                                        e = Expressions[i];

                                e.Compile (cmp, reverse);
                        }

                        cmp.EmitBalance (this.Number, balance.Number);
                }

                private CapturingGroup balance;
        }

        class NonBacktrackingGroup : Group {
                public NonBacktrackingGroup () {
                }

                public override void Compile (ICompiler cmp, bool reverse) {
                        LinkRef tail = cmp.NewLink ();

                        cmp.EmitSub (tail);
                        base.Compile (cmp, reverse);
                        cmp.EmitTrue ();
                        cmp.ResolveLink (tail);
                }

                public override bool IsComplex () {
                        return true;
                }
        }

        // repetition

        class Repetition : CompositeExpression {
                public Repetition (int min, int max, bool lazy) {
                        Expressions.Add (null);
                        
                        this.min = min;
                        this.max = max;
                        this.lazy = lazy;
                }

                public Expression Expression {
                        get { return Expressions[0]; }
                        set { Expressions[0] = value; }
                }

                public int Minimum {
                        get { return min; }
                        set { min = value; }
                }

                public int Maximum {
                        get { return max; }
                        set { max = value; }
                }

                public bool Lazy {
                        get { return lazy; }
                        set { lazy = value; }
                }

                public override void Compile (ICompiler cmp, bool reverse) {
                        if (Expression.IsComplex ()) {
                                LinkRef until = cmp.NewLink ();
                                
                                cmp.EmitRepeat (min, max, lazy, until);
                                Expression.Compile (cmp, reverse);
                                cmp.EmitUntil (until);
                        }
                        else {
                                LinkRef tail = cmp.NewLink ();

                                cmp.EmitFastRepeat (min, max, lazy, tail);
                                Expression.Compile (cmp, reverse);
                                cmp.EmitTrue ();
                                cmp.ResolveLink (tail);
                        }
                }

                public override void GetWidth (out int min, out int max) {
                        Expression.GetWidth (out min, out max);
                        min = min * this.min;
                        if (max == Int32.MaxValue || this.max == 0xffff)
                                max = Int32.MaxValue;
                        else
                                max = max * this.max;
                }

                public override AnchorInfo GetAnchorInfo () {
                        int width = GetFixedWidth ();
                        if (Minimum == 0)
                                return new AnchorInfo (this, width);
                        
                        AnchorInfo info = Expression.GetAnchorInfo ();
                        if (info.IsPosition)
                                return new AnchorInfo (this, info.Offset, 
width, info.Position);
                        
                        if (info.IsSubstring) {
                                if (info.IsComplete) {
                                        string str = "";
                                        for (int i = 0; i < Minimum; ++ i)
                                                str += info.Substring;

                                        return new AnchorInfo (this, 0, width, 
str, info.IgnoreCase);
                                }

                                return new AnchorInfo (this, info.Offset, 
width, info.Substring, info.IgnoreCase);
                        }

                        return new AnchorInfo (this, width);
                }

                private int min, max;
                private bool lazy;
        }

        // assertions

        abstract class Assertion : CompositeExpression {
                public Assertion () {
                        Expressions.Add (null);         // true expression
                        Expressions.Add (null);         // false expression
                }

                public Expression TrueExpression {
                        get { return Expressions[0]; }
                        set { Expressions[0] = value; }
                }

                public Expression FalseExpression {
                        get { return Expressions[1]; }
                        set { Expressions[1] = value; }
                }

                public override void GetWidth (out int min, out int max) {
                        GetWidth (out min, out max, 2);

                        if (TrueExpression == null || FalseExpression == null)
                                min = 0;
                }
        }

        class CaptureAssertion : Assertion {
                public CaptureAssertion () {
                }

                public CapturingGroup CapturingGroup {
                        get { return group; }
                        set { group = value; }
                }

                public override void Compile (ICompiler cmp, bool reverse) {
                        int gid = group.Number;
                        LinkRef tail = cmp.NewLink ();

                        if (FalseExpression == null) {
                                //    IfDefined :1
                                //      <yes_exp>
                                // 1: <tail>
                        
                                cmp.EmitIfDefined (gid, tail);
                                TrueExpression.Compile (cmp, reverse);
                        }
                        else {
                                //    IfDefined :1
                                //      <yes_expr>
                                //      Jump :2
                                // 1:   <no_expr>
                                // 2: <tail>
                        
                                LinkRef false_expr = cmp.NewLink ();
                                cmp.EmitIfDefined (gid, false_expr);
                                TrueExpression.Compile (cmp, reverse);
                                cmp.EmitJump (tail);
                                cmp.ResolveLink (false_expr);
                                FalseExpression.Compile (cmp, reverse);
                        }

                        cmp.ResolveLink (tail);
                }

                public override bool IsComplex () {
                        bool comp = false;
                        if (TrueExpression != null)
                                comp |= TrueExpression.IsComplex ();
                        if (FalseExpression != null)
                                comp |= FalseExpression.IsComplex ();

                        return comp | GetFixedWidth () <= 0;
                }

                private CapturingGroup group;
        }

        class ExpressionAssertion : Assertion {
                public ExpressionAssertion () {
                        Expressions.Add (null);         // test expression
                }

                public bool Reverse {
                        get { return reverse; }
                        set { reverse = value; }
                }

                public bool Negate {
                        get { return negate; }
                        set { negate = value; }
                }

                public Expression TestExpression {
                        get { return Expressions[2]; }
                        set { Expressions[2] = value; }
                }

                public override void Compile (ICompiler cmp, bool reverse) {
                        LinkRef true_expr = cmp.NewLink ();
                        LinkRef false_expr = cmp.NewLink ();

                        // test op: positive / negative

                        if (!negate)
                                cmp.EmitTest (true_expr, false_expr);
                        else
                                cmp.EmitTest (false_expr, true_expr);
                        
                        // test expression: lookahead / lookbehind

                        TestExpression.Compile (cmp, reverse ^ this.reverse);
                        cmp.EmitTrue ();

                        // target expressions

                        if (TrueExpression == null) {                   // (?= 
...)
                                //    Test :1, :2
                                //      <test_expr>
                                // :2   False
                                // :1   <tail>
                        
                                cmp.ResolveLink (false_expr);
                                cmp.EmitFalse ();
                                cmp.ResolveLink (true_expr);
                        }
                        else {
                                cmp.ResolveLink (true_expr);
                                TrueExpression.Compile (cmp, reverse);
                                
                                if (FalseExpression == null) {          // 
(?(...) ...)
                                        //    Test :1, :2
                                        //      <test_expr>
                                        // :1   <yes_expr>
                                        // :2   <tail>

                                        cmp.ResolveLink (false_expr);
                                }
                                else {                                  // 
(?(...) ... | ...)
                                        //    Test :1, :2
                                        //      <test_expr>
                                        // :1   <yes_expr>
                                        //      Jump :3
                                        // :2   <no_expr>
                                        // :3   <tail>
                                
                                        LinkRef tail = cmp.NewLink ();
                                
                                        cmp.EmitJump (tail);
                                        cmp.ResolveLink (false_expr);
                                        FalseExpression.Compile (cmp, reverse);
                                        cmp.ResolveLink (tail);
                                }
                        }
                }

                private bool reverse, negate;
        }

        // alternation

        class Alternation : CompositeExpression {
                public Alternation () {
                }

                public ExpressionCollection Alternatives {
                        get { return Expressions; }
                }

                public void AddAlternative (Expression e) {
                        Alternatives.Add (e);
                }

                public override void Compile (ICompiler cmp, bool reverse) {
                        LinkRef next = cmp.NewLink ();
                        LinkRef tail = cmp.NewLink ();
                
                        foreach (Expression e in Alternatives) {
                                cmp.EmitBranch (next);
                                e.Compile (cmp, reverse);
                                cmp.EmitJump (tail);
                                cmp.ResolveLink (next);
                        }

                        cmp.EmitFalse ();
                        cmp.ResolveLink (tail);
                }

                public override void GetWidth (out int min, out int max) {
                        GetWidth (out min, out max, Alternatives.Count);
                }

                public override bool IsComplex () {
                        bool comp = false;
                        foreach (Expression e in Alternatives) {
                                comp |= e.IsComplex ();
                        }

                        return comp | GetFixedWidth () <= 0;
                }
        }

        // terminal expressions

        class Literal : Expression {
                public Literal (string str, bool ignore) {
                        this.str = str;
                        this.ignore = ignore;
                }

                public string String {
                        get { return str; }
                        set { str = value; }
                }

                public bool IgnoreCase {
                        get { return ignore; }
                        set { ignore = value; }
                }

                public override void Compile (ICompiler cmp, bool reverse) {
                        if (str.Length == 0)
                                return;

                        if (str.Length == 1)
                                cmp.EmitCharacter (str[0], false, ignore, 
reverse);
                        else
                                cmp.EmitString (str, ignore, reverse);
                }

                public override void GetWidth (out int min, out int max) {
                        min = max = str.Length;
                }

                public override AnchorInfo GetAnchorInfo () {
                        return new AnchorInfo (this, 0, str.Length, str, 
ignore);
                }

                public override bool IsComplex () {
                        return false;
                }

                private string str;
                private bool ignore;
        }

        class PositionAssertion : Expression {
                public PositionAssertion (Position pos) {
                        this.pos = pos;
                }

                public Position Position {
                        get { return pos; }
                        set { pos = value; }
                }

                public override void Compile (ICompiler cmp, bool reverse) {
                        cmp.EmitPosition (pos);
                }

                public override void GetWidth (out int min, out int max) {
                        min = max = 0;
                }

                public override bool IsComplex () {
                        return false;
                }

                public override AnchorInfo GetAnchorInfo () {
                        switch (pos) {
                        case Position.StartOfString: case Position.StartOfLine: 
case Position.StartOfScan:
                                return new AnchorInfo (this, 0, 0, pos);

                        default:
                                return new AnchorInfo (this, 0);
                        }
                }

                private Position pos;
        }

        class Reference : Expression {
                public Reference (bool ignore) {
                        this.ignore = ignore;
                }

                public CapturingGroup CapturingGroup {
                        get { return group; }
                        set { group = value; }
                }

                public bool IgnoreCase {
                        get { return ignore; }
                        set { ignore = value; }
                }

                public override void Compile (ICompiler cmp, bool reverse) {
                        cmp.EmitReference (group.Number, ignore, reverse);
                }

                public override void GetWidth (out int min, out int max) {
                        //group.GetWidth (out min, out max);
                        // TODO set width to referenced group for non-cyclical 
references
                        min = 0;
                        max = Int32.MaxValue;
                }

                public override bool IsComplex () {
                        return true;    // FIXME incorporate cyclic check
                }

                private CapturingGroup group;
                private bool ignore;
        }

        class CharacterClass : Expression {
                public CharacterClass (bool negate, bool ignore) {
                        this.negate = negate;
                        this.ignore = ignore;

                        intervals = new IntervalCollection ();

                        // initialize pos/neg category arrays

                        Array cat_values = Enum.GetValues (typeof (Category));
                        int cat_size = (int)(Category)cat_values.GetValue 
(cat_values.Length - 1) + 1;
                        pos_cats = new bool[cat_size];
                        neg_cats = new bool[cat_size];
                        for (int i = 0; i < cat_size; ++ i) {
                                pos_cats[i] = false;
                                neg_cats[i] = false;
                        }
                }

                public CharacterClass (Category cat, bool negate) : this 
(false, false) {
                        this.AddCategory (cat, negate);
                }

                public bool Negate {
                        get { return negate; }
                        set { negate = value; }
                }

                public bool IgnoreCase {
                        get { return ignore; }
                        set { ignore = value; }
                }

                public void AddCategory (Category cat, bool negate) {
                        int n = (int)cat;
                        
                        if (negate) {
                                if (pos_cats[n])
                                        pos_cats[n] = false;

                                neg_cats[n] = true;
                        }
                        else {
                                if (neg_cats[n])
                                        neg_cats[n] = false;

                                pos_cats[n] = true;
                        }
                }

                public void AddCharacter (char c) {
                        intervals.Add (new Interval (c, c));
                }

                public void AddRange (char lo, char hi) {
                        intervals.Add (new Interval (lo, hi));
                }

                public override void Compile (ICompiler cmp, bool reverse) {
                        // create the meta-collection

                        IntervalCollection meta =
                                intervals.GetMetaCollection (new 
IntervalCollection.CostDelegate (GetIntervalCost));

                        // count ops
                        
                        int count = meta.Count;
                        for (int i = 0; i < pos_cats.Length; ++ i) {
                                if (pos_cats[i]) ++ count;
                                if (neg_cats[i]) ++ count;
                        }

                        if (count == 0)
                                return;

                        // emit in op for |meta| > 1

                        LinkRef tail = cmp.NewLink ();
                        if (count > 1)
                                cmp.EmitIn (tail);

                        // emit categories

                        for (int i = 0; i < pos_cats.Length; ++ i) {
                                if (pos_cats[i])
                                        cmp.EmitCategory ((Category)i, negate, 
reverse);
                                else if (neg_cats[i])
                                        cmp.EmitCategory ((Category)i, !negate, 
reverse);
                        }

                        // emit character/range/sets from meta-collection

                        foreach (Interval a in meta) {
                                if (a.IsDiscontiguous) {                        
// Set
                                        BitArray bits = new BitArray (a.Size);
                                        foreach (Interval b in intervals) {
                                                if (a.Contains (b)) {
                                                        for (int i = b.low; i 
<= b.high; ++ i)
                                                                bits[i - a.low] 
= true;
                                                }
                                        }

                                        cmp.EmitSet ((char)a.low, bits, negate, 
ignore, reverse);
                                }
                                else if (a.IsSingleton)                         
// Character
                                        cmp.EmitCharacter ((char)a.low, negate, 
ignore, reverse);
                                else                                            
// Range
                                        cmp.EmitRange ((char)a.low, 
(char)a.high, negate, ignore, reverse);
                        }
                        
                        // finish up

                        if (count > 1) {
                                if (negate)
                                        cmp.EmitTrue ();
                                else
                                        cmp.EmitFalse ();

                                cmp.ResolveLink (tail);
                        }
                }

                public override void GetWidth (out int min, out int max) {
                        min = max = 1;
                }

                public override bool IsComplex () {
                        return false;
                }

                // private

                private static double GetIntervalCost (Interval i) {
                        // use op length as cost metric (=> optimize for space)
                
                        if (i.IsDiscontiguous)
                                return 3 + ((i.Size + 0xf) >> 4);               
// Set
                        else if (i.IsSingleton)
                                return 2;                                       
// Character
                        else
                                return 3;                                       
// Range
                }

                private bool negate, ignore;
                private bool[] pos_cats, neg_cats;
                private IntervalCollection intervals;
        }

        class AnchorInfo {
                private Expression expr;

                private Position pos;
                private int offset;

                private string str;
                private int width;
                private bool ignore;

                public AnchorInfo (Expression expr, int width) {
                        this.expr = expr;
                        this.offset = 0;
                        this.width = width;

                        this.str = null;
                        this.ignore = false;
                        this.pos = Position.Any;
                }
                
                public AnchorInfo (Expression expr, int offset, int width, 
string str, bool ignore) {
                        this.expr = expr;
                        this.offset = offset;
                        this.width = width;

                        this.str = ignore ? str.ToLower () : str;

                        this.ignore = ignore;
                        this.pos = Position.Any;
                }

                public AnchorInfo (Expression expr, int offset, int width, 
Position pos) {
                        this.expr = expr;
                        this.offset = offset;
                        this.width = width;

                        this.pos = pos;

                        this.str = null;
                        this.ignore = false;
                }

                public Expression Expression {
                        get { return expr; }
                }

                public int Offset {
                        get { return offset; }
                }

                public int Width {
                        get { return width; }
                }

                public int Length {
                        get { return (str != null) ? str.Length : 0; }
                }

                public bool IsUnknownWidth {
                        get { return width < 0; }
                }

                public bool IsComplete {
                        get { return Length == Width; }
                }

                public string Substring {
                        get { return str; }
                }

                public bool IgnoreCase {
                        get { return ignore; }
                }

                public Position Position {
                        get { return pos; }
                }

                public bool IsSubstring {
                        get { return str != null; }
                }

                public bool IsPosition {
                        get { return pos != Position.Any; }
                }

                public Interval GetInterval () {
                        return GetInterval (0);
                }

                public Interval GetInterval (int start) {
                        if (!IsSubstring)
                                return Interval.Empty;

                        return new Interval (start + Offset, start + Offset + 
Length - 1);
                }
        }
}

--- NEW FILE ---
2002-11-12 Jackson Harper <address@hidden>

        * arch.cs compiler.cs regex.cs: Added mapping attribute to 
MachineFactories

2002-11-06  Gonzalo Paniagua Javier <address@hidden>

        * parser.cs: detect illegal \ at end of pattern. Fixes 31334.

2002-10-25  Gonzalo Paniagua Javier <address@hidden>

        * parser.cs: applied fix from Tim Haynes (address@hidden) to
        solve bug #32807. Also modified GetMapping to return the same as MS.

2002-08-28  Juli Mallett  <address@hidden>

        * arch.cs, compiler.cs: Give the interpreter machine a property
        for the retrieval of the group count.

        * regex.cs: Use the new GroupCount property of the factory to
        initialise the current group count, and restructure code to compile
        the pattern only the first time it is needed (essentially backing
        out the previous revision of regex.cs, to use the new code.)

2002-08-14  Cesar Octavio Lopez Nataren <address@hidden>

        * regex.cs: Added the ctr for ISerializable implementation and
        implemented the GetObjectData function.

2002-07-30  Juli Mallett  <address@hidden>

        * regex.cs: Fixed bug where the expression would not be
        re-evaluated for grouping purposes when factory caches were
        used, resulting in no groups being recognised after one call
        with a given pattern and no change in options.

2002-05-13  Dan Lewis  <address@hidden>

        * regex.cs: Fixed bug in split.

2002-05-08  Dan Lewis  <address@hidden>

        * interpreter.cs: Moved to an array-based stack representation
        for faster captures.

        * match.cs, collections.cs: Decoupled capture representation from
        interpreter internals.

        * cache.cs: Changed Key type from struct to class for speed.

2002-04-06  Dan Lewis  <address@hidden>

        * cache.cs: Object methods should be overridden with "override".

2002-04-04  Dan Lewis  <address@hidden>

        * RegexRunner.cs, RegexRunnerFactory.cs: MS support classes. Stubs
        added for completeness.

        * regex.cs, match.cs, collections.cs: Serializable attribute.

2002-04-04  Dan Lewis  <address@hidden>

        * regex.cs: Added static Matches and IsMatch methods.

2002-04-03  Dan Lewis  <address@hidden>

        * ChangeLog: Added changelog.

        * cache.cs: Fixed bug in MRUList.Evict.

--- NEW FILE ---
TODO:

* Need to go through everything and square it with RightToLeft matching.
  The support for this was built into an early version, and lots of things built
  afterwards are not savvy about bi-directional matching. Things that spring to
  mind: Regex match methods should start at 0 or text.Length depending on
  direction. Do split and replace need changes? Match should be aware of its
  direction (already applied some of this to NextMatch logic). The interpreter
  needs to check left and right bounds. Anchoring and substring discovery need
  to be reworked. RTL matches are going to have anchors on the right - ie $, \Z
  and \z. This should be added to the anchor logic. QuickSearch needs to work in
  reverse. There may be other stuff.... work through the code.

* Add ECMAScript support to the parser. For example, [.\w\s\d] map to ECMA
  categories instead of canonical ones [DONE]. There's different behaviour on
  backreference/octal disambiguation. Find out what the runtime behavioural
  difference is for cyclic backreferences eg (?(1)abc\1) - this is only briefly 
  mentioned in the spec. I couldn't find much on this in the ECMAScript
  specification either.

* Octal/backreference parsing needs a big fix. The rules are ridiculously 
complex.

* Add a check in QuickSearch for single character substrings. This is likely to
  be a common case. There's no need to go through a shift table. Also, have a
  look at just computing a relevant subset of the shift table and using an
  (offset, size) pair to help test inclusion. Characters not in the table get
  the default len + 1 shift.

* Improve the perl test suite. Run under MS runtime to generate checksums for
  each trial. Checksums should incorporate: all captures (index, length) for all
  groups; names of explicit capturing groups, and the numbers they map to. Any
  other state? RegexTrial.Execute() will then compare result and checksum.

* The pattern (?(1?)a|b). It should fail: Perl fails, the MS implementation
  fails, but I pass. The documentation says that the construct (?(X)...) can be
  processed in two ways. If X is a valid group number, or a valid group name,
  then the expression becomes a capture conditional - the (...) part is
  executed only if X has been captured. If X is not a group number or name, then
  it is treated as a positive lookahead., and (...) is only executed if the
  lookahead succeeds. My code does the latter, but on further investigation it
  appears that both Perl and MS fail to recognize an expression assertion if the
  first character of the assertion is a number - which instead suggests a
  capture conditional. The exception raised is something like "invalid group
  number". I get the feeling the my behaviour seems more correct, but it's not
  consistent with the other implementations, so it should probably be changed.

--- NEW FILE ---
This code was originally sourced from the Mono C# library. It is 
available under the X11 license




reply via email to

[Prev in Thread] Current Thread [Next in Thread]