Regex

This module is the RegCom module from the XDS compiler. Copyright (c) 1993 xTech Ltd, Russia. All Rights Reserved. Modified for inclusion in library.

A regular expression is a string which may contain certain special symbols:

  • “*” - an arbitrary sequence of any characters, possibly empty.

  • “?” - any single character.

  • “[…]” - one of the listed characters.

  • “{…}” - an arbitrary sequence of the listed characters, possibly empty.

  • “\d” - shorthand for one character in the digit range [0-9].

  • “\D” - shorthand for zero or more characters in the digit range [0-9].

  • “\s” - shorthand for one of the whitespace characters SPC, TAB, VTAB, LF and CR.

  • “\S” - shorthand for zero or more of the whitespace characters.

  • “\w” - shorthand for one character in the alphanumeric ranges [0-9], [a-z] and [A-Z].

  • “\W” - shorthand for zero or more characters in the alphanumeric ranges.

  • “\nnn” - the ASCII character with octal code nnn, where each n is in [0-7].

  • “&” - the logical operation AND.

  • “|” - the logical operation OR.

  • “^” - the logical operation NOT.

  • “(…)” - grouping, which sets the priority of operations.

  • “$digit” - subexpression number (see below).

A sequence of the form a-b used within either [] or {} brackets denotes all characters from a to b.

$digit may follow a *, ?, [], {} or () subexpression. When a string matches the regular expression, $digit refers to the substring matched by that subexpression.

If you need to use any special symbol as an ordinary symbol, precede it with a backslash (“\”), which suppresses interpretation of the following symbol.
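As a rough illustration of the symbols described above, the sketch below compiles a few patterns and tests each against a whole string. It relies only on the Compile, FullMatch and Dispose procedures documented in this section; the module name SyntaxDemo and the helper Check are made up for the example, and the expected results are read off the descriptions above rather than taken from the implementation.

MODULE SyntaxDemo;
IMPORT Re := Regex;

    (* Hypothetical helper: compile expr, test it against the whole of s,
       and halt if the outcome differs from expected. *)
    PROCEDURE Check(expr, s: ARRAY OF CHAR; expected: BOOLEAN);
    VAR re: Re.Pattern; res: LONGINT;
    BEGIN
        Re.Compile(re, expr, res);
        ASSERT(res > 0);
        ASSERT(re.FullMatch(s) = expected);
        Re.Dispose(re)
    END Check;

BEGIN
    (* "*" stands for any sequence, "?" for exactly one character. *)
    Check("*.txt", "notes.txt", TRUE);
    Check("file?.dat", "file1.dat", TRUE);
    Check("file?.dat", "file12.dat", FALSE);
    (* "[a-c]" is one listed character, "{0-9}" a possibly empty run. *)
    Check("[a-c]{0-9}", "b42", TRUE);
    Check("[a-c]{0-9}", "b", TRUE);
    Check("[a-c]{0-9}", "d42", FALSE);
    (* A backslash turns "*" into an ordinary character. *)
    Check("a\*b", "a*b", TRUE);
    Check("a\*b", "aXb", FALSE)
END SyntaxDemo.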

Types

Pattern* = POINTER TO PatternDesc;

Procedures

Dispose

Free allocated regex data

PROCEDURE Dispose*(VAR pattern : Pattern);

Compile

Compile the regular expression expr and return status:

  • res <= 0 : error at position ABS(res) in expr;

  • res > 0 : no error.

PROCEDURE Compile*(VAR reg: Pattern; expr-: ARRAY OF CHAR; VAR res: LONGINT);
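The status convention can be wrapped in a small helper, sketched here under the assumption that only the sign of res and ABS(res) carry meaning, as stated above. TryCompile and CompileDemo are hypothetical names. The sketch does not dispose of the pattern after a failed compilation, because this section does not say what state reg is left in.

MODULE CompileDemo;
IMPORT Re := Regex;
VAR
    re: Re.Pattern;
    errPos: LONGINT;

    (* Hypothetical helper: compile expr into re. On failure errPos receives
       the position reported by Compile; on success it is set to -1. *)
    PROCEDURE TryCompile(VAR re: Re.Pattern; expr: ARRAY OF CHAR; VAR errPos: LONGINT): BOOLEAN;
    VAR res: LONGINT;
    BEGIN
        Re.Compile(re, expr, res);
        IF res > 0 THEN
            errPos := -1;
            RETURN TRUE
        END;
        errPos := ABS(res);
        RETURN FALSE
    END TryCompile;

BEGIN
    IF TryCompile(re, "[0-9]{0-9}", errPos) THEN
        (* One digit followed by a possibly empty run of digits. *)
        ASSERT(re.FullMatch("2023"));
        Re.Dispose(re)
    END
END CompileDemo.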

Pattern.Match

Returns TRUE if the expression matches string s starting at position pos.

PROCEDURE (re : Pattern) Match*(s-: ARRAY OF CHAR; pos : LONGINT): BOOLEAN;
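A minimal sketch of matching from an offset follows. It assumes that pos is a 0-based index into s, which this section does not state explicitly. The sample string is chosen so that the pattern covers everything from pos to the end of the string. MatchDemo is a hypothetical module name.

MODULE MatchDemo;
IMPORT Re := Regex;
VAR
    re: Re.Pattern;
    res: LONGINT;
BEGIN
    Re.Compile(re, "\d\d\d\d", res);
    ASSERT(res > 0);
    (* In "year 2023" the four digits start at index 5, assuming 0-based positions. *)
    ASSERT(re.Match("year 2023", 5));
    Re.Dispose(re)
END MatchDemo.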

Pattern.FullMatch

Returns TRUE if the expression matches the whole string s.

PROCEDURE (re : Pattern) FullMatch*(s-: ARRAY OF CHAR): BOOLEAN;

Pattern.MatchLength

Returns the length of the substring matched by $n during the last call of a match procedure on re.

PROCEDURE (re : Pattern) MatchLength*(n : INTEGER): LONGINT;

Pattern.MatchPos

Returns the starting position of the substring matched by $n during the last call of a match procedure on re.

PROCEDURE (re : Pattern) MatchPos*(n : INTEGER): LONGINT;
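Together, MatchPos and MatchLength are enough to copy a tagged substring out of the input. The sketch below tags the year group of a date with $1 and extracts it. CaptureDemo is a hypothetical module name, and the asserted values assume 0-based positions, so treat them as expectations derived from the descriptions above rather than confirmed output.

MODULE CaptureDemo;
IMPORT Re := Regex;
VAR
    re: Re.Pattern;
    res, pos, len, i: LONGINT;
    s, year: ARRAY 16 OF CHAR;
BEGIN
    s := "01-01-2023";
    (* $1 follows the () group, so it refers to the four-digit year. *)
    Re.Compile(re, "\d\d-\d\d-(\d\d\d\d)$1", res);
    ASSERT(res > 0);
    ASSERT(re.FullMatch(s));
    pos := re.MatchPos(1);
    len := re.MatchLength(1);
    ASSERT((pos = 6) & (len = 4));
    (* Copy the matched characters and terminate the result with 0X. *)
    FOR i := 0 TO len - 1 DO year[i] := s[pos + i] END;
    year[len] := 0X;
    ASSERT(year = "2023");
    Re.Dispose(re)
END CaptureDemo.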

Example

Regex Example

MODULE TestRegex;
IMPORT Re := Regex;
VAR
    re : Re.Pattern;
    res : LONGINT;
    ret : BOOLEAN;
    PROCEDURE Assert(b: BOOLEAN; id: LONGINT) ;
    BEGIN
        ASSERT(b);
    END Assert ;
BEGIN
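    (* $1 tags the outer group so that MatchPos and MatchLength can refer to it. *)
    Re.Compile(re, "((\d\d-\d\d-\d\d\d\d)|(\d\d/\d\d/\d\d\d\d))$1", res);
    Assert(res > 0, 1);
    ret := re.FullMatch("01-01-2023");
    Assert(ret, 2);
    (* The outer group covers the whole 10-character date. *)
    Assert(re.MatchPos(1) = 0, 3);
    Assert(re.MatchLength(1) = 10, 4);
    ret := re.FullMatch("01/01/2023");
    Assert(ret, 5);
    Re.Dispose(re);
END TestRegex.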