Diff of /code/trunk/doc/html/pcrepattern.html

revision 211 by ph10, Thu Aug 9 09:52:43 2007 UTC revision 231 by ph10, Tue Sep 11 11:15:33 2007 UTC
# Line 14  man page, in case the conversion went wr Line 14  man page, in case the conversion went wr
14  <br>  <br>
15  <ul>  <ul>
16  <li><a name="TOC1" href="#SEC1">PCRE REGULAR EXPRESSION DETAILS</a>  <li><a name="TOC1" href="#SEC1">PCRE REGULAR EXPRESSION DETAILS</a>
17  <li><a name="TOC2" href="#SEC2">CHARACTERS AND METACHARACTERS</a>  <li><a name="TOC2" href="#SEC2">NEWLINE CONVENTIONS</a>
18  <li><a name="TOC3" href="#SEC3">BACKSLASH</a>  <li><a name="TOC3" href="#SEC3">CHARACTERS AND METACHARACTERS</a>
19  <li><a name="TOC4" href="#SEC4">CIRCUMFLEX AND DOLLAR</a>  <li><a name="TOC4" href="#SEC4">BACKSLASH</a>
20  <li><a name="TOC5" href="#SEC5">FULL STOP (PERIOD, DOT)</a>  <li><a name="TOC5" href="#SEC5">CIRCUMFLEX AND DOLLAR</a>
21  <li><a name="TOC6" href="#SEC6">MATCHING A SINGLE BYTE</a>  <li><a name="TOC6" href="#SEC6">FULL STOP (PERIOD, DOT)</a>
22  <li><a name="TOC7" href="#SEC7">SQUARE BRACKETS AND CHARACTER CLASSES</a>  <li><a name="TOC7" href="#SEC7">MATCHING A SINGLE BYTE</a>
23  <li><a name="TOC8" href="#SEC8">POSIX CHARACTER CLASSES</a>  <li><a name="TOC8" href="#SEC8">SQUARE BRACKETS AND CHARACTER CLASSES</a>
24  <li><a name="TOC9" href="#SEC9">VERTICAL BAR</a>  <li><a name="TOC9" href="#SEC9">POSIX CHARACTER CLASSES</a>
25  <li><a name="TOC10" href="#SEC10">INTERNAL OPTION SETTING</a>  <li><a name="TOC10" href="#SEC10">VERTICAL BAR</a>
26  <li><a name="TOC11" href="#SEC11">SUBPATTERNS</a>  <li><a name="TOC11" href="#SEC11">INTERNAL OPTION SETTING</a>
27  <li><a name="TOC12" href="#SEC12">DUPLICATE SUBPATTERN NUMBERS</a>  <li><a name="TOC12" href="#SEC12">SUBPATTERNS</a>
28  <li><a name="TOC13" href="#SEC13">NAMED SUBPATTERNS</a>  <li><a name="TOC13" href="#SEC13">DUPLICATE SUBPATTERN NUMBERS</a>
29  <li><a name="TOC14" href="#SEC14">REPETITION</a>  <li><a name="TOC14" href="#SEC14">NAMED SUBPATTERNS</a>
30  <li><a name="TOC15" href="#SEC15">ATOMIC GROUPING AND POSSESSIVE QUANTIFIERS</a>  <li><a name="TOC15" href="#SEC15">REPETITION</a>
31  <li><a name="TOC16" href="#SEC16">BACK REFERENCES</a>  <li><a name="TOC16" href="#SEC16">ATOMIC GROUPING AND POSSESSIVE QUANTIFIERS</a>
32  <li><a name="TOC17" href="#SEC17">ASSERTIONS</a>  <li><a name="TOC17" href="#SEC17">BACK REFERENCES</a>
33  <li><a name="TOC18" href="#SEC18">CONDITIONAL SUBPATTERNS</a>  <li><a name="TOC18" href="#SEC18">ASSERTIONS</a>
34  <li><a name="TOC19" href="#SEC19">COMMENTS</a>  <li><a name="TOC19" href="#SEC19">CONDITIONAL SUBPATTERNS</a>
35  <li><a name="TOC20" href="#SEC20">RECURSIVE PATTERNS</a>  <li><a name="TOC20" href="#SEC20">COMMENTS</a>
36  <li><a name="TOC21" href="#SEC21">SUBPATTERNS AS SUBROUTINES</a>  <li><a name="TOC21" href="#SEC21">RECURSIVE PATTERNS</a>
37  <li><a name="TOC22" href="#SEC22">CALLOUTS</a>  <li><a name="TOC22" href="#SEC22">SUBPATTERNS AS SUBROUTINES</a>
38  <li><a name="TOC23" href="#SEC23">BACKTRACKING CONTROL</a>  <li><a name="TOC23" href="#SEC23">CALLOUTS</a>
39  <li><a name="TOC24" href="#SEC24">SEE ALSO</a>  <li><a name="TOC24" href="#SEC24">BACKTRACKING CONTROL</a>
40  <li><a name="TOC25" href="#SEC25">AUTHOR</a>  <li><a name="TOC25" href="#SEC25">SEE ALSO</a>
41  <li><a name="TOC26" href="#SEC26">REVISION</a>  <li><a name="TOC26" href="#SEC26">AUTHOR</a>
42    <li><a name="TOC27" href="#SEC27">REVISION</a>
43  </ul>  </ul>
44  <br><a name="SEC1" href="#TOC1">PCRE REGULAR EXPRESSION DETAILS</a><br>  <br><a name="SEC1" href="#TOC1">PCRE REGULAR EXPRESSION DETAILS</a><br>
45  <P>  <P>
# Line 74  discussed in the Line 75  discussed in the
75  <a href="pcrematching.html"><b>pcrematching</b></a>  <a href="pcrematching.html"><b>pcrematching</b></a>
76  page.  page.
77  </P>  </P>
78  <br><a name="SEC2" href="#TOC1">CHARACTERS AND METACHARACTERS</a><br>  <br><a name="SEC2" href="#TOC1">NEWLINE CONVENTIONS</a><br>
79    <P>
80    PCRE supports five different conventions for indicating line breaks in
81    strings: a single CR (carriage return) character, a single LF (linefeed)
82    character, the two-character sequence CRLF, any of the three preceding, or any
83    Unicode newline sequence. The
84    <a href="pcreapi.html"><b>pcreapi</b></a>
85    page has
86    <a href="pcreapi.html#newlines">further discussion</a>
87    about newlines, and shows how to set the newline convention in the
88    <i>options</i> arguments for the compiling and matching functions.
89    </P>
90    <P>
91    It is also possible to specify a newline convention by starting a pattern
92    string with one of the following five sequences:
93    <pre>
94      (*CR)        carriage return
95      (*LF)        linefeed
96      (*CRLF)      carriage return, followed by linefeed
97      (*ANYCRLF)   any of the three above
98      (*ANY)       all Unicode newline sequences
99    </pre>
100    These override the default and the options given to <b>pcre_compile()</b>. For
101    example, on a Unix system where LF is the default newline sequence, the pattern
102    <pre>
103      (*CR)a.b
104    </pre>
105    changes the convention to CR. That pattern matches "a\nb" because LF is no
106    longer a newline. Note that these special settings, which are not
107    Perl-compatible, are recognized only at the very start of a pattern, and that
108    they must be in upper case. If more than one of them is present, the last one
109    is used.
110    </P>
111    <P>
112    The newline convention does not affect what the \R escape sequence matches. By
113    default, this is any Unicode newline sequence, for Perl compatibility. However,
114    this can be changed; see the description of \R in the section entitled
115    <a href="#newlineseq">"Newline sequences"</a>
116    below.
117    </P>
118    <br><a name="SEC3" href="#TOC1">CHARACTERS AND METACHARACTERS</a><br>
119  <P>  <P>
120  A regular expression is a pattern that is matched against a subject string from  A regular expression is a pattern that is matched against a subject string from
121  left to right. Most characters stand for themselves in a pattern, and match the  left to right. Most characters stand for themselves in a pattern, and match the
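Note on the hunk above: the NEWLINE CONVENTIONS text added in v.231 says the prefixes are recognized only at the very start of the pattern, must be in upper case, and that the last one wins when several are present. The sketch below illustrates those three rules in Python. It is not PCRE's implementation: `split_newline_prefix` and `emulate_dot` are hypothetical helper names, the dot emulation covers only the single-character CR and LF conventions, and the naive `.` substitution is only good enough for the `(*CR)a.b` example from the text.

```python
import re

# The five prefix names listed in the added documentation (upper case only).
_VERB = re.compile(r"\(\*(CR|LF|CRLF|ANYCRLF|ANY)\)")

def split_newline_prefix(pattern, default="LF"):
    """Peel leading newline-convention prefixes off a pattern string.

    Returns (convention, remaining_pattern). Hypothetical helper: it only
    mirrors the rules stated in the text (start of pattern, last one wins).
    """
    convention = default
    pos = 0
    while True:
        m = _VERB.match(pattern, pos)  # anchored at pos, so start-only
        if m is None:
            break
        convention = m.group(1)        # a later prefix replaces an earlier one
        pos = m.end()
    return convention, pattern[pos:]

def emulate_dot(pattern, default="LF"):
    """Toy emulation of dot under the CR or LF convention only."""
    convention, rest = split_newline_prefix(pattern, default)
    if convention == "CR":
        dot = r"[^\r]"
    elif convention == "LF":
        dot = r"[^\n]"
    else:
        raise NotImplementedError("multi-character conventions not sketched")
    # Assumption: the pattern has no escaped dots or character classes.
    return re.compile(rest.replace(".", dot))

# With LF as the default, "a.b" does not match "a\nb"; with (*CR) it does.
print(emulate_dot("a.b").search("a\nb") is None)
print(emulate_dot("(*CR)a.b").search("a\nb") is not None)
```

A lower-case `(*cr)` or a prefix that is not at the start is left in the pattern untouched, which matches the "very start" and "upper case" restrictions described in the added paragraph.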
# Line 131  a character class the only metacharacter Line 172  a character class the only metacharacter
172  </pre>  </pre>
173  The following sections describe the use of each of the metacharacters.  The following sections describe the use of each of the metacharacters.
174  </P>  </P>
175  <br><a name="SEC3" href="#TOC1">BACKSLASH</a><br>  <br><a name="SEC4" href="#TOC1">BACKSLASH</a><br>
176  <P>  <P>
177  The backslash character has several uses. Firstly, if it is followed by a  The backslash character has several uses. Firstly, if it is followed by a
178  non-alphanumeric character, it takes away any special meaning that character  non-alphanumeric character, it takes away any special meaning that character
# Line 180  represents: Line 221  represents:
221    \cx       "control-x", where x is any character    \cx       "control-x", where x is any character
222    \e        escape (hex 1B)    \e        escape (hex 1B)
223    \f        formfeed (hex 0C)    \f        formfeed (hex 0C)
224    \n        newline (hex 0A)    \n        linefeed (hex 0A)
225    \r        carriage return (hex 0D)    \r        carriage return (hex 0D)
226    \t        tab (hex 09)    \t        tab (hex 09)
227    \ddd      character with octal code ddd, or backreference    \ddd      character with octal code ddd, or backreference
# Line 358  page). For example, in a French locale s Line 399  page). For example, in a French locale s
399  or "french" in Windows, some character codes greater than 128 are used for  or "french" in Windows, some character codes greater than 128 are used for
400  accented letters, and these are matched by \w. The use of locales with Unicode  accented letters, and these are matched by \w. The use of locales with Unicode
401  is discouraged.  is discouraged.
402  </P>  <a name="newlineseq"></a></P>
403  <br><b>  <br><b>
404  Newline sequences  Newline sequences
405  </b><br>  </b><br>
406  <P>  <P>
407  Outside a character class, the escape sequence \R matches any Unicode newline  Outside a character class, by default, the escape sequence \R matches any
408  sequence. This is a Perl 5.10 feature. In non-UTF-8 mode \R is equivalent to  Unicode newline sequence. This is a Perl 5.10 feature. In non-UTF-8 mode \R is
409  the following:  equivalent to the following:
410  <pre>  <pre>
411    (?&#62;\r\n|\n|\x0b|\f|\r|\x85)    (?&#62;\r\n|\n|\x0b|\f|\r|\x85)
412  </pre>  </pre>
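Note on the hunk above: the non-UTF-8 equivalent of \R shown in the text can be approximated in other regex engines. The sketch below uses Python's `re` module and is only an approximation. It swaps the atomic group `(?>...)` for a plain non-capturing group so that it runs on older Python versions. The difference is that an atomic group never gives back the LF of a CRLF pair on backtracking, whereas this version could. `NEWLINE_SEQ` is a name chosen for illustration.

```python
import re

# Approximation of the documented non-UTF-8 \R set. "\r\n" is listed first so
# that a CRLF pair is consumed as one sequence rather than as CR then LF.
NEWLINE_SEQ = re.compile(r"(?:\r\n|\n|\x0b|\f|\r|\x85)")

text = "one\r\ntwo\nthree\x85four\rfive"
print(NEWLINE_SEQ.split(text))  # ['one', 'two', 'three', 'four', 'five']
```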
# Line 384  Unicode character property support is no Line 425  Unicode character property support is no
425  recognized.  recognized.
426  </P>  </P>
427  <P>  <P>
428    It is possible to restrict \R to match only CR, LF, or CRLF (instead of the
429    complete set of Unicode line endings) by setting the option PCRE_BSR_ANYCRLF
430    either at compile time or when the pattern is matched. This can be made the
431    default when PCRE is built; if this is the case, the other behaviour can be
432    requested via the PCRE_BSR_UNICODE option. It is also possible to specify these
433    settings by starting a pattern string with one of the following sequences:
434    <pre>
435      (*BSR_ANYCRLF)   CR, LF, or CRLF only
436      (*BSR_UNICODE)   any Unicode newline sequence
437    </pre>
438    These override the default and the options given to <b>pcre_compile()</b>, but
439    they can be overridden by options given to <b>pcre_exec()</b>. Note that these
440    special settings, which are not Perl-compatible, are recognized only at the
441    very start of a pattern, and that they must be in upper case. If more than one
442    of them is present, the last one is used.
443    </P>
444    <P>
445  Inside a character class, \R matches the letter "R".  Inside a character class, \R matches the letter "R".
446  <a name="uniextseq"></a></P>  <a name="uniextseq"></a></P>
447  <br><b>  <br><b>
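Note on the hunk above: the added paragraph describes a precedence order for the \R setting. The build-time default is overridden by options given to pcre_compile(), those are overridden by a (*BSR_...) prefix at the start of the pattern, and that in turn can be overridden by options given to pcre_exec(). The sketch below models only that ordering. `resolve_bsr` and `BSR_PATTERNS` are hypothetical names, the strings "ANYCRLF" and "UNICODE" stand in for the real option bits, and the regexes are Python approximations of the two behaviours in non-UTF-8 mode.

```python
import re

# Python approximations of what \R matches under each setting (non-UTF-8 mode).
BSR_PATTERNS = {
    "ANYCRLF": r"(?:\r\n|\n|\r)",
    "UNICODE": r"(?:\r\n|\n|\x0b|\f|\r|\x85)",
}

def resolve_bsr(build_default, compile_option=None,
                pattern_verb=None, exec_option=None):
    """Return the effective setting; each later source overrides earlier ones."""
    setting = build_default
    for source in (compile_option, pattern_verb, exec_option):
        if source is not None:
            setting = source
    return setting

# A (*BSR_ANYCRLF) prefix beats the compile-time option ...
effective = resolve_bsr("UNICODE", compile_option="UNICODE",
                        pattern_verb="ANYCRLF")
print(effective)
# ... so NEL (hex 85) is no longer matched by the emulated \R.
print(re.fullmatch(BSR_PATTERNS[effective], "\x85") is None)
```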
# Line 675  If all the alternatives of a pattern beg Line 733  If all the alternatives of a pattern beg
733  to the starting match position, and the "anchored" flag is set in the compiled  to the starting match position, and the "anchored" flag is set in the compiled
734  regular expression.  regular expression.
735  </P>  </P>
736  <br><a name="SEC4" href="#TOC1">CIRCUMFLEX AND DOLLAR</a><br>  <br><a name="SEC5" href="#TOC1">CIRCUMFLEX AND DOLLAR</a><br>
737  <P>  <P>
738  Outside a character class, in the default matching mode, the circumflex  Outside a character class, in the default matching mode, the circumflex
739  character is an assertion that is true only if the current matching point is  character is an assertion that is true only if the current matching point is
# Line 729  Note that the sequences \A, \Z, and \z c Line 787  Note that the sequences \A, \Z, and \z c
787  end of the subject in both modes, and if all branches of a pattern start with  end of the subject in both modes, and if all branches of a pattern start with
788  \A it is always anchored, whether or not PCRE_MULTILINE is set.  \A it is always anchored, whether or not PCRE_MULTILINE is set.
789  </P>  </P>
790  <br><a name="SEC5" href="#TOC1">FULL STOP (PERIOD, DOT)</a><br>  <br><a name="SEC6" href="#TOC1">FULL STOP (PERIOD, DOT)</a><br>
791  <P>  <P>
792  Outside a character class, a dot in the pattern matches any one character in  Outside a character class, a dot in the pattern matches any one character in
793  the subject string except (by default) a character that signifies the end of a  the subject string except (by default) a character that signifies the end of a
# Line 754  The handling of dot is entirely independ Line 812  The handling of dot is entirely independ
812  dollar, the only relationship being that they both involve newlines. Dot has no  dollar, the only relationship being that they both involve newlines. Dot has no
813  special meaning in a character class.  special meaning in a character class.
814  </P>  </P>
815  <br><a name="SEC6" href="#TOC1">MATCHING A SINGLE BYTE</a><br>  <br><a name="SEC7" href="#TOC1">MATCHING A SINGLE BYTE</a><br>
816  <P>  <P>
817  Outside a character class, the escape sequence \C matches any one byte, both  Outside a character class, the escape sequence \C matches any one byte, both
818  in and out of UTF-8 mode. Unlike a dot, it always matches any line-ending  in and out of UTF-8 mode. Unlike a dot, it always matches any line-ending
# Line 769  PCRE does not allow \C to appear in look Line 827  PCRE does not allow \C to appear in look
827  because in UTF-8 mode this would make it impossible to calculate the length of  because in UTF-8 mode this would make it impossible to calculate the length of
828  the lookbehind.  the lookbehind.
829  <a name="characterclass"></a></P>  <a name="characterclass"></a></P>
830  <br><a name="SEC7" href="#TOC1">SQUARE BRACKETS AND CHARACTER CLASSES</a><br>  <br><a name="SEC8" href="#TOC1">SQUARE BRACKETS AND CHARACTER CLASSES</a><br>
831  <P>  <P>
832  An opening square bracket introduces a character class, terminated by a closing  An opening square bracket introduces a character class, terminated by a closing
833  square bracket. A closing square bracket on its own is not special. If a  square bracket. A closing square bracket on its own is not special. If a
# Line 864  introducing a POSIX class name - see the Line 922  introducing a POSIX class name - see the
922  closing square bracket. However, escaping other non-alphanumeric characters  closing square bracket. However, escaping other non-alphanumeric characters
923  does no harm.  does no harm.
924  </P>  </P>
925  <br><a name="SEC8" href="#TOC1">POSIX CHARACTER CLASSES</a><br>  <br><a name="SEC9" href="#TOC1">POSIX CHARACTER CLASSES</a><br>
926  <P>  <P>
927  Perl supports the POSIX notation for character classes. This uses names  Perl supports the POSIX notation for character classes. This uses names
928  enclosed by [: and :] within the enclosing square brackets. PCRE also supports  enclosed by [: and :] within the enclosing square brackets. PCRE also supports
# Line 910  supported, and an error is given if they Line 968  supported, and an error is given if they
968  In UTF-8 mode, characters with values greater than 128 do not match any of  In UTF-8 mode, characters with values greater than 128 do not match any of
969  the POSIX character classes.  the POSIX character classes.
970  </P>  </P>
971  <br><a name="SEC9" href="#TOC1">VERTICAL BAR</a><br>  <br><a name="SEC10" href="#TOC1">VERTICAL BAR</a><br>
972  <P>  <P>
973  Vertical bar characters are used to separate alternative patterns. For example,  Vertical bar characters are used to separate alternative patterns. For example,
974  the pattern  the pattern
# Line 925  that succeeds is used. If the alternativ Line 983  that succeeds is used. If the alternativ
983  "succeeds" means matching the rest of the main pattern as well as the  "succeeds" means matching the rest of the main pattern as well as the
984  alternative in the subpattern.  alternative in the subpattern.
985  </P>  </P>
986  <br><a name="SEC10" href="#TOC1">INTERNAL OPTION SETTING</a><br>  <br><a name="SEC11" href="#TOC1">INTERNAL OPTION SETTING</a><br>
987  <P>  <P>
988  The settings of the PCRE_CASELESS, PCRE_MULTILINE, PCRE_DOTALL, and  The settings of the PCRE_CASELESS, PCRE_MULTILINE, PCRE_DOTALL, and
989  PCRE_EXTENDED options can be changed from within the pattern by a sequence of  PCRE_EXTENDED options can be changed from within the pattern by a sequence of
# Line 973  The PCRE-specific options PCRE_DUPNAMES, Line 1031  The PCRE-specific options PCRE_DUPNAMES,
1031  changed in the same way as the Perl-compatible options by using the characters  changed in the same way as the Perl-compatible options by using the characters
1032  J, U and X respectively.  J, U and X respectively.
1033  <a name="subpattern"></a></P>  <a name="subpattern"></a></P>
1034  <br><a name="SEC11" href="#TOC1">SUBPATTERNS</a><br>  <br><a name="SEC12" href="#TOC1">SUBPATTERNS</a><br>
1035  <P>  <P>
1036  Subpatterns are delimited by parentheses (round brackets), which can be nested.  Subpatterns are delimited by parentheses (round brackets), which can be nested.
1037  Turning part of a pattern into a subpattern does two things:  Turning part of a pattern into a subpattern does two things:
# Line 1027  from left to right, and options are not Line 1085  from left to right, and options are not
1085  is reached, an option setting in one branch does affect subsequent branches, so  is reached, an option setting in one branch does affect subsequent branches, so
1086  the above patterns match "SUNDAY" as well as "Saturday".  the above patterns match "SUNDAY" as well as "Saturday".
1087  </P>  </P>
1088  <br><a name="SEC12" href="#TOC1">DUPLICATE SUBPATTERN NUMBERS</a><br>  <br><a name="SEC13" href="#TOC1">DUPLICATE SUBPATTERN NUMBERS</a><br>
1089  <P>  <P>
1090  Perl 5.10 introduced a feature whereby each alternative in a subpattern uses  Perl 5.10 introduced a feature whereby each alternative in a subpattern uses
1091  the same numbers for its capturing parentheses. Such a subpattern starts with  the same numbers for its capturing parentheses. Such a subpattern starts with
# Line 1058  the first one in the pattern with the gi Line 1116  the first one in the pattern with the gi
1116  An alternative approach to using this "branch reset" feature is to use  An alternative approach to using this "branch reset" feature is to use
1117  duplicate named subpatterns, as described in the next section.  duplicate named subpatterns, as described in the next section.
1118  </P>  </P>
1119  <br><a name="SEC13" href="#TOC1">NAMED SUBPATTERNS</a><br>  <br><a name="SEC14" href="#TOC1">NAMED SUBPATTERNS</a><br>
1120  <P>  <P>
1121  Identifying capturing parentheses by number is simple, but it can be very hard  Identifying capturing parentheses by number is simple, but it can be very hard
1122  to keep track of the numbers in complicated regular expressions. Furthermore,  to keep track of the numbers in complicated regular expressions. Furthermore,
# Line 1113  details of the interfaces for handling n Line 1171  details of the interfaces for handling n
1171  <a href="pcreapi.html"><b>pcreapi</b></a>  <a href="pcreapi.html"><b>pcreapi</b></a>
1172  documentation.  documentation.
1173  </P>  </P>
1174  <br><a name="SEC14" href="#TOC1">REPETITION</a><br>  <br><a name="SEC15" href="#TOC1">REPETITION</a><br>
1175  <P>  <P>
1176  Repetition is specified by quantifiers, which can follow any of the following  Repetition is specified by quantifiers, which can follow any of the following
1177  items:  items:
# Line 1264  example, after Line 1322  example, after
1322  </pre>  </pre>
1323  matches "aba" the value of the second captured substring is "b".  matches "aba" the value of the second captured substring is "b".
1324  <a name="atomicgroup"></a></P>  <a name="atomicgroup"></a></P>
1325  <br><a name="SEC15" href="#TOC1">ATOMIC GROUPING AND POSSESSIVE QUANTIFIERS</a><br>  <br><a name="SEC16" href="#TOC1">ATOMIC GROUPING AND POSSESSIVE QUANTIFIERS</a><br>
1326  <P>  <P>
1327  With both maximizing ("greedy") and minimizing ("ungreedy" or "lazy")  With both maximizing ("greedy") and minimizing ("ungreedy" or "lazy")
1328  repetition, failure of what follows normally causes the repeated item to be  repetition, failure of what follows normally causes the repeated item to be
# Line 1368  an atomic group, like this: Line 1426  an atomic group, like this:
1426  </pre>  </pre>
1427  sequences of non-digits cannot be broken, and failure happens quickly.  sequences of non-digits cannot be broken, and failure happens quickly.
1428  <a name="backreferences"></a></P>  <a name="backreferences"></a></P>
1429  <br><a name="SEC16" href="#TOC1">BACK REFERENCES</a><br>  <br><a name="SEC17" href="#TOC1">BACK REFERENCES</a><br>
1430  <P>  <P>
1431  Outside a character class, a backslash followed by a digit greater than 0 (and  Outside a character class, a backslash followed by a digit greater than 0 (and
1432  possibly further digits) is a back reference to a capturing subpattern earlier  possibly further digits) is a back reference to a capturing subpattern earlier
# Line 1482  that the first iteration does not need t Line 1540  that the first iteration does not need t
1540  done using alternation, as in the example above, or by a quantifier with a  done using alternation, as in the example above, or by a quantifier with a
1541  minimum of zero.  minimum of zero.
1542  <a name="bigassertions"></a></P>  <a name="bigassertions"></a></P>
1543  <br><a name="SEC17" href="#TOC1">ASSERTIONS</a><br>  <br><a name="SEC18" href="#TOC1">ASSERTIONS</a><br>
1544  <P>  <P>
1545  An assertion is a test on the characters following or preceding the current  An assertion is a test on the characters following or preceding the current
1546  matching point that does not actually consume any characters. The simple  matching point that does not actually consume any characters. The simple
# Line 1642  preceded by "foo", while Line 1700  preceded by "foo", while
1700  is another pattern that matches "foo" preceded by three digits and any three  is another pattern that matches "foo" preceded by three digits and any three
1701  characters that are not "999".  characters that are not "999".
1702  <a name="conditions"></a></P>  <a name="conditions"></a></P>
1703  <br><a name="SEC18" href="#TOC1">CONDITIONAL SUBPATTERNS</a><br>  <br><a name="SEC19" href="#TOC1">CONDITIONAL SUBPATTERNS</a><br>
1704  <P>  <P>
1705  It is possible to cause the matching process to obey a subpattern  It is possible to cause the matching process to obey a subpattern
1706  conditionally or to choose between two alternative subpatterns, depending on  conditionally or to choose between two alternative subpatterns, depending on
# Line 1780  subject is matched against the first alt Line 1838  subject is matched against the first alt
1838  against the second. This pattern matches strings in one of the two forms  against the second. This pattern matches strings in one of the two forms
1839  dd-aaa-dd or dd-dd-dd, where aaa are letters and dd are digits.  dd-aaa-dd or dd-dd-dd, where aaa are letters and dd are digits.
1840  <a name="comments"></a></P>  <a name="comments"></a></P>
1841  <br><a name="SEC19" href="#TOC1">COMMENTS</a><br>  <br><a name="SEC20" href="#TOC1">COMMENTS</a><br>
1842  <P>  <P>
1843  The sequence (?# marks the start of a comment that continues up to the next  The sequence (?# marks the start of a comment that continues up to the next
1844  closing parenthesis. Nested parentheses are not permitted. The characters  closing parenthesis. Nested parentheses are not permitted. The characters
# Line 1791  If the PCRE_EXTENDED option is set, an u Line 1849  If the PCRE_EXTENDED option is set, an u
1849  character class introduces a comment that continues to immediately after the  character class introduces a comment that continues to immediately after the
1850  next newline in the pattern.  next newline in the pattern.
1851  <a name="recursion"></a></P>  <a name="recursion"></a></P>
1852  <br><a name="SEC20" href="#TOC1">RECURSIVE PATTERNS</a><br>  <br><a name="SEC21" href="#TOC1">RECURSIVE PATTERNS</a><br>
1853  <P>  <P>
1854  Consider the problem of matching a string in parentheses, allowing for  Consider the problem of matching a string in parentheses, allowing for
1855  unlimited nested parentheses. Without the use of recursion, the best that can  unlimited nested parentheses. Without the use of recursion, the best that can
# Line 1921  In this pattern, (?(R) is the start of a Line 1979  In this pattern, (?(R) is the start of a
1979  different alternatives for the recursive and non-recursive cases. The (?R) item  different alternatives for the recursive and non-recursive cases. The (?R) item
1980  is the actual recursive call.  is the actual recursive call.
1981  <a name="subpatternsassubroutines"></a></P>  <a name="subpatternsassubroutines"></a></P>
1982  <br><a name="SEC21" href="#TOC1">SUBPATTERNS AS SUBROUTINES</a><br>  <br><a name="SEC22" href="#TOC1">SUBPATTERNS AS SUBROUTINES</a><br>
1983  <P>  <P>
1984  If the syntax for a recursive subpattern reference (either by number or by  If the syntax for a recursive subpattern reference (either by number or by
1985  name) is used outside the parentheses to which it refers, it operates like a  name) is used outside the parentheses to which it refers, it operates like a
# Line 1961  changed for different calls. For example Line 2019  changed for different calls. For example
2019  It matches "abcabc". It does not match "abcABC" because the change of  It matches "abcabc". It does not match "abcABC" because the change of
2020  processing option does not affect the called subpattern.  processing option does not affect the called subpattern.
2021  </P>  </P>
2022  <br><a name="SEC22" href="#TOC1">CALLOUTS</a><br>  <br><a name="SEC23" href="#TOC1">CALLOUTS</a><br>
2023  <P>  <P>
2024  Perl has a feature whereby using the sequence (?{...}) causes arbitrary Perl  Perl has a feature whereby using the sequence (?{...}) causes arbitrary Perl
2025  code to be obeyed in the middle of matching a regular expression. This makes it  code to be obeyed in the middle of matching a regular expression. This makes it
# Line 1996  description of the interface to the call Line 2054  description of the interface to the call
2054  <a href="pcrecallout.html"><b>pcrecallout</b></a>  <a href="pcrecallout.html"><b>pcrecallout</b></a>
2055  documentation.  documentation.
2056  </P>  </P>
2057  <br><a name="SEC23" href="#TOC1">BACKTRACKING CONTROL</a><br>  <br><a name="SEC24" href="#TOC1">BACKTRACKING CONTROL</a><br>
2058  <P>  <P>
2059  Perl 5.10 introduced a number of "Special Backtracking Control Verbs", which  Perl 5.10 introduced a number of "Special Backtracking Control Verbs", which
2060  are described in the Perl documentation as "experimental and subject to change  are described in the Perl documentation as "experimental and subject to change
# Line 2111  the end of the group if FOO succeeds); o Line 2169  the end of the group if FOO succeeds); o
2169  second alternative and tries COND2, without backtracking into COND1. If (*THEN)  second alternative and tries COND2, without backtracking into COND1. If (*THEN)
2170  is used outside of any alternation, it acts exactly like (*PRUNE).  is used outside of any alternation, it acts exactly like (*PRUNE).
2171  </P>  </P>
2172  <br><a name="SEC24" href="#TOC1">SEE ALSO</a><br>  <br><a name="SEC25" href="#TOC1">SEE ALSO</a><br>
2173  <P>  <P>
2174  <b>pcreapi</b>(3), <b>pcrecallout</b>(3), <b>pcrematching</b>(3), <b>pcre</b>(3).  <b>pcreapi</b>(3), <b>pcrecallout</b>(3), <b>pcrematching</b>(3), <b>pcre</b>(3).
2175  </P>  </P>
2176  <br><a name="SEC25" href="#TOC1">AUTHOR</a><br>  <br><a name="SEC26" href="#TOC1">AUTHOR</a><br>
2177  <P>  <P>
2178  Philip Hazel  Philip Hazel
2179  <br>  <br>
# Line 2124  University Computing Service Line 2182  University Computing Service
2182  Cambridge CB2 3QH, England.  Cambridge CB2 3QH, England.
2183  <br>  <br>
2184  </P>  </P>
2185  <br><a name="SEC26" href="#TOC1">REVISION</a><br>  <br><a name="SEC27" href="#TOC1">REVISION</a><br>
2186  <P>  <P>
2187  Last updated: 09 August 2007  Last updated: 11 September 2007
2188  <br>  <br>
2189  Copyright &copy; 1997-2007 University of Cambridge.  Copyright &copy; 1997-2007 University of Cambridge.
2190  <br>  <br>