
Diff of /code/trunk/doc/html/pcrepattern.html

revision 453 by ph10, Fri Sep 18 19:12:35 2009 UTC revision 454 by ph10, Tue Sep 22 09:42:11 2009 UTC
# Line 334  a number enclosed either in angle bracke Line 334  a number enclosed either in angle bracke
334  syntax for referencing a subpattern as a "subroutine". Details are discussed  syntax for referencing a subpattern as a "subroutine". Details are discussed
335  <a href="#onigurumasubroutines">later.</a>  <a href="#onigurumasubroutines">later.</a>
336  Note that \g{...} (Perl syntax) and \g&#60;...&#62; (Oniguruma syntax) are <i>not</i>  Note that \g{...} (Perl syntax) and \g&#60;...&#62; (Oniguruma syntax) are <i>not</i>
337  synonymous. The former is a back reference; the latter is a subroutine call.  synonymous. The former is a back reference; the latter is a
338    <a href="#subpatternsassubroutines">subroutine</a>
339    call.
340  </P>  </P>
341  <br><b>  <br><b>
342  Generic character types  Generic character types
# Line 1662  is permitted, but Line 1664  is permitted, but
1664  </pre>  </pre>
1665  causes an error at compile time. Branches that match different length strings  causes an error at compile time. Branches that match different length strings
1666  are permitted only at the top level of a lookbehind assertion. This is an  are permitted only at the top level of a lookbehind assertion. This is an
1667  extension compared with Perl (at least for 5.8), which requires all branches to  extension compared with Perl (5.8 and 5.10), which requires all branches to
1668  match the same length of string. An assertion such as  match the same length of string. An assertion such as
1669  <pre>  <pre>
1670    (?&#60;=ab(c|de))    (?&#60;=ab(c|de))
1671  </pre>  </pre>
1672  is not permitted, because its single top-level branch can match two different  is not permitted, because its single top-level branch can match two different
1673  lengths, but it is acceptable if rewritten to use two top-level branches:  lengths, but it is acceptable to PCRE if rewritten to use two top-level
1674    branches:
1675  <pre>  <pre>
1676    (?&#60;=abc|abde)    (?&#60;=abc|abde)
1677  </pre>  </pre>
1678  In some cases, the Perl 5.10 escape sequence \K  In some cases, the Perl 5.10 escape sequence \K
1679  <a href="#resetmatchstart">(see above)</a>  <a href="#resetmatchstart">(see above)</a>
1680  can be used instead of a lookbehind assertion; this is not restricted to a  can be used instead of a lookbehind assertion to get round the fixed-length
1681  fixed-length.  restriction.
1682  </P>  </P>
1683  <P>  <P>
1684  The implementation of lookbehind assertions is, for each alternative, to  The implementation of lookbehind assertions is, for each alternative, to
# Line 1690  the length of the lookbehind. The \X and Line 1693  the length of the lookbehind. The \X and
1693  different numbers of bytes, are also not permitted.  different numbers of bytes, are also not permitted.
1694  </P>  </P>
1695  <P>  <P>
1696    <a href="#subpatternsassubroutines">"Subroutine"</a>
1697    calls (see below) such as (?2) or (?&X) are permitted in lookbehinds, as long
1698    as the subpattern matches a fixed-length string.
1699    <a href="#recursion">Recursion,</a>
1700    however, is not supported.
1701    </P>
1702    <P>
1703  Possessive quantifiers can be used in conjunction with lookbehind assertions to  Possessive quantifiers can be used in conjunction with lookbehind assertions to
1704  specify efficient matching at the end of the subject string. Consider a simple  specify efficient matching at the end of the subject string. Consider a simple
1705  pattern such as  pattern such as
# Line 1841  number or name is given. This condition Line 1851  number or name is given. This condition
1851  stack.  stack.
1852  </P>  </P>
1853  <P>  <P>
1854  At "top level", all these recursion test conditions are false. Recursive  At "top level", all these recursion test conditions are false.
1855  patterns are described below.  <a href="#recursion">Recursive patterns</a>
1856    are described below.
1857  </P>  </P>
1858  <br><b>  <br><b>
1859  Defining subpatterns for use by reference only  Defining subpatterns for use by reference only
# Line 1852  If the condition is the string (DEFINE), Line 1863  If the condition is the string (DEFINE),
1863  name DEFINE, the condition is always false. In this case, there may be only one  name DEFINE, the condition is always false. In this case, there may be only one
1864  alternative in the subpattern. It is always skipped if control reaches this  alternative in the subpattern. It is always skipped if control reaches this
1865  point in the pattern; the idea of DEFINE is that it can be used to define  point in the pattern; the idea of DEFINE is that it can be used to define
1866  "subroutines" that can be referenced from elsewhere. (The use of "subroutines"  "subroutines" that can be referenced from elsewhere. (The use of
1867    <a href="#subpatternsassubroutines">"subroutines"</a>
1868  is described below.) For example, a pattern to match an IPv4 address could be  is described below.) For example, a pattern to match an IPv4 address could be
1869  written like this (ignore whitespace and line breaks):  written like this (ignore whitespace and line breaks):
1870  <pre>  <pre>
# Line 1927  this kind of recursion was subsequently Line 1939  this kind of recursion was subsequently
1939  <P>  <P>
1940  A special item that consists of (? followed by a number greater than zero and a  A special item that consists of (? followed by a number greater than zero and a
1941  closing parenthesis is a recursive call of the subpattern of the given number,  closing parenthesis is a recursive call of the subpattern of the given number,
1942  provided that it occurs inside that subpattern. (If not, it is a "subroutine"  provided that it occurs inside that subpattern. (If not, it is a
1943    <a href="#subpatternsassubroutines">"subroutine"</a>
1944  call, which is described in the next section.) The special item (?R) or (?0) is  call, which is described in the next section.) The special item (?R) or (?0) is
1945  a recursive call of the entire regular expression.  a recursive call of the entire regular expression.
1946  </P>  </P>
# Line 1963  it is encountered. Line 1976  it is encountered.
1976  It is also possible to refer to subsequently opened parentheses, by writing  It is also possible to refer to subsequently opened parentheses, by writing
1977  references such as (?+2). However, these cannot be recursive because the  references such as (?+2). However, these cannot be recursive because the
1978  reference is not inside the parentheses that are referenced. They are always  reference is not inside the parentheses that are referenced. They are always
1979  "subroutine" calls, as described in the next section.  <a href="#subpatternsassubroutines">"subroutine"</a>
1980    calls, as described in the next section.
1981  </P>  </P>
1982  <P>  <P>
1983  An alternative approach is to use named parentheses instead. The Perl syntax  An alternative approach is to use named parentheses instead. The Perl syntax
# Line 2318  Cambridge CB2 3QH, England. Line 2332  Cambridge CB2 3QH, England.
2332  </P>  </P>
2333  <br><a name="SEC28" href="#TOC1">REVISION</a><br>  <br><a name="SEC28" href="#TOC1">REVISION</a><br>
2334  <P>  <P>
2335  Last updated: 18 September 2009  Last updated: 22 September 2009
2336  <br>  <br>
2337  Copyright &copy; 1997-2009 University of Cambridge.  Copyright &copy; 1997-2009 University of Cambridge.
2338  <br>  <br>

