Diff of /code/trunk/doc/pcrepattern.3

revision 630 by ph10, Fri Jul 22 10:00:10 2011 UTC revision 678 by ph10, Sun Aug 28 15:23:03 2011 UTC
# Line 32  Starting a pattern with this sequence is Line 32  Starting a pattern with this sequence is
32  option. This feature is not Perl-compatible. How setting UTF-8 mode affects  option. This feature is not Perl-compatible. How setting UTF-8 mode affects
33  pattern matching is mentioned in several places below. There is also a summary  pattern matching is mentioned in several places below. There is also a summary
34  of UTF-8 features in the  of UTF-8 features in the
 .\" HTML <a href="pcre.html#utf8support">  
 .\" </a>  
 section on UTF-8 support  
 .\"  
 in the main  
35  .\" HREF  .\" HREF
36  \fBpcre\fP  \fBpcreunicode\fP
37  .\"  .\"
38  page.  page.
39  .P  .P
# Line 220  Perl, $ and @ cause variable interpolati Line 215  Perl, $ and @ cause variable interpolati
215    \eQabc\eE\e$\eQxyz\eE   abc$xyz        abc$xyz    \eQabc\eE\e$\eQxyz\eE   abc$xyz        abc$xyz
216  .sp  .sp
217  The \eQ...\eE sequence is recognized both inside and outside character classes.  The \eQ...\eE sequence is recognized both inside and outside character classes.
218  An isolated \eE that is not preceded by \eQ is ignored. If \eQ is not followed  An isolated \eE that is not preceded by \eQ is ignored. If \eQ is not followed
219  by \eE later in the pattern, the literal interpretation continues to the end of  by \eE later in the pattern, the literal interpretation continues to the end of
220  the pattern (that is, \eE is assumed at the end). If the isolated \eQ is inside  the pattern (that is, \eE is assumed at the end). If the isolated \eQ is inside
221  a character class, this causes an error, because the character class is not  a character class, this causes an error, because the character class is not
222  terminated.  terminated.
# Line 757  Characters with the "mark" property are Line 752  Characters with the "mark" property are
752  preceding character. None of them have codepoints less than 256, so in  preceding character. None of them have codepoints less than 256, so in
753  non-UTF-8 mode \eX matches any one character.  non-UTF-8 mode \eX matches any one character.
754  .P  .P
755  Note that recent versions of Perl have changed \eX to match what Unicode calls  Note that recent versions of Perl have changed \eX to match what Unicode calls
756  an "extended grapheme cluster", which has a more complicated definition.  an "extended grapheme cluster", which has a more complicated definition.
757  .P  .P
758  Matching characters by Unicode property is not fast, because PCRE has to search  Matching characters by Unicode property is not fast, because PCRE has to search
# Line 1438  items: Line 1433  items:
1433    an escape such as \ed or \epL that matches a single character    an escape such as \ed or \epL that matches a single character
1434    a character class    a character class
1435    a back reference (see next section)    a back reference (see next section)
1436    a parenthesized subpattern (unless it is an assertion)    a parenthesized subpattern (including assertions)
1437    a recursive or "subroutine" call to a subpattern    a recursive or "subroutine" call to a subpattern
1438  .sp  .sp
1439  The general repetition quantifier specifies a minimum and maximum number of  The general repetition quantifier specifies a minimum and maximum number of
# Line 1829  those that look ahead of the current pos Line 1824  those that look ahead of the current pos
1824  that look behind it. An assertion subpattern is matched in the normal way,  that look behind it. An assertion subpattern is matched in the normal way,
1825  except that it does not cause the current matching position to be changed.  except that it does not cause the current matching position to be changed.
1826  .P  .P
1827  Assertion subpatterns are not capturing subpatterns, and may not be repeated,  Assertion subpatterns are not capturing subpatterns. If such an assertion
1828  because it makes no sense to assert the same thing several times. If any kind  contains capturing subpatterns within it, these are counted for the purposes of
1829  of assertion contains capturing subpatterns within it, these are counted for  numbering the capturing subpatterns in the whole pattern. However, substring
1830  the purposes of numbering the capturing subpatterns in the whole pattern.  capturing is carried out only for positive assertions, because it does not make
1831  However, substring capturing is carried out only for positive assertions,  sense for negative assertions.
1832  because it does not make sense for negative assertions.  .P
1833    For compatibility with Perl, assertion subpatterns may be repeated; though
1834    it makes no sense to assert the same thing several times, the side effect of
1835    capturing parentheses may occasionally be useful. In practice, there are only three
1836    cases:
1837    .sp
1838    (1) If the quantifier is {0}, the assertion is never obeyed during matching.
1839    However, it may contain internal capturing parenthesized groups that are called
1840    from elsewhere via the
1841    .\" HTML <a href="#subpatternsassubroutines">
1842    .\" </a>
1843    subroutine mechanism.
1844    .\"
1845    .sp
1846    (2) If the quantifier is {0,n} where n is greater than zero, it is treated as if it
1847    were {0,1}. At run time, the rest of the pattern match is tried with and
1848    without the assertion, the order depending on the greediness of the quantifier.
1849    .sp
1850    (3) If the minimum repetition is greater than zero, the quantifier is ignored.
1851    The assertion is obeyed just once when encountered during matching.
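The capturing rule stated in this hunk (groups inside an assertion are numbered along with the rest of the pattern, but are only set by positive assertions) can be sketched as follows. This is a minimal illustration using Python's re module, which is not PCRE; it is used here only on the assumption that it shares this Perl-style behaviour for unquantified assertions. The patterns and the subject string are made up for the example, and the three quantifier cases listed above are deliberately not exercised, since other engines may treat repeated assertions differently.

```python
import re

SUBJECT = "user@example.org"  # hypothetical subject string, for illustration only

# Positive lookahead: the group inside the assertion is numbered 1 and is set,
# even though the assertion itself does not move the matching position.
pos = re.match(r"(?=(\w+)@)(\w+)", SUBJECT)
positive_groups = pos.groups()

# Negative lookahead: the inner group still takes number 1 in the pattern,
# but it is never set, because the assertion succeeds only when its
# contents fail to match.
neg = re.match(r"(?!(\d))(\w+)", SUBJECT)
negative_groups = neg.groups()

print(positive_groups)   # ('user', 'user')
print(negative_groups)   # (None, 'user')
print(neg.re.groups)     # 2: the group inside the assertion is still counted
```

In both patterns the second group is number 2, which shows that the group inside the assertion occupies number 1 whether or not it ever captures anything.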
1852  .  .
1853  .  .
1854  .SS "Lookahead assertions"  .SS "Lookahead assertions"
# Line 2586  indicates which of the two alternatives Line 2600  indicates which of the two alternatives
2600  of obtaining this information than putting each alternative in its own  of obtaining this information than putting each alternative in its own
2601  capturing parentheses.  capturing parentheses.
2602  .P  .P
2603  If (*MARK) is encountered in a positive assertion, its name is recorded and  If (*MARK) is encountered in a positive assertion, its name is recorded and
2604  passed back if it is the last-encountered. This does not happen for negative  passed back if it is the last-encountered. This does not happen for negative
2605  assertions.  assertions.
2606  .P  .P
2607  A name may also be returned after a failed match if the final path through the  A name may also be returned after a failed match if the final path through the
# Line 2761  Cambridge CB2 3QH, England. Line 2775  Cambridge CB2 3QH, England.
2775  .rs  .rs
2776  .sp  .sp
2777  .nf  .nf
2778  Last updated: 22 July 2011  Last updated: 24 August 2011
2779  Copyright (c) 1997-2011 University of Cambridge.  Copyright (c) 1997-2011 University of Cambridge.
2780  .fi  .fi