Diff of /code/trunk/doc/html/pcreapi.html

revision 547 by ph10, Wed Jun 16 10:51:15 2010 UTC revision 548 by ph10, Fri Jun 25 14:42:00 2010 UTC
# Line 762  out of use. To avoid confusion, they hav Line 762  out of use. To avoid confusion, they hav
762    50  [this code is not in use]    50  [this code is not in use]
763    51  octal value is greater than \377 (not in UTF-8 mode)    51  octal value is greater than \377 (not in UTF-8 mode)
764    52  internal error: overran compiling workspace    52  internal error: overran compiling workspace
765    53  internal error: previously-checked referenced subpattern not found    53  internal error: previously-checked referenced subpattern
766            not found
767    54  DEFINE group contains more than one branch    54  DEFINE group contains more than one branch
768    55  repeating a DEFINE group is not allowed    55  repeating a DEFINE group is not allowed
769    56  inconsistent NEWLINE options    56  inconsistent NEWLINE options
# Line 775  out of use. To avoid confusion, they hav Line 776  out of use. To avoid confusion, they hav
776    62  subpattern name expected    62  subpattern name expected
777    63  digit expected after (?+    63  digit expected after (?+
778    64  ] is an invalid data character in JavaScript compatibility mode    64  ] is an invalid data character in JavaScript compatibility mode
779    65  different names for subpatterns of the same number are not allowed    65  different names for subpatterns of the same number are
780            not allowed
781    66  (*MARK) must have an argument    66  (*MARK) must have an argument
782    67  this version of PCRE is not compiled with PCRE_UCP support    67  this version of PCRE is not compiled with PCRE_UCP support
783  </pre>  </pre>
# Line 844  Studying a pattern is also useful for no Line 846  Studying a pattern is also useful for no
846  single fixed starting character. A bitmap of possible starting bytes is  single fixed starting character. A bitmap of possible starting bytes is
847  created. This speeds up finding a position in the subject at which to start  created. This speeds up finding a position in the subject at which to start
848  matching.  matching.
849    </P>
850    <P>
851    The two optimizations just described can be disabled by setting the
852    PCRE_NO_START_OPTIMIZE option when calling <b>pcre_exec()</b> or
853    <b>pcre_dfa_exec()</b>. You might want to do this if your pattern contains
854    callouts, or makes use of (*MARK), and you make use of these in cases where
855    matching fails. See the discussion of PCRE_NO_START_OPTIMIZE
856    <a href="#execoptions">below.</a>
857  <a name="localesupport"></a></P>  <a name="localesupport"></a></P>
858  <br><a name="SEC10" href="#TOC1">LOCALE SUPPORT</a><br>  <br><a name="SEC10" href="#TOC1">LOCALE SUPPORT</a><br>
859  <P>  <P>
# Line 1443  unanchored match must start with a speci Line 1453  unanchored match must start with a speci
1453  for that character, and fails immediately if it cannot find it, without  for that character, and fails immediately if it cannot find it, without
1454  actually running the main matching function. This means that a special item  actually running the main matching function. This means that a special item
1455  such as (*COMMIT) at the start of a pattern is not considered until after a  such as (*COMMIT) at the start of a pattern is not considered until after a
1456  suitable starting point for the match has been found. When callouts are in use,  suitable starting point for the match has been found. When callouts or (*MARK)
1457  these "start-up" optimizations can cause them to be skipped if the pattern is  items are in use, these "start-up" optimizations can cause them to be skipped
1458  never actually used. The PCRE_NO_START_OPTIMIZE option disables the start-up  if the pattern is never actually used. The start-up optimizations are in effect
1459  optimizations, causing performance to suffer, but ensuring that the callouts do  a pre-scan of the subject that takes place before the pattern is run.
1460  occur, and that items such as (*COMMIT) are considered at every possible  </P>
1461  starting position in the subject string.  <P>
1462    The PCRE_NO_START_OPTIMIZE option disables the start-up optimizations, possibly
1463    causing performance to suffer, but ensuring that in cases where the result is
1464    "no match", the callouts do occur, and that items such as (*COMMIT) and (*MARK)
1465    are considered at every possible starting position in the subject string.
1466    Setting PCRE_NO_START_OPTIMIZE can change the outcome of a matching operation.
1467    Consider the pattern
1468    <pre>
1469      (*COMMIT)ABC
1470    </pre>
1471    When this is compiled, PCRE records the fact that a match must start with the
1472    character "A". Suppose the subject string is "DEFABC". The start-up
1473    optimization scans along the subject, finds "A" and runs the first match
1474    attempt from there. The (*COMMIT) item means that the pattern must match the
1475    current starting position, which in this case, it does. However, if the same
1476    match is run with PCRE_NO_START_OPTIMIZE set, the initial scan along the
1477    subject string does not happen. The first match attempt is run starting from
1478    "D" and when this fails, (*COMMIT) prevents any further matches being tried, so
1479    the overall result is "no match". If the pattern is studied, more start-up
1480    optimizations may be used. For example, a minimum length for the subject may be
1481    recorded. Consider the pattern
1482    <pre>
1483      (*MARK:A)(X|Y)
1484    </pre>
1485    The minimum length for a match is one character. If the subject is "ABC", there
1486    will be attempts to match "ABC", "BC", "C", and then finally an empty string.
1487    If the pattern is studied, the final attempt does not take place, because PCRE
1488    knows that the subject is too short, and so the (*MARK) is never encountered.
1489    In this case, studying the pattern does not affect the overall match result,
1490    which is still "no match", but it does affect the auxiliary information that is
1491    returned.
1492  <pre>  <pre>
1493    PCRE_NO_UTF8_CHECK    PCRE_NO_UTF8_CHECK
1494  </pre>  </pre>
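
The two examples added in the hunk above, (*COMMIT)ABC against "DEFABC" and the minimum-length check for a studied pattern against "ABC", can be mimicked with a small toy model. This is only a sketch for illustration and is not PCRE code: the function names toy_commit_abc and toy_start_positions are hypothetical, the pattern is hard-wired, and the start_optimize flag merely stands in for the absence of PCRE_NO_START_OPTIMIZE.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Toy model only: NOT PCRE code. All names here are hypothetical. */

/* One match attempt of the literal "ABC" at position p. */
static int attempt_abc(const char *p)
{
    return strncmp(p, "ABC", 3) == 0;
}

/* Models the pattern (*COMMIT)ABC.
   Returns the offset of the match, or -1 for "no match".
   start_optimize != 0 models the default behaviour (pre-scan for 'A');
   start_optimize == 0 models PCRE_NO_START_OPTIMIZE. */
int toy_commit_abc(const char *subject, int start_optimize)
{
    const char *start;

    assert(subject != NULL);
    start = subject;

    if (start_optimize) {
        /* Pre-scan: move to the first 'A'; fail at once if there is none. */
        start = strchr(subject, 'A');
        if (start == NULL)
            return -1;
    }

    /* (*COMMIT) is passed during the first attempt, so if that attempt
       fails, no later starting position is tried. */
    if (attempt_abc(start))
        return (int)(start - subject);
    return -1;
}

/* Models the minimum-length check for a match that always fails.
   Returns how many starting positions are tried for a subject of
   subject_len characters. min_length is the studied minimum match
   length, or 0 when the pattern has not been studied. */
int toy_start_positions(size_t subject_len, size_t min_length)
{
    size_t pos;
    int attempts = 0;

    for (pos = 0; pos <= subject_len; pos++) {
        if (subject_len - pos < min_length)
            break;              /* remaining subject is too short */
        attempts++;
    }
    return attempts;
}
```

In this model, toy_commit_abc("DEFABC", 1) returns 3 because the pre-scan starts the first attempt at "A", while toy_commit_abc("DEFABC", 0) returns -1 because the first attempt starts at "D" and (*COMMIT) rules out any retry. Likewise, toy_start_positions(3, 0) returns 4 (attempts at "ABC", "BC", "C", and the empty string) and toy_start_positions(3, 1) returns 3, since the final empty-string attempt is skipped.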
# Line 2121  Cambridge CB2 3QH, England. Line 2161  Cambridge CB2 3QH, England.
2161  </P>  </P>
2162  <br><a name="SEC22" href="#TOC1">REVISION</a><br>  <br><a name="SEC22" href="#TOC1">REVISION</a><br>
2163  <P>  <P>
2164  Last updated: 15 June 2010  Last updated: 21 June 2010
2165  <br>  <br>
2166  Copyright &copy; 1997-2010 University of Cambridge.  Copyright &copy; 1997-2010 University of Cambridge.
2167  <br>  <br>
