Diff of /code/trunk/doc/html/pcreapi.html

revision 1319 by ph10, Fri Mar 22 16:13:13 2013 UTC revision 1320 by ph10, Wed May 1 16:39:35 2013 UTC
# Line 756  equivalent to Perl's /m option, and it c Line 756  equivalent to Perl's /m option, and it c
756  (?m) option setting. If there are no newlines in a subject string, or no  (?m) option setting. If there are no newlines in a subject string, or no
757  occurrences of ^ or $ in a pattern, setting PCRE_MULTILINE has no effect.  occurrences of ^ or $ in a pattern, setting PCRE_MULTILINE has no effect.
758  <pre>  <pre>
759      PCRE_NEVER_UTF
760    </pre>
761    This option locks out interpretation of the pattern as UTF-8 (or UTF-16 or
762    UTF-32 in the 16-bit and 32-bit libraries). In particular, it prevents the
763    creator of the pattern from switching to UTF interpretation by starting the
764    pattern with (*UTF). This may be useful in applications that process patterns
765    from external sources. The combination of PCRE_UTF8 and PCRE_NEVER_UTF also
766    causes an error.
767    <pre>
768    PCRE_NEWLINE_CR    PCRE_NEWLINE_CR
769    PCRE_NEWLINE_LF    PCRE_NEWLINE_LF
770    PCRE_NEWLINE_CRLF    PCRE_NEWLINE_CRLF
# Line 814  were followed by ?: but named parenthese Line 823  were followed by ?: but named parenthese
823  they acquire numbers in the usual way). There is no equivalent of this option  they acquire numbers in the usual way). There is no equivalent of this option
824  in Perl.  in Perl.
825  <pre>  <pre>
826    NO_START_OPTIMIZE    PCRE_NO_START_OPTIMIZE
827  </pre>  </pre>
828  This is an option that acts at matching time; that is, it is really an option  This is an option that acts at matching time; that is, it is really an option
829  for <b>pcre_exec()</b> or <b>pcre_dfa_exec()</b>. If it is set at compile time,  for <b>pcre_exec()</b> or <b>pcre_dfa_exec()</b>. If it is set at compile time,
830  it is remembered with the compiled pattern and assumed at matching time. For  it is remembered with the compiled pattern and assumed at matching time. This
831  details see the discussion of PCRE_NO_START_OPTIMIZE  is necessary if you want to use JIT execution, because the JIT compiler needs
832    to know whether or not this option is set. For details see the discussion of
833    PCRE_NO_START_OPTIMIZE
834  <a href="#execoptions">below.</a>  <a href="#execoptions">below.</a>
835  <pre>  <pre>
836    PCRE_UCP    PCRE_UCP
# Line 938  have fallen out of use. To avoid confusi Line 949  have fallen out of use. To avoid confusi
949          name/number or by a plain number          name/number or by a plain number
950    58  a numbered reference must not be zero    58  a numbered reference must not be zero
951    59  an argument is not allowed for (*ACCEPT), (*FAIL), or (*COMMIT)    59  an argument is not allowed for (*ACCEPT), (*FAIL), or (*COMMIT)
952    60  (*VERB) not recognized    60  (*VERB) not recognized or malformed
953    61  number is too big    61  number is too big
954    62  subpattern name expected    62  subpattern name expected
955    63  digit expected after (?+    63  digit expected after (?+
# Line 1069  In 32-bit mode, the bitmap is used for 3 Line 1080  In 32-bit mode, the bitmap is used for 3
1080  <P>  <P>
1081  These two optimizations apply to both <b>pcre_exec()</b> and  These two optimizations apply to both <b>pcre_exec()</b> and
1082  <b>pcre_dfa_exec()</b>, and the information is also used by the JIT compiler.  <b>pcre_dfa_exec()</b>, and the information is also used by the JIT compiler.
1083  The optimizations can be disabled by setting the PCRE_NO_START_OPTIMIZE option  The optimizations can be disabled by setting the PCRE_NO_START_OPTIMIZE option.
1084  when calling <b>pcre_exec()</b> or <b>pcre_dfa_exec()</b>, but if this is done,  You might want to do this if your pattern contains callouts or (*MARK) and you
1085  JIT execution is also disabled. You might want to do this if your pattern  want to make use of these facilities in cases where matching fails.
1086  contains callouts or (*MARK) and you want to make use of these facilities in  </P>
1087  cases where matching fails. See the discussion of PCRE_NO_START_OPTIMIZE  <P>
1088    PCRE_NO_START_OPTIMIZE can be specified at either compile time or execution
1089    time. However, if PCRE_NO_START_OPTIMIZE is passed to <b>pcre_exec()</b>, (that
1090    is, after any JIT compilation has happened) JIT execution is disabled. For JIT
1091    execution to work with PCRE_NO_START_OPTIMIZE, the option must be set at
1092    compile time.
1093    </P>
1094    <P>
1095    There is a longer discussion of PCRE_NO_START_OPTIMIZE
1096  <a href="#execoptions">below.</a>  <a href="#execoptions">below.</a>
1097  <a name="localesupport"></a></P>  <a name="localesupport"></a></P>
1098  <br><a name="SEC14" href="#TOC1">LOCALE SUPPORT</a><br>  <br><a name="SEC14" href="#TOC1">LOCALE SUPPORT</a><br>
# Line 1162  the following negative numbers: Line 1181  the following negative numbers:
1181    PCRE_ERROR_BADENDIANNESS  the pattern was compiled with different    PCRE_ERROR_BADENDIANNESS  the pattern was compiled with different
1182                              endianness                              endianness
1183    PCRE_ERROR_BADOPTION      the value of <i>what</i> was invalid    PCRE_ERROR_BADOPTION      the value of <i>what</i> was invalid
1184      PCRE_ERROR_UNSET          the requested field is not set
1185  </pre>  </pre>
1186  The "magic number" is placed at the start of each compiled pattern as a simple  The "magic number" is placed at the start of each compiled pattern as a simple
1187  check against passing an arbitrary memory pointer. The endianness error can  check against passing an arbitrary memory pointer. The endianness error can
# Line 1285  to return the full 32-bit range of the c Line 1305  to return the full 32-bit range of the c
1305  instead the PCRE_INFO_REQUIREDCHARFLAGS and PCRE_INFO_REQUIREDCHAR values should  instead the PCRE_INFO_REQUIREDCHARFLAGS and PCRE_INFO_REQUIREDCHAR values should
1306  be used.  be used.
1307  <pre>  <pre>
1308      PCRE_INFO_MATCHLIMIT
1309    </pre>
1310    If the pattern set a match limit by including an item of the form
1311    (*LIMIT_MATCH=nnnn) at the start, the value is returned. The fourth argument
1312    should point to an unsigned 32-bit integer. If no such value has been set, the
1313    call to <b>pcre_fullinfo()</b> returns the error PCRE_ERROR_UNSET.
1314    <pre>
1315    PCRE_INFO_MAXLOOKBEHIND    PCRE_INFO_MAXLOOKBEHIND
1316  </pre>  </pre>
1317  Return the number of characters (NB not bytes) in the longest lookbehind  Return the number of characters (NB not bytes) in the longest lookbehind
# Line 1293  matching using the partial matching faci Line 1320  matching using the partial matching faci
1320  \b and \B require a one-character lookbehind. \A also registers a  \b and \B require a one-character lookbehind. \A also registers a
1321  one-character lookbehind, though it does not actually inspect the previous  one-character lookbehind, though it does not actually inspect the previous
1322  character. This is to ensure that at least one character from the old segment  character. This is to ensure that at least one character from the old segment
1323  is retained when a new segment is processed. Otherwise, if there are no  is retained when a new segment is processed. Otherwise, if there are no
1324  lookbehinds in the pattern, \A might match incorrectly at the start of a new  lookbehinds in the pattern, \A might match incorrectly at the start of a new
1325  segment.  segment.
1326  <pre>  <pre>
1327    PCRE_INFO_MINLENGTH    PCRE_INFO_MINLENGTH
# Line 1397  alternatives begin with one of the follo Line 1424  alternatives begin with one of the follo
1424  For such patterns, the PCRE_ANCHORED bit is set in the options returned by  For such patterns, the PCRE_ANCHORED bit is set in the options returned by
1425  <b>pcre_fullinfo()</b>.  <b>pcre_fullinfo()</b>.
1426  <pre>  <pre>
1427      PCRE_INFO_RECURSIONLIMIT
1428    </pre>
1429    If the pattern set a recursion limit by including an item of the form
1430    (*LIMIT_RECURSION=nnnn) at the start, the value is returned. The fourth
1431    argument should point to an unsigned 32-bit integer. If no such value has been
1432    set, the call to <b>pcre_fullinfo()</b> returns the error PCRE_ERROR_UNSET.
1433    <pre>
1434    PCRE_INFO_SIZE    PCRE_INFO_SIZE
1435  </pre>  </pre>
1436  Return the size of the compiled pattern in bytes (for both libraries). The  Return the size of the compiled pattern in bytes (for both libraries). The
# Line 1639  the <i>flags</i> field. If the limit is Line 1673  the <i>flags</i> field. If the limit is
1673  PCRE_ERROR_MATCHLIMIT.  PCRE_ERROR_MATCHLIMIT.
1674  </P>  </P>
1675  <P>  <P>
1676    A value for the match limit may also be supplied by an item at the start of a
1677    pattern of the form
1678    <pre>
1679      (*LIMIT_MATCH=d)
1680    </pre>
1681    where d is a decimal number. However, such a setting is ignored unless d is
1682    less than the limit set by the caller of <b>pcre_exec()</b> or, if no such limit
1683    is set, less than the default.
1684    </P>
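
The rule in the added paragraph amounts to taking the smaller of the two values: a limit inside the pattern can lower the limit in force but never raise it. The sketch below models that rule only; it is not PCRE's internal code, and the function and parameter names are hypothetical.

```c
#include <stdint.h>

/* Model of the documented rule. caller_or_default is the limit set by the
   caller of pcre_exec() or, if none was set, the built-in default.
   pattern_has_limit is nonzero when the pattern starts with
   (*LIMIT_MATCH=d), and pattern_limit is that d. */
static uint32_t effective_limit(uint32_t caller_or_default,
                                int pattern_has_limit,
                                uint32_t pattern_limit)
{
    /* The pattern's setting is honoured only if it is the smaller value. */
    if (pattern_has_limit && pattern_limit < caller_or_default)
        return pattern_limit;
    return caller_or_default;
}
```

For example, with a caller limit of 10000000, a pattern value of 1000 takes effect, whereas with a caller limit of 500 the same pattern value is ignored. The text states the same rule for (*LIMIT_RECURSION=d).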
1685    <P>
1686  The <i>match_limit_recursion</i> field is similar to <i>match_limit</i>, but  The <i>match_limit_recursion</i> field is similar to <i>match_limit</i>, but
1687  instead of limiting the total number of times that <b>match()</b> is called, it  instead of limiting the total number of times that <b>match()</b> is called, it
1688  limits the depth of recursion. The recursion depth is a smaller number than the  limits the depth of recursion. The recursion depth is a smaller number than the
# Line 1660  PCRE_EXTRA_MATCH_LIMIT_RECURSION is set Line 1704  PCRE_EXTRA_MATCH_LIMIT_RECURSION is set
1704  is exceeded, <b>pcre_exec()</b> returns PCRE_ERROR_RECURSIONLIMIT.  is exceeded, <b>pcre_exec()</b> returns PCRE_ERROR_RECURSIONLIMIT.
1705  </P>  </P>
1706  <P>  <P>
1707    A value for the recursion limit may also be supplied by an item at the start of
1708    a pattern of the form
1709    <pre>
1710      (*LIMIT_RECURSION=d)
1711    </pre>
1712    where d is a decimal number. However, such a setting is ignored unless d is
1713    less than the limit set by the caller of <b>pcre_exec()</b> or, if no such limit
1714    is set, less than the default.
1715    </P>
1716    <P>
1717  The <i>callout_data</i> field is used in conjunction with the "callout" feature,  The <i>callout_data</i> field is used in conjunction with the "callout" feature,
1718  and is described in the  and is described in the
1719  <a href="pcrecallout.html"><b>pcrecallout</b></a>  <a href="pcrecallout.html"><b>pcrecallout</b></a>
# Line 1821  unanchored match must start with a speci Line 1875  unanchored match must start with a speci
1875  for that character, and fails immediately if it cannot find it, without  for that character, and fails immediately if it cannot find it, without
1876  actually running the main matching function. This means that a special item  actually running the main matching function. This means that a special item
1877  such as (*COMMIT) at the start of a pattern is not considered until after a  such as (*COMMIT) at the start of a pattern is not considered until after a
1878  suitable starting point for the match has been found. When callouts or (*MARK)  suitable starting point for the match has been found. Also, when callouts or
1879  items are in use, these "start-up" optimizations can cause them to be skipped  (*MARK) items are in use, these "start-up" optimizations can cause them to be
1880  if the pattern is never actually used. The start-up optimizations are in effect  skipped if the pattern is never actually used. The start-up optimizations are
1881  a pre-scan of the subject that takes place before the pattern is run.  in effect a pre-scan of the subject that takes place before the pattern is run.
1882  </P>  </P>
1883  <P>  <P>
1884  The PCRE_NO_START_OPTIMIZE option disables the start-up optimizations, possibly  The PCRE_NO_START_OPTIMIZE option disables the start-up optimizations, possibly
# Line 1832  causing performance to suffer, but ensur Line 1886  causing performance to suffer, but ensur
1886  "no match", the callouts do occur, and that items such as (*COMMIT) and (*MARK)  "no match", the callouts do occur, and that items such as (*COMMIT) and (*MARK)
1887  are considered at every possible starting position in the subject string. If  are considered at every possible starting position in the subject string. If
1888  PCRE_NO_START_OPTIMIZE is set at compile time, it cannot be unset at matching  PCRE_NO_START_OPTIMIZE is set at compile time, it cannot be unset at matching
1889  time. The use of PCRE_NO_START_OPTIMIZE disables JIT execution; when it is set,  time. The use of PCRE_NO_START_OPTIMIZE at matching time (that is, passing it
1890  matching is always done interpretively.  to <b>pcre_exec()</b>) disables JIT execution; in this situation, matching is
1891    always done interpretively.
1892  </P>  </P>
1893  <P>  <P>
1894  Setting PCRE_NO_START_OPTIMIZE can change the outcome of a matching operation.  Setting PCRE_NO_START_OPTIMIZE can change the outcome of a matching operation.
# Line 2340  never occur in a valid UTF-8 string. Line 2395  never occur in a valid UTF-8 string.
2395    PCRE_UTF8_ERR22    PCRE_UTF8_ERR22
2396  </pre>  </pre>
2397  This error code was formerly used when the presence of a so-called  This error code was formerly used when the presence of a so-called
2398  "non-character" caused an error. Unicode corrigendum #9 makes it clear that  "non-character" caused an error. Unicode corrigendum #9 makes it clear that
2399  such characters should not cause a string to be rejected, and so this code is  such characters should not cause a string to be rejected, and so this code is
2400  no longer in use and is never returned.  no longer in use and is never returned.
2401  </P>  </P>
2402  <br><a name="SEC18" href="#TOC1">EXTRACTING CAPTURED SUBSTRINGS BY NUMBER</a><br>  <br><a name="SEC18" href="#TOC1">EXTRACTING CAPTURED SUBSTRINGS BY NUMBER</a><br>
# Line 2784  Cambridge CB2 3QH, England. Line 2839  Cambridge CB2 3QH, England.
2839  </P>  </P>
2840  <br><a name="SEC26" href="#TOC1">REVISION</a><br>  <br><a name="SEC26" href="#TOC1">REVISION</a><br>
2841  <P>  <P>
2842  Last updated: 27 February 2013  Last updated: 26 April 2013
2843  <br>  <br>
2844  Copyright &copy; 1997-2013 University of Cambridge.  Copyright &copy; 1997-2013 University of Cambridge.
2845  <br>  <br>
