Diff of /code/trunk/doc/html/pcreapi.html


revision 72 by nigel, Sat Feb 24 21:40:24 2007 UTC | revision 73 by nigel, Sat Feb 24 21:40:30 2007 UTC
# Line 98  conversion went wrong.<br> Line 98  conversion went wrong.<br>
98  <b>void (*pcre_free)(void *);</b>  <b>void (*pcre_free)(void *);</b>
99  </P>  </P>
100  <P>  <P>
101    <b>void *(*pcre_stack_malloc)(size_t);</b>
102    </P>
103    <P>
104    <b>void (*pcre_stack_free)(void *);</b>
105    </P>
106    <P>
107  <b>int (*pcre_callout)(pcre_callout_block *);</b>  <b>int (*pcre_callout)(pcre_callout_block *);</b>
108  </P>  </P>
109  <br><a name="SEC2" href="#TOC1">PCRE API</a><br>  <br><a name="SEC2" href="#TOC1">PCRE API</a><br>
# Line 156  so a calling program can replace them if Line 162  so a calling program can replace them if
162  should be done before calling any PCRE functions.  should be done before calling any PCRE functions.
163  </P>  </P>
164  <P>  <P>
165    The global variables <b>pcre_stack_malloc</b> and <b>pcre_stack_free</b> are also
166    indirections to memory management functions. These special functions are used
167    only when PCRE is compiled to use the heap for remembering data, instead of
168    recursive function calls. This is a non-standard way of building PCRE, for use
169    in environments that have limited stacks. Because of the greater use of memory
170    management, it runs more slowly. Separate functions are provided so that
171    special-purpose external code can be used for this case. When used, these
172    functions are always called in a stack-like manner (last obtained, first
173    freed), and always for memory blocks of the same size.
174    </P>
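
Note on the added paragraph above: because the blocks are described as always the same size and always released last obtained, first freed, a replacement allocator can be as simple as a push/pop arena. The following is a minimal sketch under that assumption only. Every name in it (`sketch_stack_malloc`, `sketch_stack_free`, the arena size) is hypothetical and not part of PCRE, and it is not thread-safe.

```c
#include <stddef.h>

/* Hypothetical arena for blocks that are always the same size and always
   released in last-obtained, first-freed order. All names are illustrative
   and not part of PCRE. Not thread-safe. */

union sketch_align { long double ld; long long ll; void *p; };

#define SKETCH_ARENA_UNITS 4096

static union sketch_align sketch_arena[SKETCH_ARENA_UNITS];
static size_t sketch_top = 0;    /* bytes currently in use */
static size_t sketch_block = 0;  /* rounded size of the blocks handed out */

/* Push: hand out the next block from the arena, or NULL when it is full. */
void *sketch_stack_malloc(size_t size)
{
    const size_t unit = sizeof(union sketch_align);
    const size_t capacity = sizeof(sketch_arena);
    size_t rounded;

    if (size == 0 || size > capacity) return NULL;
    rounded = (size + unit - 1) / unit * unit;   /* keep blocks aligned */
    if (rounded > capacity - sketch_top) return NULL;

    sketch_block = rounded;
    sketch_top += rounded;
    return (unsigned char *)sketch_arena + sketch_top - rounded;
}

/* Pop: release the most recently obtained block; anything else is ignored. */
void sketch_stack_free(void *block)
{
    unsigned char *base = (unsigned char *)sketch_arena;

    if (block != NULL && sketch_block != 0 && sketch_top >= sketch_block &&
        (unsigned char *)block == base + sketch_top - sketch_block)
        sketch_top -= sketch_block;
}
```

A caller that wanted to try this would assign the two functions to <b>pcre_stack_malloc</b> and <b>pcre_stack_free</b> before calling any PCRE functions. The sketch relies entirely on the same-size, stack-like calling pattern stated in the text; it would misbehave with any other pattern.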
175    <P>
176  The global variable <b>pcre_callout</b> initially contains NULL. It can be set  The global variable <b>pcre_callout</b> initially contains NULL. It can be set
177  by the caller to a "callout" function, which PCRE will then call at specified  by the caller to a "callout" function, which PCRE will then call at specified
178  points during a matching operation. Details are given in the <b>pcrecallout</b>  points during a matching operation. Details are given in the <b>pcrecallout</b>
# Line 164  documentation. Line 181  documentation.
181  <br><a name="SEC3" href="#TOC1">MULTITHREADING</a><br>  <br><a name="SEC3" href="#TOC1">MULTITHREADING</a><br>
182  <P>  <P>
183  The PCRE functions can be used in multi-threading applications, with the  The PCRE functions can be used in multi-threading applications, with the
184  proviso that the memory management functions pointed to by <b>pcre_malloc</b>  proviso that the memory management functions pointed to by <b>pcre_malloc</b>,
185  and <b>pcre_free</b>, and the callout function pointed to by <b>pcre_callout</b>,  <b>pcre_free</b>, <b>pcre_stack_malloc</b>, and <b>pcre_stack_free</b>, and the
186  are shared by all threads.  callout function pointed to by <b>pcre_callout</b>, are shared by all threads.
187  </P>  </P>
188  <P>  <P>
189  The compiled form of a regular expression is not altered during matching, so  The compiled form of a regular expression is not altered during matching, so
# Line 238  The output is an integer that gives the Line 255  The output is an integer that gives the
255  internal matching function calls in a <b>pcre_exec()</b> execution. Further  internal matching function calls in a <b>pcre_exec()</b> execution. Further
256  details are given with <b>pcre_exec()</b> below.  details are given with <b>pcre_exec()</b> below.
257  </P>  </P>
258    <P>
259    <pre>
260      PCRE_CONFIG_STACKRECURSE
261    </PRE>
262    </P>
263    <P>
264    The output is an integer that is set to one if internal recursion is
265    implemented by recursive function calls that use the stack to remember their
266    state. This is the usual way that PCRE is compiled. The output is zero if PCRE
267    was compiled to use blocks of data on the heap instead of recursive function
268    calls. In this case, <b>pcre_stack_malloc</b> and <b>pcre_stack_free</b> are
269    called to manage memory blocks on the heap, thus avoiding the use of the stack.
270    </P>
271  <br><a name="SEC5" href="#TOC1">COMPILING A PATTERN</a><br>  <br><a name="SEC5" href="#TOC1">COMPILING A PATTERN</a><br>
272  <P>  <P>
273  <b>pcre *pcre_compile(const char *<i>pattern</i>, int <i>options</i>,</b>  <b>pcre *pcre_compile(const char *<i>pattern</i>, int <i>options</i>,</b>
# Line 878  unachored at matching time. Line 908  unachored at matching time.
908  </P>  </P>
909  <P>  <P>
910  When PCRE_UTF8 was set at compile time, the validity of the subject as a UTF-8  When PCRE_UTF8 was set at compile time, the validity of the subject as a UTF-8
911  string is automatically checked. If an invalid UTF-8 sequence of bytes is  string is automatically checked, and the value of <i>startoffset</i> is also
912  found, <b>pcre_exec()</b> returns the error PCRE_ERROR_BADUTF8. If you already  checked to ensure that it points to the start of a UTF-8 character. If an
913  know that your subject is valid, and you want to skip this check for  invalid UTF-8 sequence of bytes is found, <b>pcre_exec()</b> returns the error
914  performance reasons, you can set the PCRE_NO_UTF8_CHECK option when calling  PCRE_ERROR_BADUTF8. If <i>startoffset</i> contains an invalid value,
915  <b>pcre_exec()</b>. When this option is set, the effect of passing an invalid  PCRE_ERROR_BADUTF8_OFFSET is returned.
916  UTF-8 string as a subject is undefined. It may cause your program to crash.  </P>
917    <P>
918    If you already know that your subject is valid, and you want to skip these
919    checks for performance reasons, you can set the PCRE_NO_UTF8_CHECK option when
920    calling <b>pcre_exec()</b>. You might want to do this for the second and
921    subsequent calls to <b>pcre_exec()</b> if you are making repeated calls to find
922    all the matches in a single subject string. However, you should be sure that
923    the value of <i>startoffset</i> points to the start of a UTF-8 character. When
924    PCRE_NO_UTF8_CHECK is set, the effect of passing an invalid UTF-8 string as a
925    subject, or a value of <i>startoffset</i> that does not point to the start of a
926    UTF-8 character, is undefined. Your program may crash.
927  </P>  </P>
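
Note on the added paragraph above: a caller that sets PCRE_NO_UTF8_CHECK becomes responsible for making sure <i>startoffset</i> points to the start of a character. One cheap way to check this is to test that the byte at the offset is not a UTF-8 continuation byte (bit pattern 10xxxxxx). The helper below is a sketch, not PCRE code. Its name is hypothetical, it checks only the offset and not the validity of the whole subject, and treating an offset equal to the length as acceptable is this sketch's own choice.

```c
#include <stddef.h>

/* Hypothetical helper, not part of PCRE.
   Returns 1 if `offset` lies within the subject and does not fall on a
   UTF-8 continuation byte (10xxxxxx); returns 0 otherwise.
   It does NOT validate the subject as a whole. */
int sketch_offset_is_char_start(const char *subject, int length, int offset)
{
    if (subject == NULL || offset < 0 || offset > length)
        return 0;
    if (offset == length)
        return 1;   /* end of subject: no byte to inspect (sketch's choice) */
    return (((unsigned char)subject[offset]) & 0xC0) != 0x80;
}
```

For the four-byte subject "a", "é" (0xC3 0xA9), "z", offsets 0, 1 and 3 are character starts, while offset 2 lands inside the two-byte sequence and is rejected.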
928  <P>  <P>
929  There are also three further options that can be set only at matching time:  There are also three further options that can be set only at matching time:
# Line 939  below) and trying an ordinary match agai Line 979  below) and trying an ordinary match agai
979  </P>  </P>
980  <P>  <P>
981  The subject string is passed to <b>pcre_exec()</b> as a pointer in  The subject string is passed to <b>pcre_exec()</b> as a pointer in
982  <i>subject</i>, a length in <i>length</i>, and a starting offset in  <i>subject</i>, a length in <i>length</i>, and a starting byte offset in
983  <i>startoffset</i>. Unlike the pattern string, the subject may contain binary  <i>startoffset</i>. Unlike the pattern string, the subject may contain binary
984  zero bytes. When the starting offset is zero, the search for a match starts at  zero bytes. When the starting offset is zero, the search for a match starts at
985  the beginning of the subject, and this is by far the most common case.  the beginning of the subject, and this is by far the most common case.
986  </P>  </P>
987  <P>  <P>
988  If the pattern was compiled with the PCRE_UTF8 option, the subject must be a  If the pattern was compiled with the PCRE_UTF8 option, the subject must be a
989  sequence of bytes that is a valid UTF-8 string. If an invalid UTF-8 string is  sequence of bytes that is a valid UTF-8 string, and the starting offset must
990  passed, PCRE's behaviour is not defined.  point to the beginning of a UTF-8 character. If an invalid UTF-8 string or
991    offset is passed, an error (either PCRE_ERROR_BADUTF8 or
992    PCRE_ERROR_BADUTF8_OFFSET) is returned, unless the option PCRE_NO_UTF8_CHECK is
993    set, in which case PCRE's behaviour is not defined.
994  </P>  </P>
995  <P>  <P>
996  A non-zero starting offset is useful when searching for another match in the  A non-zero starting offset is useful when searching for another match in the
# Line 1132  use by callout functions that want to yi Line 1175  use by callout functions that want to yi
1175  </P>  </P>
1176  <P>  <P>
1177  <pre>  <pre>
1178    PCRE_ERROR_BADUTF8       (-10)    PCRE_ERROR_BADUTF8        (-10)
1179  </PRE>  </PRE>
1180  </P>  </P>
1181  <P>  <P>
1182  A string that contains an invalid UTF-8 byte sequence was passed as a subject.  A string that contains an invalid UTF-8 byte sequence was passed as a subject.
1183  </P>  </P>
1184    <P>
1185    <pre>
1186      PCRE_ERROR_BADUTF8_OFFSET (-11)
1187    </PRE>
1188    </P>
1189    <P>
1190    The UTF-8 byte sequence that was passed as a subject was valid, but the value
1191    of <i>startoffset</i> did not point to the beginning of a UTF-8 character.
1192    </P>
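
Note on the two error codes above: a caller may want to tell them apart when reporting a failed match. The sketch below maps the numeric values shown in this diff (-10 and -11) to short messages. The `SKETCH_` constants and the function name are hypothetical stand-ins so that the example compiles without the PCRE header; real code would use the PCRE_ERROR_BADUTF8 and PCRE_ERROR_BADUTF8_OFFSET names instead.

```c
#include <stddef.h>

/* Stand-ins for the values listed in the diff; hypothetical names. */
enum {
    SKETCH_ERROR_BADUTF8        = -10,
    SKETCH_ERROR_BADUTF8_OFFSET = -11
};

/* Returns a short description for the two UTF-8 related error codes,
   or NULL for any other value. */
const char *sketch_utf8_error_text(int rc)
{
    switch (rc) {
    case SKETCH_ERROR_BADUTF8:
        return "subject contains an invalid UTF-8 byte sequence";
    case SKETCH_ERROR_BADUTF8_OFFSET:
        return "startoffset does not point to the start of a UTF-8 character";
    default:
        return NULL;
    }
}
```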
1193  <br><a name="SEC11" href="#TOC1">EXTRACTING CAPTURED SUBSTRINGS BY NUMBER</a><br>  <br><a name="SEC11" href="#TOC1">EXTRACTING CAPTURED SUBSTRINGS BY NUMBER</a><br>
1194  <P>  <P>
1195  <b>int pcre_copy_substring(const char *<i>subject</i>, int *<i>ovector</i>,</b>  <b>int pcre_copy_substring(const char *<i>subject</i>, int *<i>ovector</i>,</b>
# Line 1289  then call <i>pcre_copy_substring()</i> o Line 1341  then call <i>pcre_copy_substring()</i> o
1341  appropriate.  appropriate.
1342  </P>  </P>
1343  <P>  <P>
1344  Last updated: 20 August 2003  Last updated: 09 December 2003
1345  <br>  <br>
1346  Copyright &copy; 1997-2003 University of Cambridge.  Copyright &copy; 1997-2003 University of Cambridge.
