--- code/trunk/doc/html/pcreapi.html 2012/01/21 15:59:35 902 +++ code/trunk/doc/html/pcreapi.html 2012/01/21 16:37:17 903 @@ -34,10 +34,11 @@
#include <pcre.h> @@ -174,7 +175,7 @@ start with pcre16_ instead of pcre_. For every option that has UTF8 in its name (for example, PCRE_UTF8), there is a corresponding 16-bit name with UTF8 replaced by UTF16. This facility is in fact just cosmetic; the 16-bit -option names define the same bit values. +option names define the same bit values.
References to bytes and UTF-8 in this document should be read as references to @@ -182,7 +183,7 @@ specified otherwise. More details of the specific differences for the 16-bit library are given in the pcre16 -page. +page.
@@ -397,7 +398,7 @@ PCRE_CONFIG_UTF8 The output is an integer that is set to one if UTF-8 support is available; -otherwise it is set to zero. If this option is given to the 16-bit version of +otherwise it is set to zero. If this option is given to the 16-bit version of this function, pcre16_config(), the result is PCRE_ERROR_BADOPTION.
PCRE_CONFIG_UTF16
@@ -417,6 +418,13 @@ The output is an integer that is set to one if support for just-in-time compiling is available; otherwise it is set to zero.
+ PCRE_CONFIG_JITTARGET
+The output is a pointer to a zero-terminated "const char *" string. If JIT +support is available, the string contains the name of the architecture for +which the JIT compiler is configured, for example "x86 32bit (little endian + +unaligned)". If JIT support is not available, the result is NULL.
PCRE_CONFIG_NEWLINE
The output is an integer whose value specifies the default character sequence
@@ -738,7 +746,7 @@ that any Unicode newline sequence should be recognized. The Unicode newline sequences are the three just mentioned, plus the single characters VT (vertical tab, U+000B), FF (formfeed, U+000C), NEL (next line, U+0085), LS (line -separator, U+2028), and PS (paragraph separator, U+2029). For the 8-bit +separator, U+2028), and PS (paragraph separator, U+2029). For the 8-bit library, the last two are recognized only in UTF-8 mode.
@@ -808,7 +816,7 @@ PCRE_NO_UTF8_CHECK -When PCRE_UTF8 is set, the validity of the pattern as a UTF-8 +When PCRE_UTF8 is set, the validity of the pattern as a UTF-8 string is automatically checked. There is a discussion about the validity of UTF-8 strings in the
@@ -825,7 +833,7 @@
The following table lists the error codes that may be returned by pcre_compile2(), along with the error messages that may be returned by -both compiling functions. Note that error messages are always 8-bit ASCII +both compiling functions. Note that error messages are always 8-bit ASCII strings, even in 16-bit mode. As PCRE has developed, some error codes have fallen out of use. To avoid confusion, they have not been re-used.
@@ -899,14 +907,14 @@
 65 different names for subpatterns of the same number are not allowed
 66 (*MARK) must have an argument
- 67 this version of PCRE is not compiled with Unicode property
+ 67 this version of PCRE is not compiled with Unicode property
 support
 68 \c must be followed by an ASCII character
 69 \k is not followed by a braced, angle-bracketed, or quoted name
 70 internal error: unknown opcode in find_fixedlength()
 71 \N is not supported in a class
 72 too many forward references
- 73 disallowed Unicode code point (>= 0xd800 && <= 0xdfff)
+ 73 disallowed Unicode code point (>= 0xd800 && <= 0xdfff)
 74 invalid UTF-16 string (specifically UTF-16)
The numbers 32 and 10000 in errors 48 and 49 are defaults; different values may
@@ -1101,12 +1109,12 @@
 PCRE_ERROR_NULL the argument code was NULL
 the argument where was NULL
 PCRE_ERROR_BADMAGIC the "magic number" was not found
- PCRE_ERROR_BADENDIANNESS the pattern was compiled with different
+ PCRE_ERROR_BADENDIANNESS the pattern was compiled with different
 endianness
 PCRE_ERROR_BADOPTION the value of what was invalid
The "magic number" is placed at the start of each compiled pattern as a simple -check against passing an arbitrary memory pointer. The endianness error can +check against passing an arbitrary memory pointer. The endianness error can occur if a compiled pattern is saved and reloaded on a different host. Here is a typical call of pcre_fullinfo(), to obtain the length of the compiled pattern:
@@ -1150,8 +1158,8 @@
If there is a fixed first value, for example, the letter "c" from a pattern -such as (cat|cow|coyote), its value is returned. In the 8-bit library, the -value is always less than 256; in the 16-bit library the value can be up to +such as (cat|cow|coyote), its value is returned. In the 8-bit library, the +value is always less than 256; in the 16-bit library the value can be up to 0xffff.
@@ -1427,7 +1435,7 @@ const unsigned char *tables; unsigned char **mark; -In the 16-bit version of this structure, the mark field has type +In the 16-bit version of this structure, the mark field has type "PCRE_UCHAR16 **".
@@ -2067,14 +2075,14 @@
PCRE_ERROR_BADMODE (-28)-This error is given if a pattern that was compiled by the 8-bit library is +This error is given if a pattern that was compiled by the 8-bit library is passed to a 16-bit library function, or vice versa.
PCRE_ERROR_BADENDIANNESS (-29)-This error is given if a pattern that was compiled and saved is reloaded on a -host with different endianness. The utility function -pcre_pattern_to_host_byte_order() can be used to convert such a pattern +This error is given if a pattern that was compiled and saved is reloaded on a +host with different endianness. The utility function +pcre_pattern_to_host_byte_order() can be used to convert such a pattern so that it runs on the new host.
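As a rough illustration of how a leading "magic number" can tell a wrong-endianness pattern apart from an arbitrary memory pointer, the sketch below checks a stored 32-bit value both as read and byte-swapped. It is not PCRE's internal code: the names `SKETCH_MAGIC`, `magic_status`, `swap32` and `check_magic` are hypothetical, and the constant (the bytes 'P', 'C', 'R', 'E') is an assumption chosen for the example.

```c
#include <stdint.h>

/* Assumed magic value for this sketch: the bytes 'P','C','R','E'. */
#define SKETCH_MAGIC 0x50435245UL

/* Hypothetical result codes, loosely mirroring success,
   PCRE_ERROR_BADENDIANNESS and PCRE_ERROR_BADMAGIC. */
enum magic_status { MAGIC_OK, MAGIC_BAD_ENDIANNESS, MAGIC_BAD };

/* Reverse the byte order of a 32-bit value. */
static uint32_t swap32(uint32_t v)
{
    return ((v & 0x000000FFu) << 24) |
           ((v & 0x0000FF00u) << 8)  |
           ((v & 0x00FF0000u) >> 8)  |
           ((v & 0xFF000000u) >> 24);
}

/* Classify the first word of a (supposed) compiled pattern. */
enum magic_status check_magic(uint32_t stored)
{
    if (stored == SKETCH_MAGIC)
        return MAGIC_OK;               /* written on a host like this one */
    if (swap32(stored) == SKETCH_MAGIC)
        return MAGIC_BAD_ENDIANNESS;   /* valid, but bytes are reversed */
    return MAGIC_BAD;                  /* not a compiled pattern at all */
}
```

In this sketch, a byte-swapped match is the case where a conversion such as pcre_pattern_to_host_byte_order() would be the appropriate next step, whereas any other value suggests the pointer never referred to a compiled pattern.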
@@ -2084,7 +2092,7 @@
Reason codes for invalid UTF-8 strings
-This section applies only to the 8-bit library. The corresponding information +This section applies only to the 8-bit library. The corresponding information for the 16-bit library is given in the pcre16 page. @@ -2374,8 +2382,32 @@ substring. Then return 1, which forces pcre_exec() to backtrack and try other alternatives. Ultimately, when it runs out of matches, pcre_exec() will yield PCRE_ERROR_NOMATCH. ++
+Matching certain patterns using pcre_exec() can use a lot of process +stack, which in certain environments can be rather limited in size. Some users +find it helpful to have an estimate of the amount of stack that is used by +pcre_exec(), to help them set recursion limits, as described in the +pcrestack +documentation. The estimate that is output by pcretest when called with +the -m and -C options is obtained by calling pcre_exec with +the values NULL, NULL, NULL, -999, and -999 for its first five arguments. ++
+Normally, if its first argument is NULL, pcre_exec() immediately returns +the negative error code PCRE_ERROR_NULL, but with this special combination of +arguments, it returns instead a negative number whose absolute value is the +approximate stack frame size in bytes. (A negative number is used so that it is +clear that no match has happened.) The value is approximate because in some +cases, recursive calls to pcre_exec() occur when there are one or two +additional variables on the stack. ++
+If PCRE has been compiled to use the heap instead of the stack for recursion, +the value returned is the size of each block that is obtained from the heap.-
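One possible use of the frame-size estimate is to derive a recursion limit from a known stack budget. The sketch below assumes the negative value has already been obtained from the special call described above (for instance `pcre_exec(NULL, NULL, NULL, -999, -999, 0, NULL, 0)`, where the last three arguments are an assumption of this example). The helper name `recursion_limit_from_frame` and the idea of a fixed stack budget are hypothetical, not part of the PCRE API.

```c
#include <stddef.h>

/* Hypothetical helper: given the negative value returned by the special
   pcre_exec() call and the number of stack bytes the caller is willing
   to spend, return how many recursive frames fit in that budget.
   Returns 0 if rc is not a negative frame-size estimate. */
unsigned long recursion_limit_from_frame(int rc, size_t stack_budget_bytes)
{
    if (rc >= 0)
        return 0;                       /* not a frame-size estimate */

    /* The absolute value is the approximate frame size in bytes. */
    size_t frame_size = (size_t)(-(long long)rc);

    return (unsigned long)(stack_budget_bytes / frame_size);
}
```

The result could then be stored in the `match_limit_recursion` field of a `pcre_extra` block, with `PCRE_EXTRA_MATCH_LIMIT_RECURSION` set in its `flags`. Because the estimate is approximate, leaving some headroom below the computed value seems prudent.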
int pcre_dfa_exec(const pcre *code, const pcre_extra *extra, const char *subject, int length, int startoffset, @@ -2550,13 +2582,13 @@ error is given if the output vector is not large enough. This should be extremely rare, as a vector of size 1000 is used.-
pcre16(3), pcrebuild(3), pcrecallout(3), pcrecpp(3), pcrematching(3), pcrepartial(3), pcreposix(3), pcreprecompile(3), pcresample(3), pcrestack(3).-
@@ -2565,9 +2597,9 @@ Cambridge CB2 3QH, England.
-Last updated: 07 January 2012
+Last updated: 21 January 2012
Copyright © 1997-2012 University of Cambridge.