Diff of /code/trunk/doc/pcreapi.3

revision 902 by ph10, Sat Jan 21 15:47:59 2012 UTC revision 903 by ph10, Sat Jan 21 16:37:17 2012 UTC
# Line 148  just use different data types for their Line 148  just use different data types for their
148  start with \fBpcre16_\fP instead of \fBpcre_\fP. For every option that has UTF8  start with \fBpcre16_\fP instead of \fBpcre_\fP. For every option that has UTF8
149  in its name (for example, PCRE_UTF8), there is a corresponding 16-bit name with  in its name (for example, PCRE_UTF8), there is a corresponding 16-bit name with
150  UTF8 replaced by UTF16. This facility is in fact just cosmetic; the 16-bit  UTF8 replaced by UTF16. This facility is in fact just cosmetic; the 16-bit
151  option names define the same bit values.  option names define the same bit values.
152  .P  .P
153  References to bytes and UTF-8 in this document should be read as references to  References to bytes and UTF-8 in this document should be read as references to
154  16-bit data quantities and UTF-16 when using the 16-bit library, unless  16-bit data quantities and UTF-16 when using the 16-bit library, unless
# Line 157  library are given in the Line 157  library are given in the
157  .\" HREF  .\" HREF
158  \fBpcre16\fP  \fBpcre16\fP
159  .\"  .\"
160  page.  page.
161  .  .
162  .  .
163  .SH "PCRE API OVERVIEW"  .SH "PCRE API OVERVIEW"
# Line 392  not recognized. The following informatio Line 392  not recognized. The following informatio
392    PCRE_CONFIG_UTF8    PCRE_CONFIG_UTF8
393  .sp  .sp
394  The output is an integer that is set to one if UTF-8 support is available;  The output is an integer that is set to one if UTF-8 support is available;
395  otherwise it is set to zero. If this option is given to the 16-bit version of  otherwise it is set to zero. If this option is given to the 16-bit version of
396  this function, \fBpcre16_config()\fP, the result is PCRE_ERROR_BADOPTION.  this function, \fBpcre16_config()\fP, the result is PCRE_ERROR_BADOPTION.
397  .sp  .sp
398    PCRE_CONFIG_UTF16    PCRE_CONFIG_UTF16
# Line 415  compiling is available; otherwise it is Line 415  compiling is available; otherwise it is
415    PCRE_CONFIG_JITTARGET    PCRE_CONFIG_JITTARGET
416  .sp  .sp
417  The output is a pointer to a zero-terminated "const char *" string. If JIT  The output is a pointer to a zero-terminated "const char *" string. If JIT
418  support is available, the string contains the name of the architecture for  support is available, the string contains the name of the architecture for
419  which the JIT compiler is configured, for example "x86 32bit (little endian +  which the JIT compiler is configured, for example "x86 32bit (little endian +
420  unaligned)". If JIT support is not available, the result is NULL.  unaligned)". If JIT support is not available, the result is NULL.
421  .sp  .sp
422    PCRE_CONFIG_NEWLINE    PCRE_CONFIG_NEWLINE
# Line 742  preceding sequences should be recognized Line 742  preceding sequences should be recognized
742  that any Unicode newline sequence should be recognized. The Unicode newline  that any Unicode newline sequence should be recognized. The Unicode newline
743  sequences are the three just mentioned, plus the single characters VT (vertical  sequences are the three just mentioned, plus the single characters VT (vertical
744  tab, U+000B), FF (formfeed, U+000C), NEL (next line, U+0085), LS (line  tab, U+000B), FF (formfeed, U+000C), NEL (next line, U+0085), LS (line
745  separator, U+2028), and PS (paragraph separator, U+2029). For the 8-bit  separator, U+2028), and PS (paragraph separator, U+2029). For the 8-bit
746  library, the last two are recognized only in UTF-8 mode.  library, the last two are recognized only in UTF-8 mode.
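
As a rough illustration of the single-character part of the list above, the following C sketch classifies one code point. The function name `is_unicode_newline_char` and the `utf_mode` flag are hypothetical and not part of the PCRE API. The inclusion of CR and LF assumes they are among "the three just mentioned", and the two-character CRLF sequence is deliberately left out.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper, not a PCRE function: returns true if the single
   code point c is one of the Unicode newline characters listed above.
   utf_mode models the 8-bit library's rule that LS and PS are
   recognized only in UTF-8 mode. */
bool is_unicode_newline_char(uint32_t c, bool utf_mode)
{
    switch (c) {
    case 0x000A: /* LF  (assumed to be one of the "three just mentioned") */
    case 0x000B: /* VT  vertical tab */
    case 0x000C: /* FF  formfeed */
    case 0x000D: /* CR  (assumed to be one of the "three just mentioned") */
    case 0x0085: /* NEL next line */
        return true;
    case 0x2028: /* LS  line separator */
    case 0x2029: /* PS  paragraph separator */
        return utf_mode;
    default:
        return false;
    }
}
```

A caller scanning a subject string would still need to treat CR followed by LF as one sequence; that pairing logic is not shown here.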
747  .P  .P
748  The newline setting in the options word uses three bits that are treated  The newline setting in the options word uses three bits that are treated
# Line 819  page. Line 819  page.
819  .sp  .sp
820    PCRE_NO_UTF8_CHECK    PCRE_NO_UTF8_CHECK
821  .sp  .sp
822  When PCRE_UTF8 is set, the validity of the pattern as a UTF-8  When PCRE_UTF8 is set, the validity of the pattern as a UTF-8
823  string is automatically checked. There is a discussion about the  string is automatically checked. There is a discussion about the
824  .\" HTML <a href="pcreunicode.html#utf8strings">  .\" HTML <a href="pcreunicode.html#utf8strings">
825  .\" </a>  .\" </a>
826  validity of UTF-8 strings  validity of UTF-8 strings
827  .\"  .\"
828  in the  in the
829  .\" HREF  .\" HREF
# Line 843  validity checking of subject strings. Line 843  validity checking of subject strings.
843  .sp  .sp
844  The following table lists the error codes that may be returned by  The following table lists the error codes that may be returned by
845  \fBpcre_compile2()\fP, along with the error messages that may be returned by  \fBpcre_compile2()\fP, along with the error messages that may be returned by
846  both compiling functions. Note that error messages are always 8-bit ASCII  both compiling functions. Note that error messages are always 8-bit ASCII
847  strings, even in 16-bit mode. As PCRE has developed, some error codes have  strings, even in 16-bit mode. As PCRE has developed, some error codes have
848  fallen out of use. To avoid confusion, they have not been re-used.  fallen out of use. To avoid confusion, they have not been re-used.
849  .sp  .sp
# Line 917  fallen out of use. To avoid confusion, t Line 917  fallen out of use. To avoid confusion, t
917    65  different names for subpatterns of the same number are    65  different names for subpatterns of the same number are
918          not allowed          not allowed
919    66  (*MARK) must have an argument    66  (*MARK) must have an argument
920    67  this version of PCRE is not compiled with Unicode property    67  this version of PCRE is not compiled with Unicode property
921          support          support
922    68  \ec must be followed by an ASCII character    68  \ec must be followed by an ASCII character
923    69  \ek is not followed by a braced, angle-bracketed, or quoted name    69  \ek is not followed by a braced, angle-bracketed, or quoted name
924    70  internal error: unknown opcode in find_fixedlength()    70  internal error: unknown opcode in find_fixedlength()
925    71  \eN is not supported in a class    71  \eN is not supported in a class
926    72  too many forward references    72  too many forward references
927    73  disallowed Unicode code point (>= 0xd800 && <= 0xdfff)    73  disallowed Unicode code point (>= 0xd800 && <= 0xdfff)
928    74  invalid UTF-16 string (specifically UTF-16)    74  invalid UTF-16 string (specifically UTF-16)
929  .sp  .sp
930  The numbers 32 and 10000 in errors 48 and 49 are defaults; different values may  The numbers 32 and 10000 in errors 48 and 49 are defaults; different values may
# Line 1120  the following negative numbers: Line 1120  the following negative numbers:
1120    PCRE_ERROR_NULL           the argument \fIcode\fP was NULL    PCRE_ERROR_NULL           the argument \fIcode\fP was NULL
1121                              the argument \fIwhere\fP was NULL                              the argument \fIwhere\fP was NULL
1122    PCRE_ERROR_BADMAGIC       the "magic number" was not found    PCRE_ERROR_BADMAGIC       the "magic number" was not found
1123    PCRE_ERROR_BADENDIANNESS  the pattern was compiled with different    PCRE_ERROR_BADENDIANNESS  the pattern was compiled with different
1124                              endianness                              endianness
1125    PCRE_ERROR_BADOPTION      the value of \fIwhat\fP was invalid    PCRE_ERROR_BADOPTION      the value of \fIwhat\fP was invalid
1126  .sp  .sp
1127  The "magic number" is placed at the start of each compiled pattern as a simple  The "magic number" is placed at the start of each compiled pattern as a simple
1128  check against passing an arbitrary memory pointer. The endianness error can  check against passing an arbitrary memory pointer. The endianness error can
1129  occur if a compiled pattern is saved and reloaded on a different host. Here is  occur if a compiled pattern is saved and reloaded on a different host. Here is
1130  a typical call of \fBpcre_fullinfo()\fP, to obtain the length of the compiled  a typical call of \fBpcre_fullinfo()\fP, to obtain the length of the compiled
1131  pattern:  pattern:
# Line 1168  where data units are bytes.) The fourth Line 1168  where data units are bytes.) The fourth
1168  variable.  variable.
1169  .P  .P
1170  If there is a fixed first value, for example, the letter "c" from a pattern  If there is a fixed first value, for example, the letter "c" from a pattern
1171  such as (cat|cow|coyote), its value is returned. In the 8-bit library, the  such as (cat|cow|coyote), its value is returned. In the 8-bit library, the
1172  value is always less than 256; in the 16-bit library the value can be up to  value is always less than 256; in the 16-bit library the value can be up to
1173  0xffff.  0xffff.
1174  .P  .P
1175  If there is no fixed first value, and if either  If there is no fixed first value, and if either
# Line 1459  fields (not necessarily in this order): Line 1459  fields (not necessarily in this order):
1459    const unsigned char *\fItables\fP;    const unsigned char *\fItables\fP;
1460    unsigned char **\fImark\fP;    unsigned char **\fImark\fP;
1461  .sp  .sp
1462  In the 16-bit version of this structure, the \fImark\fP field has type  In the 16-bit version of this structure, the \fImark\fP field has type
1463  "PCRE_UCHAR16 **".  "PCRE_UCHAR16 **".
1464  .P  .P
1465  The \fIflags\fP field is a bitmap that specifies which of the other fields  The \fIflags\fP field is a bitmap that specifies which of the other fields
# Line 2092  documentation for more details. Line 2092  documentation for more details.
2092  .sp  .sp
2093    PCRE_ERROR_BADMODE (-28)    PCRE_ERROR_BADMODE (-28)
2094  .sp  .sp
2095  This error is given if a pattern that was compiled by the 8-bit library is  This error is given if a pattern that was compiled by the 8-bit library is
2096  passed to a 16-bit library function, or vice versa.  passed to a 16-bit library function, or vice versa.
2097  .sp  .sp
2098    PCRE_ERROR_BADENDIANNESS (-29)    PCRE_ERROR_BADENDIANNESS (-29)
2099  .sp  .sp
2100  This error is given if a pattern that was compiled and saved is reloaded on a  This error is given if a pattern that was compiled and saved is reloaded on a
2101  host with different endianness. The utility function  host with different endianness. The utility function
2102  \fBpcre_pattern_to_host_byte_order()\fP can be used to convert such a pattern  \fBpcre_pattern_to_host_byte_order()\fP can be used to convert such a pattern
2103  so that it runs on the new host.  so that it runs on the new host.
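
One way an application might react to the two error codes above is sketched below. The numeric values are copied from the text. The enum names and the function `describe_reload_error` are hypothetical; a real program would use the constants from pcre.h rather than redefining them.

```c
#include <stddef.h>

/* Values copied from the text above; real code should take them from
   <pcre.h> instead of redefining them. */
enum {
    SKETCH_ERROR_BADMODE = -28,
    SKETCH_ERROR_BADENDIANNESS = -29
};

/* Hypothetical helper: returns a short hint for the two reload-related
   error codes, or NULL for any other return value. */
const char *describe_reload_error(int rc)
{
    switch (rc) {
    case SKETCH_ERROR_BADMODE:
        return "pattern was compiled by the other (8-bit/16-bit) library";
    case SKETCH_ERROR_BADENDIANNESS:
        return "pattern was saved on a host with different endianness; "
               "pcre_pattern_to_host_byte_order() can convert it";
    default:
        return NULL;
    }
}
```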
2104  .P  .P
2105  Error numbers -16 to -20 and -22 are not used by \fBpcre_exec()\fP.  Error numbers -16 to -20 and -22 are not used by \fBpcre_exec()\fP.
# Line 2109  Error numbers -16 to -20 and -22 are not Line 2109  Error numbers -16 to -20 and -22 are not
2109  .SS "Reason codes for invalid UTF-8 strings"  .SS "Reason codes for invalid UTF-8 strings"
2110  .rs  .rs
2111  .sp  .sp
2112  This section applies only to the 8-bit library. The corresponding information  This section applies only to the 8-bit library. The corresponding information
2113  for the 16-bit library is given in the  for the 16-bit library is given in the
2114  .\" HREF  .\" HREF
2115  \fBpcre16\fP  \fBpcre16\fP
# Line 2417  will yield PCRE_ERROR_NOMATCH. Line 2417  will yield PCRE_ERROR_NOMATCH.
2417  .rs  .rs
2418  .sp  .sp
2419  Matching certain patterns using \fBpcre_exec()\fP can use a lot of process  Matching certain patterns using \fBpcre_exec()\fP can use a lot of process
2420  stack, which in certain environments can be rather limited in size. Some users  stack, which in certain environments can be rather limited in size. Some users
2421  find it helpful to have an estimate of the amount of stack that is used by  find it helpful to have an estimate of the amount of stack that is used by
2422  \fBpcre_exec()\fP, to help them set recursion limits, as described in the  \fBpcre_exec()\fP, to help them set recursion limits, as described in the
2423  .\" HREF  .\" HREF
2424  \fBpcrestack\fP  \fBpcrestack\fP
2425  .\"  .\"
2426  documentation. The estimate that is output by \fBpcretest\fP when called with  documentation. The estimate that is output by \fBpcretest\fP when called with
2427  the \fB-m\fP and \fB-C\fP options is obtained by calling \fBpcre_exec\fP with  the \fB-m\fP and \fB-C\fP options is obtained by calling \fBpcre_exec\fP with
2428  the values NULL, NULL, NULL, -999, and -999 for its first five arguments.  the values NULL, NULL, NULL, -999, and -999 for its first five arguments.
2429  .P  .P
2430  Normally, if its first argument is NULL, \fBpcre_exec()\fP immediately returns  Normally, if its first argument is NULL, \fBpcre_exec()\fP immediately returns
# Line 2432  the negative error code PCRE_ERROR_NULL, Line 2432  the negative error code PCRE_ERROR_NULL,
2432  arguments, it returns instead a negative number whose absolute value is the  arguments, it returns instead a negative number whose absolute value is the
2433  approximate stack frame size in bytes. (A negative number is used so that it is  approximate stack frame size in bytes. (A negative number is used so that it is
2434  clear that no match has happened.) The value is approximate because in some  clear that no match has happened.) The value is approximate because in some
2435  cases, recursive calls to \fBpcre_exec()\fP occur when there are one or two  cases, recursive calls to \fBpcre_exec()\fP occur when there are one or two
2436  additional variables on the stack.  additional variables on the stack.
2437  .P  .P
2438  If PCRE has been compiled to use the heap instead of the stack for recursion,  If PCRE has been compiled to use the heap instead of the stack for recursion,
2439  the value returned is the size of each block that is obtained from the heap.  the value returned is the size of each block that is obtained from the heap.
2440  .  .
2441  .  .
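
The frame size estimate described above can be turned into a rough upper bound on recursion depth. The following is a minimal sketch under stated assumptions: `max_frames_for_stack` is a hypothetical helper and not part of PCRE, and the probe value is passed in as a plain integer so that the example does not depend on the library.

```c
#include <stddef.h>

/* Hypothetical helper, not part of PCRE. probe_rc is the value returned
   by the special call described above, for example
       pcre_exec(NULL, NULL, NULL, -999, -999, 0, NULL, 0);
   where only the first five arguments come from the text and the last
   three are assumed placeholders.
   Returns how many frames of that size fit into stack_bytes, or 0 if
   probe_rc is not a negative number. */
size_t max_frames_for_stack(int probe_rc, size_t stack_bytes)
{
    if (probe_rc >= 0)
        return 0;

    /* Widen before negating so that INT_MIN cannot overflow. */
    size_t frame_size = (size_t)(-(long long)probe_rc);
    return stack_bytes / frame_size;
}
```

Because the frame size is only approximate, and the stack is also used by the rest of the program, a recursion limit derived from this figure should probably be set well below the returned value.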
