34 |
---------------------- |
---------------------- |
35 |
|
|
36 |
If you install PCRE in the normal way, you will end up with an installed set of |
If you install PCRE in the normal way, you will end up with an installed set of |
37 |
man pages whose names all start with "pcre". The one that is called "pcre" |
man pages whose names all start with "pcre". The one that is just called "pcre" |
38 |
lists all the others. In addition to these man pages, the PCRE documentation is |
lists all the others. In addition to these man pages, the PCRE documentation is |
39 |
supplied in two other forms; however, as there is no standard place to install |
supplied in two other forms; however, as there is no standard place to install |
40 |
them, they are left in the doc directory of the unpacked source distribution. |
them, they are left in the doc directory of the unpacked source distribution. |
114 |
. If, in addition to support for UTF-8 character strings, you want to include |
. If, in addition to support for UTF-8 character strings, you want to include |
115 |
support for the \P, \p, and \X sequences that recognize Unicode character |
support for the \P, \p, and \X sequences that recognize Unicode character |
116 |
properties, you must add --enable-unicode-properties to the "configure" |
properties, you must add --enable-unicode-properties to the "configure" |
117 |
command. This adds about 90K to the size of the library (in the form of a |
command. This adds about 30K to the size of the library (in the form of a |
118 |
property table); only the basic two-letter properties such as Lu are |
property table); only the basic two-letter properties such as Lu are |
119 |
supported. |
supported. |
120 |
|
|
121 |
. You can build PCRE to recognize either CR or LF as the newline character, |
. You can build PCRE to recognize either CR or LF or the sequence CRLF or any |
122 |
instead of whatever your compiler uses for "\n", by adding --newline-is-cr or |
of the Unicode newline sequences as indicating the end of a line. Whatever |
123 |
--newline-is-lf to the "configure" command, respectively. Only do this if you |
you specify at build time is the default; the caller of PCRE can change the |
124 |
really understand what you are doing. On traditional Unix-like systems, the |
selection at run time. The default newline indicator is a single LF character |
125 |
newline character is LF. |
(the Unix standard). You can specify the default newline indicator by adding |
126 |
|
--newline-is-cr or --newline-is-lf or --newline-is-crlf or --newline-is-any |
127 |
|
to the "configure" command, respectively. |
128 |
|
|
129 |
. When called via the POSIX interface, PCRE uses malloc() to get additional |
. When called via the POSIX interface, PCRE uses malloc() to get additional |
130 |
storage for processing capturing parentheses if there are more than 10 of |
storage for processing capturing parentheses if there are more than 10 of |
144 |
pcre_exec() can supply their own value. There is discussion on the pcreapi |
pcre_exec() can supply their own value. There is discussion on the pcreapi |
145 |
man page. |
man page. |
146 |
|
|
147 |
|
. There is a separate counter that limits the depth of recursive function calls |
148 |
|
during a matching process. This also has a default of ten million, which is |
149 |
|
essentially "unlimited". You can change the default by setting, for example, |
150 |
|
|
151 |
|
--with-match-limit-recursion=500000 |
152 |
|
|
153 |
|
Recursive function calls use up the runtime stack; running out of stack can |
154 |
|
cause programs to crash in strange ways. There is a discussion about stack |
155 |
|
sizes in the pcrestack man page. |
156 |
|
|
157 |
. The default maximum compiled pattern size is around 64K. You can increase |
. The default maximum compiled pattern size is around 64K. You can increase |
158 |
this by adding --with-link-size=3 to the "configure" command. You can |
this by adding --with-link-size=3 to the "configure" command. You can |
159 |
increase it even more by setting --with-link-size=4, but this is unlikely |
increase it even more by setting --with-link-size=4, but this is unlikely |
177 |
|
|
178 |
The "configure" script builds eight files for the basic C library: |
The "configure" script builds eight files for the basic C library: |
179 |
|
|
|
. pcre.h is the header file for C programs that call PCRE |
|
180 |
. Makefile is the makefile that builds the library |
. Makefile is the makefile that builds the library |
181 |
. config.h contains build-time configuration options for the library |
. config.h contains build-time configuration options for the library |
182 |
. pcre-config is a script that shows the settings of "configure" options |
. pcre-config is a script that shows the settings of "configure" options |
283 |
Using HP's ANSI C++ compiler (aCC) |
Using HP's ANSI C++ compiler (aCC) |
284 |
---------------------------------- |
---------------------------------- |
285 |
|
|
286 |
Unless C++ support is disabled by specifiying the "--disable-cpp" option of the |
Unless C++ support is disabled by specifying the "--disable-cpp" option of the |
287 |
"configure" script, you *must* include the "-AA" option in the CXXFLAGS |
"configure" script, you *must* include the "-AA" option in the CXXFLAGS |
288 |
environment variable in order for the C++ components to compile correctly. |
environment variable in order for the C++ components to compile correctly. |
289 |
|
|
305 |
|
|
306 |
PCRE has been compiled on Windows systems and on Macintoshes, but I don't know |
PCRE has been compiled on Windows systems and on Macintoshes, but I don't know |
307 |
the details because I don't use those systems. It should be straightforward to |
the details because I don't use those systems. It should be straightforward to |
308 |
build PCRE on any system that has a Standard C compiler, because it uses only |
build PCRE on any system that has a Standard C compiler and library, because it |
309 |
Standard C functions. |
uses only Standard C functions. |
310 |
|
|
311 |
|
|
312 |
Testing PCRE |
Testing PCRE |
325 |
The RunTest script runs the pcretest test program (which is documented in its |
The RunTest script runs the pcretest test program (which is documented in its |
326 |
own man page) on each of the testinput files (in the testdata directory) in |
own man page) on each of the testinput files (in the testdata directory) in |
327 |
turn, and compares the output with the contents of the corresponding testoutput |
turn, and compares the output with the contents of the corresponding testoutput |
328 |
file. A file called testtry is used to hold the main output from pcretest |
files. A file called testtry is used to hold the main output from pcretest |
329 |
(testsavedregex is also used as a working file). To run pcretest on just one of |
(testsavedregex is also used as a working file). To run pcretest on just one of |
330 |
the test files, give its number as an argument to RunTest, for example: |
the test files, give its number as an argument to RunTest, for example: |
331 |
|
|
332 |
RunTest 2 |
RunTest 2 |
333 |
|
|
334 |
The first file can also be fed directly into the perltest script to check that |
The first test file can also be fed directly into the perltest script to check |
335 |
Perl gives the same results. The only difference you should see is in the first |
that Perl gives the same results. The only difference you should see is in the |
336 |
few lines, where the Perl version is given instead of the PCRE version. |
first few lines, where the Perl version is given instead of the PCRE version. |
337 |
|
|
338 |
The second set of tests check pcre_fullinfo(), pcre_info(), pcre_study(), |
The second set of tests check pcre_fullinfo(), pcre_info(), pcre_study(), |
339 |
pcre_copy_substring(), pcre_get_substring(), pcre_get_substring_list(), error |
pcre_copy_substring(), pcre_get_substring(), pcre_get_substring_list(), error |
442 |
pcre_globals.c ) and some internal functions that they use |
pcre_globals.c ) and some internal functions that they use |
443 |
pcre_info.c ) |
pcre_info.c ) |
444 |
pcre_maketables.c ) |
pcre_maketables.c ) |
445 |
|
pcre_newline.c ) |
446 |
pcre_ord2utf8.c ) |
pcre_ord2utf8.c ) |
447 |
pcre_printint.c ) |
pcre_refcount.c ) |
448 |
pcre_study.c ) |
pcre_study.c ) |
449 |
pcre_tables.c ) |
pcre_tables.c ) |
450 |
pcre_try_flipped.c ) |
pcre_try_flipped.c ) |
451 |
pcre_ucp_findchar.c ) |
pcre_ucp_searchfuncs.c) |
452 |
pcre_valid_utf8.c ) |
pcre_valid_utf8.c ) |
453 |
pcre_version.c ) |
pcre_version.c ) |
454 |
pcre_xclass.c ) |
pcre_xclass.c ) |
|
|
|
|
ucp_findchar.c ) |
|
|
ucp.h ) source for the code that is used for |
|
|
ucpinternal.h ) Unicode property handling |
|
455 |
ucptable.c ) |
ucptable.c ) |
|
ucptypetable.c ) |
|
456 |
|
|
457 |
pcre.in "source" for the header for the external API; pcre.h |
pcre_printint.src ) debugging function that is #included in pcretest, and |
458 |
is built from this by "configure" |
) can also be #included in pcre_compile() |
459 |
|
|
460 |
|
pcre.h the public PCRE header file |
461 |
pcreposix.h header for the external POSIX wrapper API |
pcreposix.h header for the external POSIX wrapper API |
462 |
pcre_internal.h header for internal use |
pcre_internal.h header for internal use |
463 |
|
ucp.h ) headers concerned with |
464 |
|
ucpinternal.h ) Unicode property handling |
465 |
config.in template for config.h, which is built by configure |
config.in template for config.h, which is built by configure |
466 |
|
|
467 |
pcrecpp.h the header file for the C++ wrapper |
pcrecpp.h the header file for the C++ wrapper |
488 |
RunGrepTest.in template for a Unix shell script for pcregrep tests |
RunGrepTest.in template for a Unix shell script for pcregrep tests |
489 |
config.guess ) files used by libtool, |
config.guess ) files used by libtool, |
490 |
config.sub ) used only when building a shared library |
config.sub ) used only when building a shared library |
491 |
|
config.h.in "source" for the config.h header file |
492 |
configure a configuring shell script (built by autoconf) |
configure a configuring shell script (built by autoconf) |
493 |
configure.in the autoconf input used to build configure |
configure.ac the autoconf input used to build configure |
494 |
doc/Tech.Notes notes on the encoding |
doc/Tech.Notes notes on the encoding |
495 |
doc/*.3 man page sources for the PCRE functions |
doc/*.3 man page sources for the PCRE functions |
496 |
doc/*.1 man page sources for pcregrep and pcretest |
doc/*.1 man page sources for pcregrep and pcretest |
518 |
|
|
519 |
libpcre.def |
libpcre.def |
520 |
libpcreposix.def |
libpcreposix.def |
|
pcre.def |
|
521 |
|
|
522 |
(D) Auxiliary file for VPASCAL |
(D) Auxiliary file for VPASCAL |
523 |
|
|
526 |
Philip Hazel |
Philip Hazel |
527 |
Email local part: ph10 |
Email local part: ph10 |
528 |
Email domain: cam.ac.uk |
Email domain: cam.ac.uk |
529 |
January 2006 |
November 2006 |