34 |
---------------------- |
---------------------- |
35 |
|
|
36 |
If you install PCRE in the normal way, you will end up with an installed set of |
If you install PCRE in the normal way, you will end up with an installed set of |
37 |
man pages whose names all start with "pcre". The one that is called "pcre" |
man pages whose names all start with "pcre". The one that is just called "pcre" |
38 |
lists all the others. In addition to these man pages, the PCRE documentation is |
lists all the others. In addition to these man pages, the PCRE documentation is |
39 |
supplied in two other forms; however, as there is no standard place to install |
supplied in two other forms; however, as there is no standard place to install |
40 |
them, they are left in the doc directory of the unpacked source distribution. |
them, they are left in the doc directory of the unpacked source distribution. |
114 |
. If, in addition to support for UTF-8 character strings, you want to include |
. If, in addition to support for UTF-8 character strings, you want to include |
115 |
support for the \P, \p, and \X sequences that recognize Unicode character |
support for the \P, \p, and \X sequences that recognize Unicode character |
116 |
properties, you must add --enable-unicode-properties to the "configure" |
properties, you must add --enable-unicode-properties to the "configure" |
117 |
command. This adds about 90K to the size of the library (in the form of a |
command. This adds about 30K to the size of the library (in the form of a |
118 |
property table); only the basic two-letter properties such as Lu are |
property table); only the basic two-letter properties such as Lu are |
119 |
supported. |
supported. |
120 |
|
|
121 |
. You can build PCRE to recognize either CR or LF as the newline character, |
. You can build PCRE to recognize either CR or LF or the sequence CRLF or any |
122 |
instead of whatever your compiler uses for "\n", by adding --newline-is-cr or |
of the Unicode newline sequences as indicating the end of a line. Whatever |
123 |
--newline-is-lf to the "configure" command, respectively. Only do this if you |
you specify at build time is the default; the caller of PCRE can change the |
124 |
really understand what you are doing. On traditional Unix-like systems, the |
selection at run time. The default newline indicator is a single LF character |
125 |
newline character is LF. |
(the Unix standard). You can specify the default newline indicator by adding |
126 |
|
--newline-is-cr or --newline-is-lf or --newline-is-crlf or --newline-is-any |
127 |
|
to the "configure" command, respectively. |
128 |
|
|
129 |
|
If you specify --newline-is-cr or --newline-is-crlf, some of the standard |
130 |
|
tests will fail, because the lines in the test files end with LF. Even if |
131 |
|
the files are edited to change the line endings, there are likely to be some |
132 |
|
failures. With --newline-is-any, many tests should succeed, but there may be |
133 |
|
some failures. |
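
As an illustration of what the four build-time choices mean for a matcher,
here is a minimal C sketch of a function that reports the length of the
newline found at a given position. The names nl_mode and newline_length are
hypothetical and are not PCRE identifiers. The set of sequences accepted in
the "any" case (CR, LF, CRLF, VT, FF, NEL, LS, PS, in UTF-8 encoding) is an
assumption about what that option covers, not something stated above.

```c
#include <stddef.h>

/* Hypothetical names for illustration; these are not PCRE's identifiers. */
enum nl_mode { NL_CR, NL_LF, NL_CRLF, NL_ANY };

/* Return the length in bytes of the newline that starts at p, or 0 if no
   newline starts there. 'end' points one past the last subject byte.
   The multi-byte cases of NL_ANY assume a UTF-8 encoded subject. */
size_t newline_length(const unsigned char *p, const unsigned char *end,
                      enum nl_mode mode)
{
    size_t left = (size_t)(end - p);

    if (left == 0)
        return 0;

    switch (mode) {
    case NL_CR:
        return p[0] == '\r' ? 1 : 0;
    case NL_LF:
        return p[0] == '\n' ? 1 : 0;
    case NL_CRLF:
        return (left >= 2 && p[0] == '\r' && p[1] == '\n') ? 2 : 0;
    case NL_ANY:
        if (p[0] == '\r')                       /* CR alone, or CRLF */
            return (left >= 2 && p[1] == '\n') ? 2 : 1;
        if (p[0] == '\n' || p[0] == 0x0B || p[0] == 0x0C)
            return 1;                           /* LF, VT, FF */
        if (left >= 2 && p[0] == 0xC2 && p[1] == 0x85)
            return 2;                           /* NEL, U+0085 */
        if (left >= 3 && p[0] == 0xE2 && p[1] == 0x80 &&
            (p[2] == 0xA8 || p[2] == 0xA9))
            return 3;                           /* LS U+2028, PS U+2029 */
        return 0;
    }
    return 0;
}
```

With the subject "a\r\nb", the position of the CR is reported as a one-byte
newline under NL_CR, as a two-byte newline under NL_CRLF and NL_ANY, and as
no newline at all under NL_LF. This is the kind of difference that makes the
LF-terminated test files fail when another default is selected.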
134 |
|
|
135 |
. When called via the POSIX interface, PCRE uses malloc() to get additional |
. When called via the POSIX interface, PCRE uses malloc() to get additional |
136 |
storage for processing capturing parentheses if there are more than 10 of |
storage for processing capturing parentheses if there are more than 10 of |
150 |
pcre_exec() can supply their own value. There is discussion on the pcreapi |
pcre_exec() can supply their own value. There is discussion on the pcreapi |
151 |
man page. |
man page. |
152 |
|
|
153 |
|
. There is a separate counter that limits the depth of recursive function calls |
154 |
|
during a matching process. This also has a default of ten million, which is |
155 |
|
essentially "unlimited". You can change the default by setting, for example, |
156 |
|
|
157 |
|
--with-match-limit-recursion=500000 |
158 |
|
|
159 |
|
Recursive function calls use up the runtime stack; running out of stack can |
160 |
|
cause programs to crash in strange ways. There is a discussion about stack |
161 |
|
sizes in the pcrestack man page. |
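
The idea of a recursion-depth counter can be sketched with a toy matcher.
This is not PCRE's code: toy_match, match_here and the return codes are
hypothetical names, and the pattern language ('?' for any one character,
'*' for any run of characters) is chosen only to keep the example short.
Each recursive call passes its depth down, and the match is abandoned with
an error once the depth exceeds the caller's limit, instead of letting the
runtime stack grow without bound.

```c
#include <stddef.h>

/* Hypothetical return codes for this sketch. */
#define MATCH_OK              1
#define MATCH_FAIL            0
#define MATCH_ERR_RECURSION (-1)

/* Recursive worker: 'depth' is the current call depth, 'limit' the
   maximum depth the caller is prepared to allow. */
static int match_here(const char *pat, const char *s,
                      unsigned long depth, unsigned long limit)
{
    if (depth > limit)
        return MATCH_ERR_RECURSION;      /* give up before using more stack */

    if (*pat == '\0')
        return *s == '\0' ? MATCH_OK : MATCH_FAIL;

    if (*pat == '*') {
        /* Try every length for the star, shortest first. */
        for (;;) {
            int rc = match_here(pat + 1, s, depth + 1, limit);
            if (rc != MATCH_FAIL)
                return rc;               /* success, or the limit was hit */
            if (*s == '\0')
                return MATCH_FAIL;
            s++;
        }
    }

    if (*s != '\0' && (*pat == '?' || *pat == *s))
        return match_here(pat + 1, s + 1, depth + 1, limit);

    return MATCH_FAIL;
}

/* Entry point: the first call counts as depth 1. */
int toy_match(const char *pat, const char *s, unsigned long limit)
{
    return match_here(pat, s, 1, limit);
}
```

Matching "a*c" against "abbbc" needs a call depth of 4 in this sketch, so it
succeeds with a limit of 100 and returns the recursion error with a limit
of 3. A caller can therefore distinguish "no match" from "stopped because
the limit was reached".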
162 |
|
|
163 |
. The default maximum compiled pattern size is around 64K. You can increase |
. The default maximum compiled pattern size is around 64K. You can increase |
164 |
this by adding --with-link-size=3 to the "configure" command. You can |
this by adding --with-link-size=3 to the "configure" command. You can |
165 |
increase it even more by setting --with-link-size=4, but this is unlikely |
increase it even more by setting --with-link-size=4, but this is unlikely |
183 |
|
|
184 |
The "configure" script builds eight files for the basic C library: |
The "configure" script builds eight files for the basic C library: |
185 |
|
|
|
. pcre.h is the header file for C programs that call PCRE |
|
186 |
. Makefile is the makefile that builds the library |
. Makefile is the makefile that builds the library |
187 |
. config.h contains build-time configuration options for the library |
. config.h contains build-time configuration options for the library |
188 |
. pcre-config is a script that shows the settings of "configure" options |
. pcre-config is a script that shows the settings of "configure" options |
289 |
Using HP's ANSI C++ compiler (aCC) |
Using HP's ANSI C++ compiler (aCC) |
290 |
---------------------------------- |
---------------------------------- |
291 |
|
|
292 |
Unless C++ support is disabled by specifiying the "--disable-cpp" option of the |
Unless C++ support is disabled by specifying the "--disable-cpp" option of the |
293 |
"configure" script, you *must* include the "-AA" option in the CXXFLAGS |
"configure" script, you *must* include the "-AA" option in the CXXFLAGS |
294 |
environment variable in order for the C++ components to compile correctly. |
environment variable in order for the C++ components to compile correctly. |
295 |
|
|
311 |
|
|
312 |
PCRE has been compiled on Windows systems and on Macintoshes, but I don't know |
PCRE has been compiled on Windows systems and on Macintoshes, but I don't know |
313 |
the details because I don't use those systems. It should be straightforward to |
the details because I don't use those systems. It should be straightforward to |
314 |
build PCRE on any system that has a Standard C compiler, because it uses only |
build PCRE on any system that has a Standard C compiler and library, because it |
315 |
Standard C functions. |
uses only Standard C functions. |
316 |
|
|
317 |
|
|
318 |
Testing PCRE |
Testing PCRE |
331 |
The RunTest script runs the pcretest test program (which is documented in its |
The RunTest script runs the pcretest test program (which is documented in its |
332 |
own man page) on each of the testinput files (in the testdata directory) in |
own man page) on each of the testinput files (in the testdata directory) in |
333 |
turn, and compares the output with the contents of the corresponding testoutput |
turn, and compares the output with the contents of the corresponding testoutput |
334 |
file. A file called testtry is used to hold the main output from pcretest |
files. A file called testtry is used to hold the main output from pcretest |
335 |
(testsavedregex is also used as a working file). To run pcretest on just one of |
(testsavedregex is also used as a working file). To run pcretest on just one of |
336 |
the test files, give its number as an argument to RunTest, for example: |
the test files, give its number as an argument to RunTest, for example: |
337 |
|
|
338 |
RunTest 2 |
RunTest 2 |
339 |
|
|
340 |
The first file can also be fed directly into the perltest script to check that |
The first test file can also be fed directly into the perltest script to check |
341 |
Perl gives the same results. The only difference you should see is in the first |
that Perl gives the same results. The only difference you should see is in the |
342 |
few lines, where the Perl version is given instead of the PCRE version. |
first few lines, where the Perl version is given instead of the PCRE version. |
343 |
|
|
344 |
The second set of tests check pcre_fullinfo(), pcre_info(), pcre_study(), |
The second set of tests check pcre_fullinfo(), pcre_info(), pcre_study(), |
345 |
pcre_copy_substring(), pcre_get_substring(), pcre_get_substring_list(), error |
pcre_copy_substring(), pcre_get_substring(), pcre_get_substring_list(), error |
448 |
pcre_globals.c ) and some internal functions that they use |
pcre_globals.c ) and some internal functions that they use |
449 |
pcre_info.c ) |
pcre_info.c ) |
450 |
pcre_maketables.c ) |
pcre_maketables.c ) |
451 |
|
pcre_newline.c ) |
452 |
pcre_ord2utf8.c ) |
pcre_ord2utf8.c ) |
453 |
pcre_printint.c ) |
pcre_refcount.c ) |
454 |
pcre_study.c ) |
pcre_study.c ) |
455 |
pcre_tables.c ) |
pcre_tables.c ) |
456 |
pcre_try_flipped.c ) |
pcre_try_flipped.c ) |
457 |
pcre_ucp_findchar.c ) |
pcre_ucp_searchfuncs.c) |
458 |
pcre_valid_utf8.c ) |
pcre_valid_utf8.c ) |
459 |
pcre_version.c ) |
pcre_version.c ) |
460 |
pcre_xclass.c ) |
pcre_xclass.c ) |
461 |
|
|
462-463 |
ucp_findchar.c ) |
ucp.h ) source for the code that is used for |
ucpinternal.h ) Unicode property handling |
ucptable.c ) |
ucptypetable.c ) |
pcre_printint.src ) debugging function that is #included in pcretest, and |
) can also be #included in pcre_compile() |
464 |
|
|
465 |
pcre.in "source" for the header for the external API; pcre.h |
is built from this by "configure" |
pcre.h the public PCRE header file |
466 |
pcreposix.h header for the external POSIX wrapper API |
pcreposix.h header for the external POSIX wrapper API |
467 |
pcre_internal.h header for internal use |
pcre_internal.h header for internal use |
468 |
|
ucp.h ) headers concerned with |
469 |
|
ucpinternal.h ) Unicode property handling |
470 |
|
ucptable.h ) (this one is the data table) |
471 |
config.in template for config.h, which is built by configure |
config.in template for config.h, which is built by configure |
472 |
|
|
473 |
pcrecpp.h the header file for the C++ wrapper |
pcrecpp.h the header file for the C++ wrapper |
494 |
RunGrepTest.in template for a Unix shell script for pcregrep tests |
RunGrepTest.in template for a Unix shell script for pcregrep tests |
495 |
config.guess ) files used by libtool, |
config.guess ) files used by libtool, |
496 |
config.sub ) used only when building a shared library |
config.sub ) used only when building a shared library |
497 |
|
config.h.in "source" for the config.h header file |
498 |
configure a configuring shell script (built by autoconf) |
configure a configuring shell script (built by autoconf) |
499 |
configure.in the autoconf input used to build configure |
configure.ac the autoconf input used to build configure |
500 |
doc/Tech.Notes notes on the encoding |
doc/Tech.Notes notes on the encoding |
501 |
doc/*.3 man page sources for the PCRE functions |
doc/*.3 man page sources for the PCRE functions |
502 |
doc/*.1 man page sources for pcregrep and pcretest |
doc/*.1 man page sources for pcregrep and pcretest |
510 |
mkinstalldirs script for making install directories |
mkinstalldirs script for making install directories |
511 |
pcretest.c comprehensive test program |
pcretest.c comprehensive test program |
512 |
pcredemo.c simple demonstration of coding calls to PCRE |
pcredemo.c simple demonstration of coding calls to PCRE |
513 |
perltest Perl test program |
perltest.pl Perl test program |
514 |
pcregrep.c source of a grep utility that uses PCRE |
pcregrep.c source of a grep utility that uses PCRE |
515 |
pcre-config.in source of script which retains PCRE information |
pcre-config.in source of script which retains PCRE information |
516 |
pcrecpp_unittest.c ) |
pcrecpp_unittest.c ) |
524 |
|
|
525 |
libpcre.def |
libpcre.def |
526 |
libpcreposix.def |
libpcreposix.def |
|
pcre.def |
|
527 |
|
|
528 |
(D) Auxiliary file for VPASCAL |
(D) Auxiliary file for VPASCAL |
529 |
|
|
532 |
Philip Hazel |
Philip Hazel |
533 |
Email local part: ph10 |
Email local part: ph10 |
534 |
Email domain: cam.ac.uk |
Email domain: cam.ac.uk |
535 |
January 2006 |
March 2007 |