7 |
|
|
8 |
Please read the NEWS file if you are upgrading from a previous release. |
Please read the NEWS file if you are upgrading from a previous release. |
9 |
|
|
10 |
PCRE has its own native API, but a set of "wrapper" functions that are based on |
|
11 |
the POSIX API are also supplied in the library libpcreposix. Note that this |
The PCRE APIs |
12 |
just provides a POSIX calling interface to PCRE: the regular expressions |
------------- |
13 |
themselves still follow Perl syntax and semantics. The header file |
|
14 |
for the POSIX-style functions is called pcreposix.h. The official POSIX name is |
PCRE is written in C, and it has its own API. The distribution now includes a |
15 |
regex.h, but I didn't want to risk possible problems with existing files of |
set of C++ wrapper functions, courtesy of Google Inc. (see the pcrecpp man page |
16 |
that name by distributing it that way. To use it with an existing program that |
for details). |
17 |
uses the POSIX API, it will have to be renamed or pointed at by a link. |
|
18 |
|
Also included is a set of C wrapper functions that are based on the POSIX |
19 |
|
API. These end up in the library called libpcreposix. Note that this just |
20 |
|
provides a POSIX calling interface to PCRE: the regular expressions themselves |
21 |
|
still follow Perl syntax and semantics. The header file for the POSIX-style |
22 |
|
functions is called pcreposix.h. The official POSIX name is regex.h, but I |
23 |
|
didn't want to risk possible problems with existing files of that name by |
24 |
|
distributing it that way. To use it with an existing program that uses the |
25 |
|
POSIX API, it will have to be renamed or pointed at by a link. |
26 |
|
|
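As an illustration of the POSIX-style calling interface described above, here
is a minimal sketch in C. It is not taken from the PCRE distribution. The
function name posix_style_match and the macro USE_PCREPOSIX are hypothetical,
introduced only for this example; the sketch also assumes that REG_EXTENDED is
defined by whichever header is included. Without the macro, the code uses the
system's regex.h; with it, the same calls are assumed to go through
pcreposix.h, in which case the pattern follows Perl syntax and semantics.

```c
#include <stddef.h>

#ifdef USE_PCREPOSIX            /* hypothetical switch, not a PCRE macro */
#include <pcreposix.h>          /* PCRE's POSIX-style header */
#else
#include <regex.h>              /* the system's own POSIX regex header */
#endif

/* Returns 1 if subject matches pattern, 0 if it does not, and -1 if the
   pattern cannot be compiled. No substring offsets are requested. */
int posix_style_match(const char *pattern, const char *subject)
{
    regex_t re;
    int rc;

    if (regcomp(&re, pattern, REG_EXTENDED) != 0)
        return -1;
    rc = regexec(&re, subject, 0, NULL, 0);
    regfree(&re);
    return rc == 0 ? 1 : 0;
}
```

A call such as posix_style_match("a+b", "xaab") exercises the same three
calls (regcomp, regexec, regfree) whichever header is in use; only the header
and the libraries named at link time differ.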
27 |
If you are using the POSIX interface to PCRE and there is already a POSIX regex |
If you are using the POSIX interface to PCRE and there is already a POSIX regex |
28 |
library installed on your system, you must take care when linking programs to |
library installed on your system, you must take care when linking programs to |
120 |
|
|
121 |
on the "configure" command. |
on the "configure" command. |
122 |
|
|
123 |
. PCRE has a counter which can be set to limit the amount of resources it uses. |
. PCRE has a counter that can be set to limit the amount of resources it uses. |
124 |
If the limit is exceeded during a match, the match fails. The default is ten |
If the limit is exceeded during a match, the match fails. The default is ten |
125 |
million. You can change the default by setting, for example, |
million. You can change the default by setting, for example, |
126 |
|
|
138 |
is a representation of the compiled pattern, and this changes with the link |
is a representation of the compiled pattern, and this changes with the link |
139 |
size. |
size. |
140 |
|
|
141 |
. You can build PCRE so that its match() function does not call itself |
. You can build PCRE so that its internal match() function that is called from |
142 |
recursively. Instead, it uses blocks of data from the heap via special |
pcre_exec() does not call itself recursively. Instead, it uses blocks of data |
143 |
functions pcre_stack_malloc() and pcre_stack_free() to save data that would |
from the heap via special functions pcre_stack_malloc() and pcre_stack_free() |
144 |
otherwise be saved on the stack. To build PCRE like this, use |
to save data that would otherwise be saved on the stack. To build PCRE like |
145 |
|
this, use |
146 |
|
|
147 |
--disable-stack-for-recursion |
--disable-stack-for-recursion |
148 |
|
|
149 |
on the "configure" command. PCRE runs more slowly in this mode, but it may be |
on the "configure" command. PCRE runs more slowly in this mode, but it may be |
150 |
necessary in environments with limited stack sizes. |
necessary in environments with limited stack sizes. This applies only to the |
151 |
|
pcre_exec() function; it does not apply to pcre_dfa_exec(), which does not |
152 |
|
use deeply nested recursion. |
153 |
|
|
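The general idea of keeping backtracking data in heap blocks instead of on the
stack can be sketched as follows. This is a toy wildcard matcher written for
illustration only; it is not PCRE's match() function and shares no code with
it. The names frame_malloc, frame_free, frame, and heap_match are
hypothetical. The two function pointers merely play a role similar to the
allocation hooks mentioned above.

```c
#include <stdlib.h>
#include <stddef.h>

/* Hypothetical allocator hooks for this sketch; not PCRE symbols. */
static void *(*frame_malloc)(size_t) = malloc;
static void (*frame_free)(void *) = free;

/* One saved backtracking point, held on the heap rather than the stack. */
typedef struct frame {
    const char *pat;            /* where to resume in the pattern */
    const char *sub;            /* where to resume in the subject */
    struct frame *prev;         /* previously saved point, or NULL */
} frame;

/* '*' matches any run of characters; any other character matches itself.
   Returns 1 on a match, 0 on no match, -1 if a frame cannot be allocated. */
int heap_match(const char *pat, const char *sub)
{
    frame *top = NULL;
    frame *f;
    int result;

    for (;;) {
        if (*pat == '*') {
            /* Save the alternative in which '*' absorbs one more character. */
            if (*sub != '\0') {
                f = (frame *)frame_malloc(sizeof *f);
                if (f == NULL) { result = -1; break; }
                f->pat = pat;
                f->sub = sub + 1;
                f->prev = top;
                top = f;
            }
            pat++;              /* first try: '*' absorbs nothing further */
            continue;
        }
        if (*pat == '\0' && *sub == '\0') { result = 1; break; }
        if (*pat != '\0' && *pat == *sub) { pat++; sub++; continue; }

        /* Mismatch: resume from the most recently saved point, if any. */
        if (top == NULL) { result = 0; break; }
        f = top;
        pat = f->pat;
        sub = f->sub;
        top = f->prev;
        frame_free(f);
    }

    /* Release any frames that are still outstanding. */
    while (top != NULL) {
        f = top;
        top = f->prev;
        frame_free(f);
    }
    return result;
}
```

The loop never calls itself, so stack use stays constant however long the
subject is. The cost is one heap allocation per saved point, which is
consistent with the remark above that this mode runs more slowly.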
154 |
|
The "configure" script builds eight files for the basic C library: |
155 |
|
|
156 |
|
. pcre.h is the header file for C programs that call PCRE |
157 |
|
. Makefile is the makefile that builds the library |
158 |
|
. config.h contains build-time configuration options for the library |
159 |
|
. pcre-config is a script that shows the settings of "configure" options |
160 |
|
. libpcre.pc is data for the pkg-config command |
161 |
|
. libtool is a script that builds shared and/or static libraries |
162 |
|
. RunTest is a script for running tests on the library |
163 |
|
. RunGrepTest is a script for running tests on the pcregrep command |
164 |
|
|
165 |
The "configure" script builds seven files: |
In addition, if a C++ compiler is found, the following are also built: |
166 |
|
|
167 |
. pcre.h is built by copying pcre.in and making substitutions |
. pcrecpp.h is the header file for programs that call PCRE via the C++ wrapper |
168 |
. Makefile is built by copying Makefile.in and making substitutions. |
. pcre_stringpiece.h is the header for the C++ "stringpiece" functions |
|
. config.h is built by copying config.in and making substitutions. |
|
|
. pcre-config is built by copying pcre-config.in and making substitutions. |
|
|
. libpcre.pc is data for the pkg-config command, built from libpcre.pc.in |
|
|
. libtool is a script that builds shared and/or static libraries |
|
|
. RunTest is a script for running tests |
|
169 |
|
|
170 |
Once "configure" has run, you can run "make". It builds two libraries called |
The "configure" script also creates config.status, which is an executable |
171 |
|
script that can be run to recreate the configuration, and config.log, which |
172 |
|
contains compiler output from tests that "configure" runs. |
173 |
|
|
174 |
|
Once "configure" has run, you can run "make". It builds two libraries, called |
175 |
libpcre and libpcreposix, a test program called pcretest, and the pcregrep |
libpcre and libpcreposix, a test program called pcretest, and the pcregrep |
176 |
command. You can use "make install" to copy these, the public header files |
command. If a C++ compiler was found on your system, it also builds the C++ |
177 |
pcre.h and pcreposix.h, and the man pages to appropriate live directories on |
wrapper library, which is called libpcrecpp, and some test programs called |
178 |
your system, in the normal way. |
pcrecpp_unittest, pcre_scanner_unittest, and pcre_stringpiece_unittest. |
179 |
|
|
180 |
|
The command "make test" runs all the appropriate tests. Details of the PCRE |
181 |
|
tests are given in a separate section of this document, below. |
182 |
|
|
183 |
|
You can use "make install" to copy the libraries, the public header files |
184 |
|
pcre.h, pcreposix.h, pcrecpp.h, and pcre_stringpiece.h (the last two only if |
185 |
|
the C++ wrapper was built), and the man pages to appropriate live directories |
186 |
|
on your system, in the normal way. |
187 |
|
|
188 |
|
If you want to remove PCRE from your system, you can run "make uninstall". |
189 |
|
This removes all the files that "make install" installed. However, it does not |
190 |
|
remove any directories, because these are often shared with other programs. |
191 |
|
|
192 |
|
|
193 |
Retrieving configuration information on Unix-like systems |
Retrieving configuration information on Unix-like systems |
220 |
Shared libraries on Unix-like systems |
Shared libraries on Unix-like systems |
221 |
------------------------------------- |
------------------------------------- |
222 |
|
|
223 |
The default distribution builds PCRE as two shared libraries and two static |
The default distribution builds PCRE as shared libraries and static libraries, |
224 |
libraries, as long as the operating system supports shared libraries. Shared |
as long as the operating system supports shared libraries. Shared library |
225 |
library support relies on the "libtool" script which is built as part of the |
support relies on the "libtool" script which is built as part of the |
226 |
"configure" process. |
"configure" process. |
227 |
|
|
228 |
The libtool script is used to compile and link both shared and static |
The libtool script is used to compile and link both shared and static |
251 |
process, the dftables.c source file is compiled *and run* on the local host, in |
process, the dftables.c source file is compiled *and run* on the local host, in |
252 |
order to generate the default character tables (the chartables.c file). It |
order to generate the default character tables (the chartables.c file). It |
253 |
therefore needs to be compiled with the local compiler, not the cross compiler. |
therefore needs to be compiled with the local compiler, not the cross compiler. |
254 |
You can do this by specifying CC_FOR_BUILD (and if necessary CFLAGS_FOR_BUILD) |
You can do this by specifying CC_FOR_BUILD (and if necessary CFLAGS_FOR_BUILD; |
255 |
|
there are also CXX_FOR_BUILD and CXXFLAGS_FOR_BUILD for the C++ wrapper) |
256 |
when calling the "configure" command. If they are not specified, they default |
when calling the "configure" command. If they are not specified, they default |
257 |
to the values of CC and CFLAGS. |
to the values of CC and CFLAGS. |
258 |
|
|
274 |
------------ |
------------ |
275 |
|
|
276 |
To test PCRE on a Unix system, run the RunTest script that is created by the |
To test PCRE on a Unix system, run the RunTest script that is created by the |
277 |
configuring process. (This can also be run by "make runtest", "make check", or |
configuring process. There is also a script called RunGrepTest that tests the |
278 |
"make test".) For other systems, see the instructions in NON-UNIX-USE. |
options of the pcregrep command. If the C++ wrapper library is built, three |
279 |
|
test programs called pcrecpp_unittest, pcre_scanner_unittest, and |
280 |
The script runs the pcretest test program (which is documented in its own man |
pcre_stringpiece_unittest are provided. |
281 |
page) on each of the testinput files (in the testdata directory) in turn, |
|
282 |
and compares the output with the contents of the corresponding testoutput file. |
Both the scripts and all the program tests are run if you obey "make runtest", |
283 |
A file called testtry is used to hold the main output from pcretest |
"make check", or "make test". For other systems, see the instructions in |
284 |
|
NON-UNIX-USE. |
285 |
|
|
286 |
|
The RunTest script runs the pcretest test program (which is documented in its |
287 |
|
own man page) on each of the testinput files (in the testdata directory) in |
288 |
|
turn, and compares the output with the contents of the corresponding testoutput |
289 |
|
file. A file called testtry is used to hold the main output from pcretest |
290 |
(testsavedregex is also used as a working file). To run pcretest on just one of |
(testsavedregex is also used as a working file). To run pcretest on just one of |
291 |
the test files, give its number as an argument to RunTest, for example: |
the test files, give its number as an argument to RunTest, for example: |
292 |
|
|
334 |
The fifth test checks error handling with UTF-8 encoding, and internal UTF-8 |
The fifth test checks error handling with UTF-8 encoding, and internal UTF-8 |
335 |
features of PCRE that are not relevant to Perl. |
features of PCRE that are not relevant to Perl. |
336 |
|
|
337 |
The sixth and final test checks the support for Unicode character properties. |
The sixth test checks the support for Unicode character properties. It is |
338 |
It is not run automatically unless PCRE is built with Unicode property support. |
not run automatically unless PCRE is built with Unicode property support. To do |
339 |
To do this you must set --enable-unicode-properties when running "configure". |
this you must set --enable-unicode-properties when running "configure". |
340 |
|
|
341 |
|
The seventh, eighth, and ninth tests check the pcre_dfa_exec() alternative |
342 |
|
matching function, in non-UTF-8 mode, UTF-8 mode, and UTF-8 mode with Unicode |
343 |
|
property support, respectively. The eighth and ninth tests are not run |
344 |
|
automatically unless PCRE is built with the relevant support. |
345 |
|
|
346 |
|
|
347 |
Character tables |
Character tables |
393 |
|
|
394 |
dftables.c auxiliary program for building chartables.c |
dftables.c auxiliary program for building chartables.c |
395 |
|
|
|
get.c ) |
|
|
maketables.c ) |
|
|
study.c ) source of the functions |
|
|
pcre.c ) in the library |
|
396 |
pcreposix.c ) |
pcreposix.c ) |
397 |
printint.c ) |
pcre_compile.c ) |
398 |
|
pcre_config.c ) |
399 |
|
pcre_dfa_exec.c ) |
400 |
|
pcre_exec.c ) |
401 |
|
pcre_fullinfo.c ) |
402 |
|
pcre_get.c ) sources for the functions in the library, |
403 |
|
pcre_globals.c ) and some internal functions that they use |
404 |
|
pcre_info.c ) |
405 |
|
pcre_maketables.c ) |
406 |
|
pcre_ord2utf8.c ) |
407 |
|
pcre_printint.c ) |
408 |
|
pcre_study.c ) |
409 |
|
pcre_tables.c ) |
410 |
|
pcre_try_flipped.c ) |
411 |
|
pcre_ucp_findchar.c ) |
412 |
|
pcre_valid_utf8.c ) |
413 |
|
pcre_version.c ) |
414 |
|
pcre_xclass.c ) |
415 |
|
|
416 |
ucp.c ) |
ucp_findchar.c ) |
417 |
ucp.h ) source for the code that is used for |
ucp.h ) source for the code that is used for |
418 |
ucpinternal.h ) Unicode property handling |
ucpinternal.h ) Unicode property handling |
419 |
ucptable.c ) |
ucptable.c ) |
422 |
pcre.in "source" for the header for the external API; pcre.h |
pcre.in "source" for the header for the external API; pcre.h |
423 |
is built from this by "configure" |
is built from this by "configure" |
424 |
pcreposix.h header for the external POSIX wrapper API |
pcreposix.h header for the external POSIX wrapper API |
425 |
internal.h header for internal use |
pcre_internal.h header for internal use |
426 |
config.in template for config.h, which is built by configure |
config.in template for config.h, which is built by configure |
427 |
|
|
428 |
|
pcrecpp.h.in "source" for the header file for the C++ wrapper |
429 |
|
pcrecpp.cc ) |
430 |
|
pcre_scanner.cc ) source for the C++ wrapper library |
431 |
|
|
432 |
|
pcre_stringpiece.h.in "source" for pcre_stringpiece.h, the header for the |
433 |
|
C++ stringpiece functions |
434 |
|
pcre_stringpiece.cc source for the C++ stringpiece functions |
435 |
|
|
436 |
(B) Auxiliary files: |
(B) Auxiliary files: |
437 |
|
|
438 |
AUTHORS information about the author of PCRE |
AUTHORS information about the author of PCRE |
445 |
NON-UNIX-USE notes on building PCRE on non-Unix systems |
NON-UNIX-USE notes on building PCRE on non-Unix systems |
446 |
README this file |
README this file |
447 |
RunTest.in template for a Unix shell script for running tests |
RunTest.in template for a Unix shell script for running tests |
448 |
|
RunGrepTest.in template for a Unix shell script for pcregrep tests |
449 |
config.guess ) files used by libtool, |
config.guess ) files used by libtool, |
450 |
config.sub ) used only when building a shared library |
config.sub ) used only when building a shared library |
451 |
configure a configuring shell script (built by autoconf) |
configure a configuring shell script (built by autoconf) |
466 |
perltest Perl test program |
perltest Perl test program |
467 |
pcregrep.c source of a grep utility that uses PCRE |
pcregrep.c source of a grep utility that uses PCRE |
468 |
pcre-config.in source of script which retains PCRE information |
pcre-config.in source of script which retains PCRE information |
469 |
testdata/testinput1 test data, compatible with Perl |
pcrecpp_unittest.cc ) |
470 |
testdata/testinput2 test data for error messages and non-Perl things |
pcre_scanner_unittest.cc ) test programs for the C++ wrapper |
471 |
testdata/testinput3 test data for locale-specific tests |
pcre_stringpiece_unittest.cc ) |
472 |
testdata/testinput4 test data for UTF-8 tests compatible with Perl |
testdata/testinput* test data for main library tests |
473 |
testdata/testinput5 test data for other UTF-8 tests |
testdata/testoutput* expected test results |
474 |
testdata/testinput6 test data for Unicode property support tests |
testdata/grep* input and output for pcregrep tests |
|
testdata/testoutput1 test results corresponding to testinput1 |
|
|
testdata/testoutput2 test results corresponding to testinput2 |
|
|
testdata/testoutput3 test results corresponding to testinput3 |
|
|
testdata/testoutput4 test results corresponding to testinput4 |
|
|
testdata/testoutput5 test results corresponding to testinput5 |
|
|
testdata/testoutput6 test results corresponding to testinput6 |
|
475 |
|
|
476 |
(C) Auxiliary files for Win32 DLL |
(C) Auxiliary files for Win32 DLL |
477 |
|
|
|
dll.mk |
|
478 |
libpcre.def |
libpcre.def |
479 |
libpcreposix.def |
libpcreposix.def |
480 |
pcre.def |
pcre.def |
483 |
|
|
484 |
makevp.bat |
makevp.bat |
485 |
|
|
486 |
Philip Hazel <ph10@cam.ac.uk> |
Philip Hazel |
487 |
September 2004 |
Email local part: ph10 |
488 |
|
Email domain: cam.ac.uk |
489 |
|
June 2005 |