37 |
|
|
38 |
ChangeLog log of changes to the code |
ChangeLog log of changes to the code |
39 |
LICENCE conditions for the use of PCRE |
LICENCE conditions for the use of PCRE |
40 |
Makefile for building PCRE |
Makefile for building PCRE in Unix systems |
41 |
README this file |
README this file |
42 |
RunTest a shell script for running tests |
RunTest a Unix shell script for running tests |
43 |
Tech.Notes notes on the encoding |
Tech.Notes notes on the encoding |
44 |
pcre.3 man page for the functions |
pcre.3 man page source for the functions |
45 |
pcreposix.3 man page for the POSIX wrapper API |
pcre.3.txt plain text version |
46 |
|
pcre.3.html HTML version |
47 |
|
pcreposix.3 man page source for the POSIX wrapper API |
48 |
|
pcreposix.3.txt plain text version |
49 |
|
pcreposix.3.HTML HTML version |
50 |
dftables.c auxiliary program for building chartables.c |
dftables.c auxiliary program for building chartables.c |
51 |
get.c ) |
get.c ) |
52 |
maketables.c ) |
maketables.c ) |
57 |
pcreposix.h header for the external POSIX wrapper API |
pcreposix.h header for the external POSIX wrapper API |
58 |
internal.h header for internal use |
internal.h header for internal use |
59 |
pcretest.c test program |
pcretest.c test program |
60 |
pgrep.1 man page for pgrep |
pgrep.1 man page source for pgrep |
61 |
|
pgrep.1.txt plain text version |
62 |
|
pgrep.1.HTML HTML version |
63 |
pgrep.c source of a grep utility that uses PCRE |
pgrep.c source of a grep utility that uses PCRE |
64 |
perltest Perl test program |
perltest Perl test program |
65 |
testinput1 test data, compatible with Perl 5.004 and 5.005 |
testinput1 test data, compatible with Perl 5.004 and 5.005 |
70 |
testoutput2 test results corresponding to testinput2 |
testoutput2 test results corresponding to testinput2 |
71 |
testoutput3 test results corresponding to testinput3 |
testoutput3 test results corresponding to testinput3 |
72 |
testoutput4 test results corresponding to testinput4 |
testoutput4 test results corresponding to testinput4 |
73 |
|
dll.mk for Win32 DLL |
74 |
|
pcre.def ditto |
75 |
|
|
76 |
To build PCRE, edit Makefile for your system (it is a fairly simple make file, |
To build PCRE on a Unix system, first edit Makefile for your system. It is a |
77 |
and there are some comments at the top) and then run it. It builds two |
fairly simple make file, and there are some comments near the top, after the |
78 |
libraries called libpcre.a and libpcreposix.a, a test program called pcretest, |
text "On a Unix system". Then run "make". It builds two libraries called |
79 |
and the pgrep command. |
libpcre.a and libpcreposix.a, a test program called pcretest, and the pgrep |
80 |
|
command. You can use "make install" to copy these, and the public header file |
81 |
To test PCRE, run the RunTest script in the pcre directory. This runs pcretest |
pcre.h, to appropriate live directories on your system. These installation |
82 |
on each of the testinput files in turn, and compares the output with the |
directories are defined at the top of the Makefile, and you should edit them if |
83 |
|
necessary. |
84 |
|
|
85 |
|
For a non-Unix system, read the comments at the top of Makefile, which give |
86 |
|
some hints on what needs to be done. PCRE has been compiled on Windows systems |
87 |
|
and on Macintoshes, but I don't know the details as I don't use those systems. |
88 |
|
It should be straightforward to build PCRE on any system that has a Standard C |
89 |
|
compiler. |
90 |
|
|
91 |
|
Some help in building a Win32 DLL of PCRE in GnuWin32 environments was |
92 |
|
contributed by Paul.Sokolovsky@technologist.com. These environments are |
93 |
|
Mingw32 (http://www.xraylith.wisc.edu/~khan/software/gnu-win32/) and |
94 |
|
CygWin (http://sourceware.cygnus.com/cygwin/). Paul comments: |
95 |
|
|
96 |
|
For CygWin, set CFLAGS=-mno-cygwin, and do 'make dll'. You'll get |
97 |
|
pcre.dll (containing pcreposix also), libpcre.dll.a, and dynamically |
98 |
|
linked pgrep and pcretest. If you have /bin/sh, run RunTest (three |
99 |
|
main test go ok, locale not supported). |
100 |
|
|
101 |
|
To test PCRE, run the RunTest script in the pcre directory. This can also be |
102 |
|
run by "make runtest". It runs the pcretest test program (which is documented |
103 |
|
below) on each of the testinput files in turn, and compares the output with the |
104 |
contents of the corresponding testoutput file. A file called testtry is used to |
contents of the corresponding testoutput file. A file called testtry is used to |
105 |
hold the output from pcretest (which is documented below). |
hold the output from pcretest. To run pcretest on just one of the test files, |
106 |
|
give its number as an argument to RunTest, for example: |
|
To run pcretest on just one of the test files, give its number as an argument |
|
|
to RunTest, for example: |
|
107 |
|
|
108 |
RunTest 3 |
RunTest 3 |
109 |
|
|
110 |
The first and third test files can also be fed directly into the perltest |
The first and third test files can also be fed directly into the perltest |
111 |
program to check that Perl gives the same results. The third file requires the |
script to check that Perl gives the same results. The third file requires the |
112 |
additional features of release 5.005, which is why it is kept separate from the |
additional features of release 5.005, which is why it is kept separate from the |
113 |
main test input, which needs only Perl 5.004. In the long run, when 5.005 is |
main test input, which needs only Perl 5.004. In the long run, when 5.005 is |
114 |
widespread, these two test files may get amalgamated. |
widespread, these two test files may get amalgamated. |
130 |
in the comparison output, it means that locale is not available on your system, |
in the comparison output, it means that locale is not available on your system, |
131 |
despite being listed by "locale". This does not mean that PCRE is broken. |
despite being listed by "locale". This does not mean that PCRE is broken. |
132 |
|
|
|
To install PCRE, copy libpcre.a to any suitable library directory (e.g. |
|
|
/usr/local/lib), pcre.h to any suitable include directory (e.g. |
|
|
/usr/local/include), and pcre.3 to any suitable man directory (e.g. |
|
|
/usr/local/man/man3). |
|
|
|
|
|
To install the pgrep command, copy it to any suitable binary directory, (e.g. |
|
|
/usr/local/bin) and pgrep.1 to any suitable man directory (e.g. |
|
|
/usr/local/man/man1). |
|
|
|
|
133 |
PCRE has its own native API, but a set of "wrapper" functions that are based on |
PCRE has its own native API, but a set of "wrapper" functions that are based on |
134 |
the POSIX API are also supplied in the library libpcreposix.a. Note that this |
the POSIX API are also supplied in the library libpcreposix.a. Note that this |
135 |
just provides a POSIX calling interface to PCRE: the regular expressions |
just provides a POSIX calling interface to PCRE: the regular expressions |
232 |
/E, and /X set PCRE_ANCHORED, PCRE_DOLLAR_ENDONLY, and PCRE_EXTRA respectively. |
/E, and /X set PCRE_ANCHORED, PCRE_DOLLAR_ENDONLY, and PCRE_EXTRA respectively. |
233 |
|
|
234 |
Searching for all possible matches within each subject string can be requested |
Searching for all possible matches within each subject string can be requested |
235 |
by the /g or /G modifier. The /g modifier behaves similarly to the way it does |
by the /g or /G modifier. After finding a match, PCRE is called again to search |
236 |
in Perl. After finding a match, PCRE is called again to search the remainder of |
the remainder of the subject string. The difference between /g and /G is that |
237 |
the subject string. The difference between /g and /G is that the former uses |
the former uses the startoffset argument to pcre_exec() to start searching at |
238 |
the start_offset argument to pcre_exec() to start searching at a new point |
a new point within the entire string (which is in effect what Perl does), |
239 |
within the entire string, whereas the latter passes over a shortened substring. |
whereas the latter passes over a shortened substring. This makes a difference |
240 |
This makes a difference to the matching process if the pattern begins with a |
to the matching process if the pattern begins with a lookbehind assertion |
241 |
lookbehind assertion (including \b or \B). |
(including \b or \B). |
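
The difference between the two search styles can be illustrated with a small sketch. It uses Python's re module as a stand-in, not PCRE itself, on the assumption that searching from an offset in the whole string corresponds to /g and searching a sliced substring corresponds to /G. The function names, the subject string, and the patterns are hypothetical and chosen only for illustration.

```python
import re

# Python's re module stands in for PCRE here; this is not the PCRE API.
subject = "concat cat"
offset = 3  # index of the "cat" that is the tail of "concat"


def search_with_offset(pattern, text, start):
    """Like /g: search the entire string, beginning at index start."""
    m = pattern.search(text, start)
    return None if m is None else m.start()


def search_shortened(pattern, text, start):
    """Like /G: search only the substring that begins at index start."""
    m = pattern.search(text[start:])
    return None if m is None else m.start() + start


word_start = re.compile(r"\bcat")      # begins with a word-boundary test
after_con = re.compile(r"(?<=con)cat")  # begins with a lookbehind assertion

# With the whole string visible, \b sees the "n" before index 3 and fails
# there, so the first match is the separate word "cat".
boundary_whole = search_with_offset(word_start, subject, offset)
# With a shortened substring, index 3 looks like the start of the text,
# so \b succeeds immediately.
boundary_short = search_shortened(word_start, subject, offset)

# The lookbehind can only succeed when the preceding text is visible.
lookbehind_whole = search_with_offset(after_con, subject, offset)
lookbehind_short = search_shortened(after_con, subject, offset)

print(boundary_whole, boundary_short)      # 7 3
print(lookbehind_whole, lookbehind_short)  # 3 None
```

In both cases the two styles disagree only because the pattern inspects characters before the starting point; a pattern with no leading assertion would give the same result either way.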
242 |
|
|
243 |
|
If any call to pcre_exec() in a /g or /G sequence matches an empty string, the |
244 |
|
next call is done with the PCRE_NOTEMPTY flag set so that it cannot match an |
245 |
|
empty string again. This imitates the way Perl handles such cases when using |
246 |
|
the /g modifier or the split() function. |
247 |
|
|
248 |
There are a number of other modifiers for controlling the way pcretest |
There are a number of other modifiers for controlling the way pcretest |
249 |
operates. |
operates. |
276 |
The /S modifier causes pcre_study() to be called after the expression has been |
The /S modifier causes pcre_study() to be called after the expression has been |
277 |
compiled, and the results used when the expression is matched. |
compiled, and the results used when the expression is matched. |
278 |
|
|
279 |
The /M modifier causes information about the size of memory block used to hold |
The /M modifier causes the size of memory block used to hold the compiled |
280 |
the compile pattern to be output. |
pattern to be output. |
281 |
|
|
282 |
Finally, the /P modifier causes pcretest to call PCRE via the POSIX wrapper API |
Finally, the /P modifier causes pcretest to call PCRE via the POSIX wrapper API |
283 |
rather than its native API. When this is done, all other modifiers except /i, |
rather than its native API. When this is done, all other modifiers except /i, |
306 |
\Gdd call pcre_get_substring() for substring dd after a successful match |
\Gdd call pcre_get_substring() for substring dd after a successful match |
307 |
(any decimal number less than 32) |
(any decimal number less than 32) |
308 |
\L call pcre_get_substringlist() after a successful match |
\L call pcre_get_substringlist() after a successful match |
309 |
|
\N pass the PCRE_NOTEMPTY option to pcre_exec() |
310 |
\Odd set the size of the output vector passed to pcre_exec() to dd |
\Odd set the size of the output vector passed to pcre_exec() to dd |
311 |
(any number of decimal digits) |
(any number of decimal digits) |
312 |
\Z pass the PCRE_NOTEOL option to pcre_exec() |
\Z pass the PCRE_NOTEOL option to pcre_exec() |
399 |
input patterns can be followed only by Perl's lower case modifiers. The |
input patterns can be followed only by Perl's lower case modifiers. The |
400 |
contents of testinput1 and testinput3 meet this condition. |
contents of testinput1 and testinput3 meet this condition. |
401 |
|
|
402 |
The data lines are processed as Perl strings, so if they contain $ or @ |
The data lines are processed as Perl double-quoted strings, so if they contain |
403 |
characters, these have to be escaped. For this reason, all such characters in |
" \ $ or @ characters, these have to be escaped. For this reason, all such |
404 |
testinput1 and testinput3 are escaped so that they can be used for perltest as |
characters in testinput1 and testinput3 are escaped so that they can be used |
405 |
well as for pcretest, and the special upper case modifiers such as /A that |
for perltest as well as for pcretest, and the special upper case modifiers such |
406 |
pcretest recognizes are not used in these files. The output should be |
as /A that pcretest recognizes are not used in these files. The output should |
407 |
identical, apart from the initial identifying banner. |
be identical, apart from the initial identifying banner. |
408 |
|
|
409 |
The testinput2 and testinput4 files are not suitable for feeding to perltest, |
The testinput2 and testinput4 files are not suitable for feeding to perltest, |
410 |
since they do make use of the special upper case modifiers and escapes that |
since they do make use of the special upper case modifiers and escapes that |
413 |
them correctly. |
them correctly. |
414 |
|
|
415 |
Philip Hazel <ph10@cam.ac.uk> |
Philip Hazel <ph10@cam.ac.uk> |
416 |
June 1999 |
July 1999 |