Diff of /code/trunk/README

revision 123 by ph10, Mon Mar 12 15:19:06 2007 UTC revision 981 by ph10, Mon Jun 18 18:22:51 2012 UTC
# Line 1  Line 1 
1  README file for PCRE (Perl-compatible regular expression library)  README file for PCRE (Perl-compatible regular expression library)
2  -----------------------------------------------------------------  -----------------------------------------------------------------
3    
4  The latest release of PCRE is always available from  The latest release of PCRE is always available in three alternative formats
5    from:
6    
7    ftp://ftp.csx.cam.ac.uk/pub/software/programming/pcre/pcre-xxx.tar.gz    ftp://ftp.csx.cam.ac.uk/pub/software/programming/pcre/pcre-xxx.tar.gz
8      ftp://ftp.csx.cam.ac.uk/pub/software/programming/pcre/pcre-xxx.tar.bz2
9      ftp://ftp.csx.cam.ac.uk/pub/software/programming/pcre/pcre-xxx.zip
10    
11  There is a mailing list for discussion about the development of PCRE at  There is a mailing list for discussion about the development of PCRE at
12    
# Line 15  The contents of this README file are: Line 18  The contents of this README file are:
18    The PCRE APIs    The PCRE APIs
19    Documentation for PCRE    Documentation for PCRE
20    Contributions by users of PCRE    Contributions by users of PCRE
21    Building PCRE on non-Unix systems    Building PCRE on non-Unix-like systems
22    Building PCRE on Unix-like systems    Building PCRE without using autotools
23    Retrieving configuration information on Unix-like systems    Building PCRE using autotools
24    Shared libraries on Unix-like systems    Retrieving configuration information
25    Cross-compiling on Unix-like systems    Shared libraries
26      Cross-compiling using autotools
27    Using HP's ANSI C++ compiler (aCC)    Using HP's ANSI C++ compiler (aCC)
28      Using PCRE from MySQL
29    Making new tarballs    Making new tarballs
30    Testing PCRE    Testing PCRE
31    Character tables    Character tables
# Line 30  The contents of this README file are: Line 35  The contents of this README file are:
35  The PCRE APIs  The PCRE APIs
36  -------------  -------------
37    
38  PCRE is written in C, and it has its own API. The distribution now includes a  PCRE is written in C, and it has its own API. There are two sets of functions,
39  set of C++ wrapper functions, courtesy of Google Inc. (see the pcrecpp man page  one for the 8-bit library, which processes strings of bytes, and one for the
40  for details).  16-bit library, which processes strings of 16-bit values. The distribution also
41    includes a set of C++ wrapper functions (see the pcrecpp man page for details),
42  Also included in the distribution are a set of C wrapper functions that are  courtesy of Google Inc., which can be used to call the 8-bit PCRE library from
43  based on the POSIX API. These end up in the library called libpcreposix. Note  C++.
44  that this just provides a POSIX calling interface to PCRE; the regular  
45  expressions themselves still follow Perl syntax and semantics. The POSIX API is  In addition, there is a set of C wrapper functions (again, just for the 8-bit
46  restricted, and does not give full access to all of PCRE's facilities.  library) that are based on the POSIX regular expression API (see the pcreposix
47    man page). These end up in the library called libpcreposix. Note that this just
48    provides a POSIX calling interface to PCRE; the regular expressions themselves
49    still follow Perl syntax and semantics. The POSIX API is restricted, and does
50    not give full access to all of PCRE's facilities.
51    
52  The header file for the POSIX-style functions is called pcreposix.h. The  The header file for the POSIX-style functions is called pcreposix.h. The
53  official POSIX name is regex.h, but I did not want to risk possible problems  official POSIX name is regex.h, but I did not want to risk possible problems
# Line 81  documentation is supplied in two other f Line 90  documentation is supplied in two other f
90       in various ways, and rooted in a file called index.html, is distributed in       in various ways, and rooted in a file called index.html, is distributed in
91       doc/html and installed in <prefix>/share/doc/pcre/html.       doc/html and installed in <prefix>/share/doc/pcre/html.
92    
93    Users of PCRE have contributed files containing the documentation for various
94    releases in CHM format. These can be found in the Contrib directory of the FTP
95    site (see next section).
96    
97    
98  Contributions by users of PCRE  Contributions by users of PCRE
99  ------------------------------  ------------------------------
# Line 91  You can find contributions from PCRE use Line 104  You can find contributions from PCRE use
104    
105  There is a README file giving brief descriptions of what they are. Some are  There is a README file giving brief descriptions of what they are. Some are
106  complete in themselves; others are pointers to URLs containing relevant files.  complete in themselves; others are pointers to URLs containing relevant files.
107  Some of this material is likely to be well out-of-date. In particular, several  Some of this material is likely to be well out-of-date. Several of the earlier
108  of the contributions provide support for compiling PCRE on various flavours of  contributions provided support for compiling PCRE on various flavours of
109  Windows (I myself do not use Windows), but nowadays there is more Windows  Windows (I myself do not use Windows). Nowadays there is more Windows support
110  support in the standard distribution.  in the standard distribution, so these contributions have been archived.
111    
112    
113  Building PCRE on non-Unix systems  Building PCRE on non-Unix-like systems
114  ---------------------------------  --------------------------------------
115    
116  For a non-Unix system, please read the comments in the file NON-UNIX-USE,  For a non-Unix-like system, please read the comments in the file
117  though if your system supports the use of "configure" and "make" you may be  NON-AUTOTOOLS-BUILD, though if your system supports the use of "configure" and
118  able to build PCRE in the same way as for Unix-like systems.  "make" you may be able to build PCRE using autotools in the same way as for
119    many Unix-like systems.
120    
121    PCRE can also be configured using the GUI facility provided by CMake's
122    cmake-gui command. This creates Makefiles, solution files, etc. The file
123    NON-AUTOTOOLS-BUILD has information about CMake.
124    
125  PCRE has been compiled on many different operating systems. It should be  PCRE has been compiled on many different operating systems. It should be
126  straightforward to build PCRE on any system that has a Standard C compiler and  straightforward to build PCRE on any system that has a Standard C compiler and
127  library, because it uses only Standard C functions.  library, because it uses only Standard C functions.
128    
129    
130  Building PCRE on Unix-like systems  Building PCRE without using autotools
131  ----------------------------------  -------------------------------------
132    
133    The use of autotools (in particular, libtool) is problematic in some
134    environments, even some that are Unix or Unix-like. See the NON-AUTOTOOLS-BUILD
135    file for ways of building PCRE without using autotools.
136    
137    
138    Building PCRE using autotools
139    -----------------------------
140    
141  If you are using HP's ANSI C++ compiler (aCC), please see the special note  If you are using HP's ANSI C++ compiler (aCC), please see the special note
142  in the section entitled "Using HP's ANSI C++ compiler (aCC)" below.  in the section entitled "Using HP's ANSI C++ compiler (aCC)" below.
143    
144  To build PCRE on a Unix-like system, first run the "configure" command from the  The following instructions assume the use of the widely used "configure; make;
145  PCRE distribution directory, with your current directory set to the directory  make install" (autotools) process.
146  where you want the files to be created. This command is a standard GNU  
147  "autoconf" configuration script, for which generic instructions are supplied in  To build PCRE on a system that supports autotools, first run the "configure"
148  the file INSTALL.  command from the PCRE distribution directory, with your current directory set
149    to the directory where you want the files to be created. This command is a
150    standard GNU "autoconf" configuration script, for which generic instructions
151    are supplied in the file INSTALL.
152    
153  Most commonly, people build PCRE within its own distribution directory, and in  Most commonly, people build PCRE within its own distribution directory, and in
154  this case, on many systems, just running "./configure" is sufficient. However,  this case, on many systems, just running "./configure" is sufficient. However,
# Line 127  the usual methods of changing standard d Line 156  the usual methods of changing standard d
156    
157  CFLAGS='-O2 -Wall' ./configure --prefix=/opt/local  CFLAGS='-O2 -Wall' ./configure --prefix=/opt/local
158    
159  specifies that the C compiler should be run with the flags '-O2 -Wall' instead  This command specifies that the C compiler should be run with the flags '-O2
160  of the default, and that "make install" should install PCRE under /opt/local  -Wall' instead of the default, and that "make install" should install PCRE
161  instead of the default /usr/local.  under /opt/local instead of the default /usr/local.
162    
163  If you want to build in a different directory, just run "configure" with that  If you want to build in a different directory, just run "configure" with that
164  directory as current. For example, suppose you have unpacked the PCRE source  directory as current. For example, suppose you have unpacked the PCRE source
# Line 143  possible to build it as a C++ library, t Line 172  possible to build it as a C++ library, t
172  does not have any features to support this.  does not have any features to support this.
173    
174  There are some optional features that can be included or omitted from the PCRE  There are some optional features that can be included or omitted from the PCRE
175  library. You can read more about them in the pcrebuild man page.  library. They are also documented in the pcrebuild man page.
176    
177    . By default, both shared and static libraries are built. You can change this
178      by adding one of these options to the "configure" command:
179    
180  . If you want to suppress the building of the C++ wrapper library, you can add    --disable-shared
181    --disable-cpp to the "configure" command. Otherwise, when "configure" is run,    --disable-static
182    it will try to find a C++ compiler and C++ header files, and if it succeeds, it  
183    will try to build the C++ wrapper.    (See also "Shared libraries" below.)
184    
185  . If you want to make use of the support for UTF-8 character strings in PCRE,  . By default, only the 8-bit library is built. If you add --enable-pcre16 to
186    you must add --enable-utf8 to the "configure" command. Without it, the code    the "configure" command, the 16-bit library is also built. If you want only
187    for handling UTF-8 is not included in the library. (Even when included, it    the 16-bit library, use "./configure --enable-pcre16 --disable-pcre8".
188    still has to be enabled by an option at run time.)  
189    . If you are building the 8-bit library and want to suppress the building of
190  . If, in addition to support for UTF-8 character strings, you want to include    the C++ wrapper library, you can add --disable-cpp to the "configure"
191    support for the \P, \p, and \X sequences that recognize Unicode character    command. Otherwise, when "configure" is run without --disable-pcre8, it will
192    properties, you must add --enable-unicode-properties to the "configure"    try to find a C++ compiler and C++ header files, and if it succeeds, it will
193    command. This adds about 30K to the size of the library (in the form of a    try to build the C++ wrapper.
194    property table); only the basic two-letter properties such as Lu are  
195    supported.  . If you want to include support for just-in-time compiling, which can give
196      large performance improvements on certain platforms, add --enable-jit to the
197      "configure" command. This support is available only for certain hardware
198      architectures. If you try to enable it on an unsupported architecture, there
199      will be a compile time error.
200    
201    . When JIT support is enabled, pcregrep automatically makes use of it, unless
202      you add --disable-pcregrep-jit to the "configure" command.
203    
204    . If you want to make use of the support for UTF-8 Unicode character strings in
205      the 8-bit library, or UTF-16 Unicode character strings in the 16-bit library,
206      you must add --enable-utf to the "configure" command. Without it, the code
207      for handling UTF-8 and UTF-16 is not included in the relevant library. Even
208      when --enable-utf is included, the use of a UTF encoding still has to be
209      enabled by an option at run time. When PCRE is compiled with this option, its
210      input can only be either ASCII or UTF-8/16, even when running on EBCDIC
211      platforms. It is not possible to use both --enable-utf and --enable-ebcdic at
212      the same time.
213    
214    . There are no separate options for enabling UTF-8 and UTF-16 independently
215      because that would allow ridiculous settings such as requesting UTF-16
216      support while building only the 8-bit library. However, the option
217      --enable-utf8 is retained for backwards compatibility with earlier releases
218      that did not support 16-bit character strings. It is synonymous with
219      --enable-utf. It is not possible to configure one library with UTF support
220      and the other without in the same configuration.
221    
222    . If, in addition to support for UTF-8/16 character strings, you want to
223      include support for the \P, \p, and \X sequences that recognize Unicode
224      character properties, you must add --enable-unicode-properties to the
225      "configure" command. This adds about 30K to the size of the library (in the
226      form of a property table); only the basic two-letter properties such as Lu
227      are supported.
228    
229  . You can build PCRE to recognize either CR or LF or the sequence CRLF or any  . You can build PCRE to recognize either CR or LF or the sequence CRLF or any
230    of the Unicode newline sequences as indicating the end of a line. Whatever    of the preceding, or any of the Unicode newline sequences as indicating the
231    you specify at build time is the default; the caller of PCRE can change the    end of a line. Whatever you specify at build time is the default; the caller
232    selection at run time. The default newline indicator is a single LF character    of PCRE can change the selection at run time. The default newline indicator
233    (the Unix standard). You can specify the default newline indicator by adding    is a single LF character (the Unix standard). You can specify the default
234    --newline-is-cr or --newline-is-lf or --newline-is-crlf or --newline-is-any    newline indicator by adding --enable-newline-is-cr or --enable-newline-is-lf
235    to the "configure" command, respectively.    or --enable-newline-is-crlf or --enable-newline-is-anycrlf or
236      --enable-newline-is-any to the "configure" command, respectively.
237    If you specify --newline-is-cr or --newline-is-crlf, some of the standard  
238    tests will fail, because the lines in the test files end with LF. Even if    If you specify --enable-newline-is-cr or --enable-newline-is-crlf, some of
239    the files are edited to change the line endings, there are likely to be some    the standard tests will fail, because the lines in the test files end with
240    failures. With --newline-is-any, many tests should succeed, but there may be    LF. Even if the files are edited to change the line endings, there are likely
241    some failures.    to be some failures. With --enable-newline-is-anycrlf or
242      --enable-newline-is-any, many tests should succeed, but there may be some
243      failures.
244    
245    . By default, the sequence \R in a pattern matches any Unicode line ending
246      sequence. This is independent of the option specifying what PCRE considers to
247      be the end of a line (see above). However, the caller of PCRE can restrict \R
248      to match only CR, LF, or CRLF. You can make this the default by adding
249      --enable-bsr-anycrlf to the "configure" command (bsr = "backslash R").
250    
251  . When called via the POSIX interface, PCRE uses malloc() to get additional  . When called via the POSIX interface, PCRE uses malloc() to get additional
252    storage for processing capturing parentheses if there are more than 10 of    storage for processing capturing parentheses if there are more than 10 of
253    them. You can increase this threshold by setting, for example,    them in a pattern. You can increase this threshold by setting, for example,
254    
255    --with-posix-malloc-threshold=20    --with-posix-malloc-threshold=20
256    
# Line 205  library. You can read more about them in Line 277  library. You can read more about them in
277    sizes in the pcrestack man page.    sizes in the pcrestack man page.
278    
279  . The default maximum compiled pattern size is around 64K. You can increase  . The default maximum compiled pattern size is around 64K. You can increase
280    this by adding --with-link-size=3 to the "configure" command. You can    this by adding --with-link-size=3 to the "configure" command. In the 8-bit
281    increase it even more by setting --with-link-size=4, but this is unlikely    library, PCRE then uses three bytes instead of two for offsets to different
282    ever to be necessary.    parts of the compiled pattern. In the 16-bit library, --with-link-size=3 is
283      the same as --with-link-size=4, which (in both libraries) uses four-byte
284      offsets. Increasing the internal link size reduces performance.
285    
286  . You can build PCRE so that its internal match() function that is called from  . You can build PCRE so that its internal match() function that is called from
287    pcre_exec() does not call itself recursively. Instead, it uses memory blocks    pcre_exec() does not call itself recursively. Instead, it uses memory blocks
# Line 219  library. You can read more about them in Line 293  library. You can read more about them in
293    
294    on the "configure" command. PCRE runs more slowly in this mode, but it may be    on the "configure" command. PCRE runs more slowly in this mode, but it may be
295    necessary in environments with limited stack sizes. This applies only to the    necessary in environments with limited stack sizes. This applies only to the
296    pcre_exec() function; it does not apply to pcre_dfa_exec(), which does not    normal execution of the pcre_exec() function; if JIT support is being
297    use deeply nested recursion. There is a discussion about stack sizes in the    successfully used, it is not relevant. Equally, it does not apply to
298    pcrestack man page.    pcre_dfa_exec(), which does not use deeply nested recursion. There is a
299      discussion about stack sizes in the pcrestack man page.
300    
301    . For speed, PCRE uses four tables for manipulating and identifying characters
302      whose code point values are less than 256. By default, it uses a set of
303      tables for ASCII encoding that is part of the distribution. If you specify
304    
305      --enable-rebuild-chartables
306    
307      a program called dftables is compiled and run in the default C locale when
308      you obey "make". It builds a source file called pcre_chartables.c. If you do
309      not specify this option, pcre_chartables.c is created as a copy of
310      pcre_chartables.c.dist. See "Character tables" below for further information.
311    
312    . It is possible to compile PCRE for use on systems that use EBCDIC as their
313      character code (as opposed to ASCII) by specifying
314    
315      --enable-ebcdic
316    
317      This automatically implies --enable-rebuild-chartables (see above). However,
318      when PCRE is built this way, it always operates in EBCDIC. It cannot support
319      both EBCDIC and UTF-8/16.
320    
321    . The pcregrep program currently supports only 8-bit data files, and so
322      requires the 8-bit PCRE library. It is possible to compile pcregrep to use
323      libz and/or libbz2, in order to read .gz and .bz2 files (respectively), by
324      specifying one or both of
325    
326      --enable-pcregrep-libz
327      --enable-pcregrep-libbz2
328    
329      Of course, the relevant libraries must be installed on your system.
330    
331    . The default size of the internal buffer used by pcregrep can be set by, for
332      example:
333    
334      --with-pcregrep-bufsize=50K
335    
336      The default value is 20K.
337    
338    . It is possible to compile pcretest so that it links with the libreadline
339      or libedit libraries, by specifying, respectively,
340    
341      --enable-pcretest-libreadline or --enable-pcretest-libedit
342    
343      If this is done, when pcretest's input is from a terminal, it reads it using
344      the readline() function. This provides line-editing and history facilities.
345      Note that libreadline is GPL-licenced, so if you distribute a binary of
346      pcretest linked in this way, there may be licensing issues. These can be
347      avoided by linking with libedit (which has a BSD licence) instead.
348    
349      Enabling libreadline causes the -lreadline option to be added to the pcretest
350      build. In many operating environments with a system-installed readline
351      library this is sufficient. However, in some environments (e.g. if an
352      unmodified distribution version of readline is in use), it may be necessary
353      to specify something like LIBS="-lncurses" as well. This is because, to quote
354      the readline INSTALL, "Readline uses the termcap functions, but does not link
355      with the termcap or curses library itself, allowing applications which link
356      with readline the to choose an appropriate library." If you get error
357      messages about missing functions tgetstr, tgetent, tputs, tgetflag, or tgoto,
358      this is the problem, and linking with the ncurses library should fix it.
359    
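As an illustration of how several of the options listed above can be combined,
here is a minimal sketch. The particular selection of options and the variable
name CONFIGURE_ARGS are assumptions made for this example; only the option
names themselves come from the list above. The script prints the command line
and runs "configure" only if it is present in the current directory.

```shell
#!/bin/sh
# Sketch only: which options to combine is an assumption for illustration.
# Every option name below is taken from the list of optional features.
CONFIGURE_ARGS="--enable-pcre16 --enable-utf --enable-unicode-properties"
CONFIGURE_ARGS="$CONFIGURE_ARGS --enable-jit --enable-newline-is-anycrlf"
CONFIGURE_ARGS="$CONFIGURE_ARGS --with-link-size=3"

# Show the command line that would be used.
echo "./configure $CONFIGURE_ARGS"

# Run the real script only when it exists in the current directory.
if [ -x ./configure ]; then
  # Word splitting of $CONFIGURE_ARGS is intentional here.
  ./configure $CONFIGURE_ARGS
fi
```

Remember that --enable-jit gives a compile time error on an unsupported
architecture, and that --enable-utf cannot be combined with --enable-ebcdic.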
360  The "configure" script builds the following files for the basic C library:  The "configure" script builds the following files for the basic C library:
361    
362  . Makefile is the makefile that builds the library  . Makefile             the makefile that builds the library
363  . config.h contains build-time configuration options for the library  . config.h             build-time configuration options for the library
364  . pcre.h is the public PCRE header file  . pcre.h               the public PCRE header file
365  . pcre-config is a script that shows the settings of "configure" options  . pcre-config          script that shows the building settings such as CFLAGS
366  . libpcre.pc is data for the pkg-config command                           that were set for "configure"
367  . libtool is a script that builds shared and/or static libraries  . libpcre.pc         ) data for the pkg-config command
368  . RunTest is a script for running tests on the basic C library  . libpcre16.pc       )
369  . RunGrepTest is a script for running tests on the pcregrep command  . libpcreposix.pc    )
370    . libtool              script that builds shared and/or static libraries
371  Versions of config.h and pcre.h are distributed in the PCRE tarballs under  
372  the names config.h.generic and pcre.h.generic. These are provided for the  Versions of config.h and pcre.h are distributed in the PCRE tarballs under the
373  benefit of those who have to build PCRE without the benefit of "configure". If  names config.h.generic and pcre.h.generic. These are provided for those who
374  you use "configure", the .generic versions are not used.  have to build PCRE without using "configure" or CMake. If you use "configure"
375    or CMake, the .generic versions are not used.
376  If a C++ compiler is found, the following files are also built:  
377    When building the 8-bit library, if a C++ compiler is found, the following
378  . libpcrecpp.pc is data for the pkg-config command  files are also built:
379  . pcrecpparg.h is a header file for programs that call PCRE via the C++ wrapper  
380  . pcre_stringpiece.h is the header for the C++ "stringpiece" functions  . libpcrecpp.pc        data for the pkg-config command
381    . pcrecpparg.h         header file for calling PCRE via the C++ wrapper
382    . pcre_stringpiece.h   header for the C++ "stringpiece" functions
383    
384  The "configure" script also creates config.status, which is an executable  The "configure" script also creates config.status, which is an executable
385  script that can be run to recreate the configuration, and config.log, which  script that can be run to recreate the configuration, and config.log, which
386  contains compiler output from tests that "configure" runs.  contains compiler output from tests that "configure" runs.
387    
388  Once "configure" has run, you can run "make". It builds two libraries, called  Once "configure" has run, you can run "make". This builds either or both of the
389  libpcre and libpcreposix, a test program called pcretest, a demonstration  libraries libpcre and libpcre16, and a test program called pcretest. If you
390  program called pcredemo, and the pcregrep command. If a C++ compiler was found  enabled JIT support with --enable-jit, a test program called pcre_jit_test is
391  on your system, "make" also builds the C++ wrapper library, which is called  built as well.
392  libpcrecpp, and some test programs called pcrecpp_unittest,  
393  pcre_scanner_unittest, and pcre_stringpiece_unittest. Building the C++ wrapper  If the 8-bit library is built, libpcreposix and the pcregrep command are also
394  can be disabled by adding --disable-cpp to the "configure" command.  built, and if a C++ compiler was found on your system, and you did not disable
395    it with --disable-cpp, "make" builds the C++ wrapper library, which is called
396    libpcrecpp, as well as some test programs called pcrecpp_unittest,
397    pcre_scanner_unittest, and pcre_stringpiece_unittest.
398    
399  The command "make check" runs all the appropriate tests. Details of the PCRE  The command "make check" runs all the appropriate tests. Details of the PCRE
400  tests are given below in a separate section of this document.  tests are given below in a separate section of this document.
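The sequence described above can be sketched as a small script. The PREFIX
variable and its default value are assumptions made for this example; the
steps themselves are the usual "configure; make; make check; make install"
process. Each step runs only if the previous one succeeded, and nothing is run
unless a "configure" script is present in the current directory.

```shell
#!/bin/sh
# Sketch: the usual autotools sequence, stopping at the first failure.
# PREFIX is a hypothetical install location chosen for this example.
PREFIX="${PREFIX:-/opt/local}"
BUILD_STEPS="./configure --prefix=$PREFIX
make
make check
make install"

# Show the steps in the order in which they are run.
echo "$BUILD_STEPS"

# Run them only when the configure script exists in the current directory.
if [ -x ./configure ]; then
  ./configure --prefix="$PREFIX" && make && make check && make install
fi
```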
# Line 266  system. The following are installed (fil Line 405  system. The following are installed (fil
405    
406    Commands (bin):    Commands (bin):
407      pcretest      pcretest
408      pcregrep      pcregrep (if 8-bit support is enabled)
409      pcre-config      pcre-config
410    
411    Libraries (lib):    Libraries (lib):
412      libpcre      libpcre16     (if 16-bit support is enabled)
413      libpcreposix      libpcre       (if 8-bit support is enabled)
414      libpcrecpp (if C++ support is enabled)      libpcreposix  (if 8-bit support is enabled)
415        libpcrecpp    (if 8-bit and C++ support is enabled)
416    
417    Configuration information (lib/pkgconfig):    Configuration information (lib/pkgconfig):
418        libpcre16.pc
419      libpcre.pc      libpcre.pc
420        libpcreposix.pc
421      libpcrecpp.pc (if C++ support is enabled)      libpcrecpp.pc (if C++ support is enabled)
422    
423    Header files (include):    Header files (include):
# Line 289  system. The following are installed (fil Line 431  system. The following are installed (fil
431    Man pages (share/man/man{1,3}):    Man pages (share/man/man{1,3}):
432      pcregrep.1      pcregrep.1
433      pcretest.1      pcretest.1
434        pcre-config.1
435      pcre.3      pcre.3
436      pcre*.3 (lots more pages, all starting "pcre")      pcre*.3 (lots more pages, all starting "pcre")
437    
# Line 303  system. The following are installed (fil Line 446  system. The following are installed (fil
446      LICENCE      LICENCE
447      NEWS      NEWS
448      README      README
449      pcre.txt       (a concatenation of the man(3) pages)      pcre.txt         (a concatenation of the man(3) pages)
450      pcretest.txt   the pcretest man page      pcretest.txt     the pcretest man page
451      pcregrep.txt   the pcregrep man page      pcregrep.txt     the pcregrep man page
452        pcre-config.txt  the pcre-config man page
 Note that the pcredemo program that is built by "configure" is *not* installed  
 anywhere. It is a demonstration for programmers wanting to use PCRE.  
453    
454  If you want to remove PCRE from your system, you can run "make uninstall".  If you want to remove PCRE from your system, you can run "make uninstall".
455  This removes all the files that "make install" installed. However, it does not  This removes all the files that "make install" installed. However, it does not
456  remove any directories, because these are often shared with other programs.  remove any directories, because these are often shared with other programs.
457    
458    
459  Retrieving configuration information on Unix-like systems  Retrieving configuration information
460  ---------------------------------------------------------  ------------------------------------
461    
462  Running "make install" installs the command pcre-config, which can be used to  Running "make install" installs the command pcre-config, which can be used to
463  recall information about the PCRE configuration and installation. For example:  recall information about the PCRE configuration and installation. For example:
# Line 341  The data is held in *.pc files that are Line 482  The data is held in *.pc files that are
482  <prefix>/lib/pkgconfig.  <prefix>/lib/pkgconfig.
483    
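As a hedged example of using the *.pc data, the following sketch asks
pkg-config for the compiler and linker flags for libpcre. The variable names
PC_MODULE and PCRE_FLAGS are assumptions made for this example. If pkg-config
is not installed, or cannot find libpcre.pc, the script prints a hint instead
of failing.

```shell
#!/bin/sh
# Sketch: ask pkg-config for the flags needed to compile against libpcre.
# libpcre16, libpcreposix and libpcrecpp have their own .pc files as well.
PC_MODULE="libpcre"

if command -v pkg-config >/dev/null 2>&1 && pkg-config --exists "$PC_MODULE"; then
  # Both the compile flags and the link flags are requested in one call.
  PCRE_FLAGS="$(pkg-config --cflags --libs "$PC_MODULE")"
  echo "flags for $PC_MODULE: $PCRE_FLAGS"
else
  PCRE_FLAGS=""
  echo "$PC_MODULE.pc not found; is <prefix>/lib/pkgconfig in PKG_CONFIG_PATH?"
fi
```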
484    
485  Shared libraries on Unix-like systems  Shared libraries
486  -------------------------------------  ----------------
487    
488  The default distribution builds PCRE as shared libraries and static libraries,  The default distribution builds PCRE as shared libraries and static libraries,
489  as long as the operating system supports shared libraries. Shared library  as long as the operating system supports shared libraries. Shared library
# Line 367  Then run "make" in the usual way. Simila Line 508  Then run "make" in the usual way. Simila
508  build only shared libraries.  build only shared libraries.
509    
510    
511  Cross-compiling on Unix-like systems  Cross-compiling using autotools
512  ------------------------------------  -------------------------------
513    
514  You can specify CC and CFLAGS in the normal way to the "configure" command, in  You can specify CC and CFLAGS in the normal way to the "configure" command, in
515  order to cross-compile PCRE for some other host. However, during the building  order to cross-compile PCRE for some other host. However, you should NOT
516  process, the dftables.c source file is compiled *and run* on the local host, in  specify --enable-rebuild-chartables, because if you do, the dftables.c source
517  order to generate the default character tables (the chartables.c file). It  file is compiled and run on the local host, in order to generate the inbuilt
518  therefore needs to be compiled with the local compiler, not the cross compiler.  character tables (the pcre_chartables.c file). This will probably not work,
519  You can do this by specifying CC_FOR_BUILD (and if necessary CFLAGS_FOR_BUILD;  because dftables.c needs to be compiled with the local compiler, not the cross
520  there are also CXX_FOR_BUILD and CXXFLAGS_FOR_BUILD for the C++ wrapper)  compiler.
521  when calling the "configure" command. If they are not specified, they default  
522  to the values of CC and CFLAGS.  When --enable-rebuild-chartables is not specified, pcre_chartables.c is created
523    by making a copy of pcre_chartables.c.dist, which is a default set of tables
524    that assumes ASCII code. Cross-compiling with the default tables should not be
525    a problem.
526    
527    If you need to modify the character tables when cross-compiling, you should
528    move pcre_chartables.c.dist out of the way, then compile dftables.c by hand and
529    run it on the local host to make a new version of pcre_chartables.c.dist.
530    Then when you cross-compile PCRE this new version of the tables will be used.
531    
532    
533  Using HP's ANSI C++ compiler (aCC)  Using HP's ANSI C++ compiler (aCC)
# Line 397  running the "configure" script: Line 546  running the "configure" script:
546    CXXLDFLAGS="-lstd_v2 -lCsup_v2"    CXXLDFLAGS="-lstd_v2 -lCsup_v2"
547    
548    
549    Using Sun's compilers for Solaris
550    ---------------------------------
551    
552    A user reports that the following configurations work on Solaris 9 sparcv9 and
553    Solaris 9 x86 (32-bit):
554    
555      Solaris 9 sparcv9: ./configure --disable-cpp CC=/bin/cc CFLAGS="-m64 -g"
556      Solaris 9 x86:     ./configure --disable-cpp CC=/bin/cc CFLAGS="-g"
557    
558    
559    Using PCRE from MySQL
560    ---------------------
561    
562    On systems where both PCRE and MySQL are installed, it is possible to make use
563    of PCRE from within MySQL, as an alternative to the built-in pattern matching.
564    There is a web page that tells you how to do this:
565    
566      http://www.mysqludf.org/lib_mysqludf_preg/index.php
567    
568    
569  Making new tarballs  Making new tarballs
570  -------------------  -------------------
571    
572  The command "make dist" creates three PCRE tarballs, in tar.gz, tar.bz2, and  The command "make dist" creates three PCRE tarballs, in tar.gz, tar.bz2, and
573  zip formats. However, if you have modified any of the man page sources in the  zip formats. The command "make distcheck" does the same, but then does a trial
574  doc directory, you should first run the PrepareRelease script. This re-creates  build of the new distribution to ensure that it works.
575  the .txt and HTML forms of the documentation from the man pages.  
576    If you have modified any of the man page sources in the doc directory, you
577    should first run the PrepareRelease script before making a distribution. This
578    script creates the .txt and HTML forms of the documentation from the man pages.
579    
580    
581  Testing PCRE  Testing PCRE
582  ------------  ------------
583    
584  To test the basic PCRE library on a Unix system, run the RunTest script that is  To test the basic PCRE library on a Unix-like system, run the RunTest script.
585  created by the configuring process. There is also a script called RunGrepTest  There is another script called RunGrepTest that tests the options of the
586  that tests the options of the pcregrep command. If the C++ wrapper library is  pcregrep command. If the C++ wrapper library is built, three test programs
587  built, three test programs called pcrecpp_unittest, pcre_scanner_unittest, and  called pcrecpp_unittest, pcre_scanner_unittest, and pcre_stringpiece_unittest
588  pcre_stringpiece_unittest are also built.  are also built. When JIT support is enabled, another test program called
589    pcre_jit_test is built.
590    
591  Both the scripts and all the program tests are run if you obey "make check" or  Both the scripts and all the program tests are run if you obey "make check" or
592  "make test". For other systems, see the instructions in NON-UNIX-USE.  "make test". For other environments, see the instructions in
593    NON-AUTOTOOLS-BUILD.
594    
595  The RunTest script runs the pcretest test program (which is documented in its  The RunTest script runs the pcretest test program (which is documented in its
596  own man page) on each of the testinput files in the testdata directory in  own man page) on each of the relevant testinput files in the testdata
597  turn, and compares the output with the contents of the corresponding testoutput  directory, and compares the output with the contents of the corresponding
598  files. A file called testtry is used to hold the main output from pcretest  testoutput files. Some tests are relevant only when certain build-time options
599  (testsavedregex is also used as a working file). To run pcretest on just one of  were selected. For example, the tests for UTF-8/16 support are run only if
600  the test files, give its number as an argument to RunTest, for example:  --enable-utf was used. RunTest outputs a comment when it skips a test.
601    
602    RunTest 2  Many of the tests that are not skipped are run up to three times. The second
603    run forces pcre_study() to be called for all patterns except for a few in some
604  The first test file can also be fed directly into the perltest.pl script to  tests that are marked "never study" (see the pcretest program for how this is
605  check that Perl gives the same results. The only difference you should see is  done). If JIT support is available, the non-DFA tests are run a third time,
606  in the first few lines, where the Perl version is given instead of the PCRE  this time with a forced pcre_study() with the PCRE_STUDY_JIT_COMPILE option.
607  version.  
608    When both 8-bit and 16-bit support is enabled, the entire set of tests is run
609    twice, once for each library. If you want to run just one set of tests, call
610    RunTest with either the -8 or -16 option.
611    
612    RunTest uses a file called testtry to hold the main output from pcretest.
613    Other files whose names begin with "test" are used as working files in some
614    tests. To run pcretest on just one or more specific test files, give their
615    numbers as arguments to RunTest, for example:
616    
617      RunTest 2 7 11
618    
619    You can also call RunTest with the single argument "list" to cause it to output
620    a list of tests.
621    
622    The first test file can be fed directly into the perltest.pl script to check
623    that Perl gives the same results. The only difference you should see is in the
624    first few lines, where the Perl version is given instead of the PCRE version.
625    
626  The second set of tests check pcre_fullinfo(), pcre_info(), pcre_study(),  The second set of tests check pcre_fullinfo(), pcre_study(),
627  pcre_copy_substring(), pcre_get_substring(), pcre_get_substring_list(), error  pcre_copy_substring(), pcre_get_substring(), pcre_get_substring_list(), error
628  detection, and run-time flags that are specific to PCRE, as well as the POSIX  detection, and run-time flags that are specific to PCRE, as well as the POSIX
629  wrapper API. It also uses the debugging flags to check some of the internals of  wrapper API. It also uses the debugging flags to check some of the internals of
# Line 461  is output to say why. If running this te Line 652  is output to say why. If running this te
652  in the comparison output, it means that locale is not available on your system,  in the comparison output, it means that locale is not available on your system,
653  despite being listed by "locale". This does not mean that PCRE is broken.  despite being listed by "locale". This does not mean that PCRE is broken.
654    
655  The fourth test checks the UTF-8 support. It is not run automatically unless  [If you are trying to run this test on Windows, you may be able to get it to
656  PCRE is built with UTF-8 support. To do this you must set --enable-utf8 when  work by changing "fr_FR" to "french" everywhere it occurs. Alternatively, use
657  running "configure". This file can be also fed directly to the perltest script,  RunTest.bat. The version of RunTest.bat included with PCRE 7.4 and above uses
658  provided you are running Perl 5.8 or higher. (For Perl 5.6, a small patch,  Windows versions of test 2. More info on using RunTest.bat is included in the
659  commented in the script, can be be used.)  document entitled NON-UNIX-USE.]
660    
661  The fifth test checks error handling with UTF-8 encoding, and internal UTF-8  The fourth and fifth tests check the UTF-8/16 support and error handling and
662  features of PCRE that are not relevant to Perl.  internal UTF features of PCRE that are not relevant to Perl, respectively. The
663    sixth and seventh tests do the same for Unicode character properties support.
664  The sixth test checks the support for Unicode character properties. It it not  
665  run automatically unless PCRE is built with Unicode property support. To to  The eighth, ninth, and tenth tests check the pcre_dfa_exec() alternative
666  this you must set --enable-unicode-properties when running "configure".  matching function, in non-UTF-8/16 mode, UTF-8/16 mode, and UTF-8/16 mode with
667    Unicode property support, respectively.
668  The seventh, eighth, and ninth tests check the pcre_dfa_exec() alternative  
669  matching function, in non-UTF-8 mode, UTF-8 mode, and UTF-8 mode with Unicode  The eleventh test checks some internal offsets and code size features; it is
670  property support, respectively. The eighth and ninth tests are not run  run only when the default "link size" of 2 is set (in other cases the sizes
671  automatically unless PCRE is build with the relevant support.  change) and when Unicode property support is enabled.
672    
673    The twelfth test is run only when JIT support is available, and the thirteenth
674    test is run only when JIT support is not available. They test some JIT-specific
675    features such as information output from pcretest about JIT compilation.
676    
677    The fourteenth, fifteenth, and sixteenth tests are run only in 8-bit mode, and
678    the seventeenth, eighteenth, and nineteenth tests are run only in 16-bit mode.
679    These are tests that generate different output in the two modes. They are for
680    general cases, UTF-8/16 support, and Unicode property support, respectively.
681    
682    The twentieth test is run only in 16-bit mode. It tests some specific 16-bit
683    features of the DFA matching engine.
684    
685    The twenty-first and twenty-second tests are run only in 16-bit mode, when the
686    link size is set to 2. They test reloading pre-compiled patterns.
687    
688    
689  Character tables  Character tables
# Line 490  concatenated tables. A call to pcre_make Line 696  concatenated tables. A call to pcre_make
696  of tables in the current locale. If the final argument for pcre_compile() is  of tables in the current locale. If the final argument for pcre_compile() is
697  passed as NULL, a set of default tables that is built into the binary is used.  passed as NULL, a set of default tables that is built into the binary is used.
698    
699  The source file called chartables.c contains the default set of tables. This is  The source file called pcre_chartables.c contains the default set of tables. By
700  not supplied in the distribution, but is built by the program dftables  default, this is created as a copy of pcre_chartables.c.dist, which contains
701  (compiled from dftables.c), which uses the ANSI C character handling functions  tables for ASCII coding. However, if --enable-rebuild-chartables is specified
702  such as isalnum(), isalpha(), isupper(), islower(), etc. to build the table  for ./configure, a different version of pcre_chartables.c is built by the
703  sources. This means that the default C locale which is set for your system will  program dftables (compiled from dftables.c), which uses the ANSI C character
704  control the contents of these default tables. You can change the default tables  handling functions such as isalnum(), isalpha(), isupper(), islower(), etc. to
705  by editing chartables.c and then re-building PCRE. If you do this, you should  build the table sources. This means that the default C locale which is set for
706  take care to ensure that the file does not get automaticaly re-generated.  your system will control the contents of these default tables. You can change
707    the default tables by editing pcre_chartables.c and then re-building PCRE. If
708    you do this, you should take care to ensure that the file does not get
709    automatically re-generated. The best way to do this is to move
710    pcre_chartables.c.dist out of the way and replace it with your customized
711    tables.
712    
713    When the dftables program is run as a result of --enable-rebuild-chartables,
714    it uses the default C locale that is set on your system. It does not pay
715    attention to the LC_xxx environment variables. In other words, it uses the
716    system's default locale rather than whatever the compiling user happens to have
717    set. If you really do want to build a source set of character tables in a
718    locale that is specified by the LC_xxx variables, you can run the dftables
719    program by hand with the -L option. For example:
720    
721      ./dftables -L pcre_chartables.c.special
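
As an illustration only, the following is a minimal sketch of how a program
can derive case tables from the ANSI C character handling functions. It is
not the dftables source: the function name build_case_tables and its
use_env_locale parameter are hypothetical, and only the lower casing and case
flipping tables are shown. The setlocale() call stands in for the effect that
the -L option is described as having.

```c
#include <assert.h>
#include <ctype.h>
#include <locale.h>
#include <stddef.h>

/* Hypothetical helper, not part of PCRE: fill two 256-byte tables using
   only ANSI C <ctype.h> functions, so their contents depend on the locale
   that is in force when the function runs. */
void build_case_tables(unsigned char lower[256], unsigned char flip[256],
                       int use_env_locale)
{
    int i;

    assert(lower != NULL && flip != NULL);

    /* Assumption: this mirrors what "-L" asks for, namely adopting the
       locale named by the LC_xxx environment variables. Without this call
       a C program starts in the "C" locale. */
    if (use_env_locale)
        setlocale(LC_ALL, "");

    for (i = 0; i < 256; i++) {
        /* Lower casing table: each character maps to its lower case form. */
        lower[i] = (unsigned char)tolower(i);
        /* Case flipping table: lower becomes upper, everything else is
           passed through tolower(), which leaves non-letters unchanged. */
        flip[i] = (unsigned char)(islower(i) ? toupper(i) : tolower(i));
    }
}
```

Calling build_case_tables(lower, flip, 0) in the "C" locale gives
lower['A'] == 'a' and flip['a'] == 'A', while digits and punctuation map to
themselves in both tables.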
722    
723  The first two 256-byte tables provide lower casing and case flipping functions,  The first two 256-byte tables provide lower casing and case flipping functions,
724  respectively. The next table consists of three 32-byte bit maps which identify  respectively. The next table consists of three 32-byte bit maps which identify
# Line 522  will cause PCRE to malfunction. Line 743  will cause PCRE to malfunction.
743  File manifest  File manifest
744  -------------  -------------
745    
746  The distribution should contain the following files:  The distribution should contain the files listed below. Where a file name is
747    given as pcre[16]_xxx it means that there are two files, one with the name
748    pcre_xxx and the other with the name pcre16_xxx.
749    
750  (A) Source files of the PCRE library functions and their headers:  (A) Source files of the PCRE library functions and their headers:
751    
752    dftables.c             auxiliary program for building chartables.c    dftables.c              auxiliary program for building pcre_chartables.c
753                                when --enable-rebuild-chartables is specified
754    
755    pcreposix.c            )    pcre_chartables.c.dist  a default set of character tables that assume ASCII
756    pcre_compile.c         )                              coding; used, unless --enable-rebuild-chartables is
757    pcre_config.c          )                              specified, by copying to pcre[16]_chartables.c
758    pcre_dfa_exec.c        )  
759    pcre_exec.c            )    pcreposix.c             )
760    pcre_fullinfo.c        )    pcre[16]_byte_order.c   )
761    pcre_get.c             ) sources for the functions in the library,    pcre[16]_compile.c      )
762    pcre_globals.c         )   and some internal functions that they use    pcre[16]_config.c       )
763    pcre_info.c            )    pcre[16]_dfa_exec.c     )
764    pcre_maketables.c      )    pcre[16]_exec.c         )
765    pcre_newline.c         )    pcre[16]_fullinfo.c     )
766    pcre_ord2utf8.c        )    pcre[16]_get.c          ) sources for the functions in the library,
767    pcre_refcount.c        )    pcre[16]_globals.c      )   and some internal functions that they use
768    pcre_study.c           )    pcre[16]_jit_compile.c  )
769    pcre_tables.c          )    pcre[16]_maketables.c   )
770    pcre_try_flipped.c     )    pcre[16]_newline.c      )
771    pcre_ucp_searchfuncs.c )    pcre[16]_refcount.c     )
772    pcre_valid_utf8.c      )    pcre[16]_string_utils.c )
773    pcre_version.c         )    pcre[16]_study.c        )
774    pcre_xclass.c          )    pcre[16]_tables.c       )
775    pcre_printint.src      ) debugging function that is #included in pcretest,    pcre[16]_ucd.c          )
776                           )   and can also be #included in pcre_compile()    pcre[16]_version.c      )
777    pcre.h.in              template for pcre.h when built by "configure"    pcre[16]_xclass.c       )
778    pcreposix.h            header for the external POSIX wrapper API    pcre_ord2utf8.c         )
779    pcre_internal.h        header for internal use    pcre_valid_utf8.c       )
780    ucp.h                  ) headers concerned with    pcre16_ord2utf16.c      )
781    ucpinternal.h          )   Unicode property handling    pcre16_utf16_utils.c    )
782    ucptable.h             ) (this one is the data table)    pcre16_valid_utf16.c    )
783    
784    config.h.in            template for config.h, which is built by "configure"    pcre[16]_printint.c     ) debugging function that is used by pcretest,
785                              )   and can also be #included in pcre_compile()
786    pcrecpp.h              public header file for the C++ wrapper  
787    pcrecpparg.h.in        template for another C++ header file    pcre.h.in               template for pcre.h when built by "configure"
788    pcre_scanner.h         public header file for C++ scanner functions    pcreposix.h             header for the external POSIX wrapper API
789    pcrecpp.cc             )    pcre_internal.h         header for internal use
790    pcre_scanner.cc        ) source for the C++ wrapper library    sljit/*                 16 files that make up the JIT compiler
791      ucp.h                   header for Unicode property handling
792    pcre_stringpiece.h.in  template for pcre_stringpiece.h, the header for the  
793                             C++ stringpiece functions    config.h.in             template for config.h, which is built by "configure"
794    pcre_stringpiece.cc    source for the C++ stringpiece functions  
795      pcrecpp.h               public header file for the C++ wrapper
796      pcrecpparg.h.in         template for another C++ header file
797      pcre_scanner.h          public header file for C++ scanner functions
798      pcrecpp.cc              )
799      pcre_scanner.cc         ) source for the C++ wrapper library
800    
801      pcre_stringpiece.h.in   template for pcre_stringpiece.h, the header for the
802                                C++ stringpiece functions
803      pcre_stringpiece.cc     source for the C++ stringpiece functions
804    
805  (B) Source files for programs that use PCRE:  (B) Source files for programs that use PCRE:
806    
807    pcredemo.c             simple demonstration of coding calls to PCRE    pcredemo.c              simple demonstration of coding calls to PCRE
808    pcregrep.c             source of a grep utility that uses PCRE    pcregrep.c              source of a grep utility that uses PCRE
809    pcretest.c             comprehensive test program    pcretest.c              comprehensive test program
810    
811  (C) Auxiliary files:  (C) Auxiliary files:
812    
813    132html                script to turn "man" pages into HTML    132html                 script to turn "man" pages into HTML
814    AUTHORS                information about the author of PCRE    AUTHORS                 information about the author of PCRE
815    ChangeLog              log of changes to the code    ChangeLog               log of changes to the code
816    CleanTxt               script to clean nroff output for txt man pages    CleanTxt                script to clean nroff output for txt man pages
817    Detrail                script to remove trailing spaces    Detrail                 script to remove trailing spaces
818    Index.html             the base HTML page    HACKING                 some notes about the internals of PCRE
819    INSTALL                generic installation instructions    INSTALL                 generic installation instructions
820    LICENCE                conditions for the use of PCRE    LICENCE                 conditions for the use of PCRE
821    COPYING                the same, using GNU's standard name    COPYING                 the same, using GNU's standard name
822    Makefile.in            ) template for Unix Makefile, which is built by    Makefile.in             ) template for Unix Makefile, which is built by
823                           )   "configure"                            )   "configure"
824    Makefile.am            ) the automake input that was used to create    Makefile.am             ) the automake input that was used to create
825                           )   Makefile.in                            )   Makefile.in
826    NEWS                   important changes in this release    NEWS                    important changes in this release
827    NON-UNIX-USE           notes on building PCRE on non-Unix systems    NON-UNIX-USE            the previous name for NON-AUTOTOOLS-BUILD
828    PrepareRelease         script to make preparations for "make dist"    NON-AUTOTOOLS-BUILD     notes on building PCRE without using autotools
829    README                 this file    PrepareRelease          script to make preparations for "make dist"
830    RunTest.in             template for a Unix shell script for running tests    README                  this file
831    RunGrepTest.in         template for a Unix shell script for pcregrep tests    RunTest                 a Unix shell script for running tests
832    aclocal.m4             m4 macros (generated by "aclocal")    RunGrepTest             a Unix shell script for pcregrep tests
833    config.guess           ) files used by libtool,    aclocal.m4              m4 macros (generated by "aclocal")
834    config.sub             )   used only when building a shared library    config.guess            ) files used by libtool,
835    configure              a configuring shell script (built by autoconf)    config.sub              )   used only when building a shared library
836    configure.ac           ) the autoconf input that was used to build    configure               a configuring shell script (built by autoconf)
837                           )   "configure" and config.h    configure.ac            ) the autoconf input that was used to build
838    depcomp                ) script to find program dependencies, generated by                            )   "configure" and config.h
839                           )   automake    depcomp                 ) script to find program dependencies, generated by
840    doc/*.3                man page sources for the PCRE functions                            )   automake
841    doc/*.1                man page sources for pcregrep and pcretest    doc/*.3                 man page sources for PCRE
842    doc/html/*             HTML documentation    doc/*.1                 man page sources for pcregrep and pcretest
843    doc/pcre.txt           plain text version of the man pages    doc/index.html.src      the base HTML page
844    doc/pcretest.txt       plain text documentation of test program    doc/html/*              HTML documentation
845    doc/perltest.txt       plain text documentation of Perl test program    doc/pcre.txt            plain text version of the man pages
846    install-sh             a shell script for installing files    doc/pcretest.txt        plain text documentation of test program
847    libpcre.pc.in          template for libpcre.pc for pkg-config    doc/perltest.txt        plain text documentation of Perl test program
848    libpcrecpp.pc.in       template for libpcrecpp.pc for pkg-config    install-sh              a shell script for installing files
849    ltmain.sh              file used to build a libtool script    libpcre16.pc.in         template for libpcre16.pc for pkg-config
850    missing                ) common stub for a few missing GNU programs while    libpcre.pc.in           template for libpcre.pc for pkg-config
851                           )   installing, generated by automake    libpcreposix.pc.in      template for libpcreposix.pc for pkg-config
852    mkinstalldirs          script for making install directories    libpcrecpp.pc.in        template for libpcrecpp.pc for pkg-config
853    perltest.pl            Perl test program    ltmain.sh               file used to build a libtool script
854    pcre-config.in         source of script which retains PCRE information    missing                 ) common stub for a few missing GNU programs while
855                              )   installing, generated by automake
856      mkinstalldirs           script for making install directories
857      perltest.pl             Perl test program
858      pcre-config.in          source of script which retains PCRE information
859      pcre_jit_test.c         test program for the JIT compiler
860    pcrecpp_unittest.cc          )    pcrecpp_unittest.cc          )
861    pcre_scanner_unittest.cc     ) test programs for the C++ wrapper    pcre_scanner_unittest.cc     ) test programs for the C++ wrapper
862    pcre_stringpiece_unittest.cc )    pcre_stringpiece_unittest.cc )
863    testdata/testinput*    test data for main library tests    testdata/testinput*     test data for main library tests
864    testdata/testoutput*   expected test results    testdata/testoutput*    expected test results
865    testdata/grep*         input and output for pcregrep tests    testdata/grep*          input and output for pcregrep tests
866      testdata/*              other supporting test files
867    
868  (D) Auxiliary files for cmake support  (D) Auxiliary files for cmake support
869    
870      cmake/COPYING-CMAKE-SCRIPTS
871      cmake/FindPackageHandleStandardArgs.cmake
872      cmake/FindEditline.cmake
873      cmake/FindReadline.cmake
874    CMakeLists.txt    CMakeLists.txt
875    config-cmake.h.in    config-cmake.h.in
876    
877  (E) Auxiliary files for VPASCAL  (E) Auxiliary files for VPASCAL
878    
879    makevp.bat    makevp.bat
880    !compile.txt    makevp_c.txt
881    !linklib.txt    makevp_l.txt
882    pcregexp.pas    pcregexp.pas
883    
884  (F) Auxiliary files for building PCRE "by hand"  (F) Auxiliary files for building PCRE "by hand"
885    
886    pcre.h.generic         ) a version of the public PCRE header file    pcre.h.generic          ) a version of the public PCRE header file
887                           )   for use in non-"configure" environments                            )   for use in non-"configure" environments
888    config.h.generic       ) a version of config.h for use in non-"configure"    config.h.generic        ) a version of config.h for use in non-"configure"
889                           )   environments                            )   environments
890    
891  (F) Miscellaneous  (F) Miscellaneous
892    
# Line 652  The distribution should contain the foll Line 895  The distribution should contain the foll
895  Philip Hazel  Philip Hazel
896  Email local part: ph10  Email local part: ph10
897  Email domain: cam.ac.uk  Email domain: cam.ac.uk
898  Last updated: March 2007  Last updated: 18 June 2012
