
Diff of /code/trunk/README


revision 75 by nigel, Sat Feb 24 21:40:37 2007 UTC revision 87 by nigel, Sat Feb 24 21:41:21 2007 UTC
# Line 7  The latest release of PCRE is always ava Line 7  The latest release of PCRE is always ava
7    
8  Please read the NEWS file if you are upgrading from a previous release.  Please read the NEWS file if you are upgrading from a previous release.
9    
10  PCRE has its own native API, but a set of "wrapper" functions that are based on  
11  the POSIX API are also supplied in the library libpcreposix. Note that this  The PCRE APIs
12  just provides a POSIX calling interface to PCRE: the regular expressions  -------------
13  themselves still follow Perl syntax and semantics. The header file  
14  for the POSIX-style functions is called pcreposix.h. The official POSIX name is  PCRE is written in C, and it has its own API. The distribution now includes a
15  regex.h, but I didn't want to risk possible problems with existing files of  set of C++ wrapper functions, courtesy of Google Inc. (see the pcrecpp man page
16  that name by distributing it that way. To use it with an existing program that  for details).
17  uses the POSIX API, it will have to be renamed or pointed at by a link.  
18    Also included are a set of C wrapper functions that are based on the POSIX
19    API. These end up in the library called libpcreposix. Note that this just
20    provides a POSIX calling interface to PCRE: the regular expressions themselves
21    still follow Perl syntax and semantics. The header file for the POSIX-style
22    functions is called pcreposix.h. The official POSIX name is regex.h, but I
23    didn't want to risk possible problems with existing files of that name by
24    distributing it that way. To use it with an existing program that uses the
25    POSIX API, it will have to be renamed or pointed at by a link.
26    
27  If you are using the POSIX interface to PCRE and there is already a POSIX regex  If you are using the POSIX interface to PCRE and there is already a POSIX regex
28  library installed on your system, you must take care when linking programs to  library installed on your system, you must take care when linking programs to
# Line 60  others are pointers to URLs containing r Line 68  others are pointers to URLs containing r
68  Building PCRE on a Unix-like system  Building PCRE on a Unix-like system
69  -----------------------------------  -----------------------------------
70    
71    If you are using HP's ANSI C++ compiler (aCC), please see the special note
72    in the section entitled "Using HP's ANSI C++ compiler (aCC)" below.
73    
74  To build PCRE on a Unix-like system, first run the "configure" command from the  To build PCRE on a Unix-like system, first run the "configure" command from the
75  PCRE distribution directory, with your current directory set to the directory  PCRE distribution directory, with your current directory set to the directory
76  where you want the files to be created. This command is a standard GNU  where you want the files to be created. This command is a standard GNU
# Line 83  into /source/pcre/pcre-xxx, but you want Line 94  into /source/pcre/pcre-xxx, but you want
94  cd /build/pcre/pcre-xxx  cd /build/pcre/pcre-xxx
95  /source/pcre/pcre-xxx/configure  /source/pcre/pcre-xxx/configure
96    
97    PCRE is written in C and is normally compiled as a C library. However, it is
98    possible to build it as a C++ library, though the provided building apparatus
99    does not have any features to support this.
100    
101  There are some optional features that can be included or omitted from the PCRE  There are some optional features that can be included or omitted from the PCRE
102  library. You can read more about them in the pcrebuild man page.  library. You can read more about them in the pcrebuild man page.
103    
104    . If you want to suppress the building of the C++ wrapper library, you can add
105      --disable-cpp to the "configure" command. Otherwise, when "configure" is run,
106      it will try to find a C++ compiler and C++ header files, and if it succeeds,
107      it will try to build the C++ wrapper.
108    
109  . If you want to make use of the support for UTF-8 character strings in PCRE,  . If you want to make use of the support for UTF-8 character strings in PCRE,
110    you must add --enable-utf8 to the "configure" command. Without it, the code    you must add --enable-utf8 to the "configure" command. Without it, the code
111    for handling UTF-8 is not included in the library. (Even when included, it    for handling UTF-8 is not included in the library. (Even when included, it
# Line 98  library. You can read more about them in Line 118  library. You can read more about them in
118    property table); only the basic two-letter properties such as Lu are    property table); only the basic two-letter properties such as Lu are
119    supported.    supported.
120    
121  . You can build PCRE to recognized CR or NL as the newline character, instead  . You can build PCRE to recognize either CR or LF as the newline character,
122    of whatever your compiler uses for "\n", by adding --newline-is-cr or    instead of whatever your compiler uses for "\n", by adding --newline-is-cr or
123    --newline-is-nl to the "configure" command, respectively. Only do this if you    --newline-is-lf to the "configure" command, respectively. Only do this if you
124    really understand what you are doing. On traditional Unix-like systems, the    really understand what you are doing. On traditional Unix-like systems, the
125    newline character is NL.    newline character is LF.
126    
127  . When called via the POSIX interface, PCRE uses malloc() to get additional  . When called via the POSIX interface, PCRE uses malloc() to get additional
128    storage for processing capturing parentheses if there are more than 10 of    storage for processing capturing parentheses if there are more than 10 of
# Line 112  library. You can read more about them in Line 132  library. You can read more about them in
132    
133    on the "configure" command.    on the "configure" command.
134    
135  . PCRE has a counter which can be set to limit the amount of resources it uses.  . PCRE has a counter that can be set to limit the amount of resources it uses.
136    If the limit is exceeded during a match, the match fails. The default is ten    If the limit is exceeded during a match, the match fails. The default is ten
137    million. You can change the default by setting, for example,    million. You can change the default by setting, for example,
138    
# Line 130  library. You can read more about them in Line 150  library. You can read more about them in
150    is a representation of the compiled pattern, and this changes with the link    is a representation of the compiled pattern, and this changes with the link
151    size.    size.
152    
153  . You can build PCRE so that its match() function does not call itself  . You can build PCRE so that its internal match() function that is called from
154    recursively. Instead, it uses blocks of data from the heap via special    pcre_exec() does not call itself recursively. Instead, it uses blocks of data
155    functions pcre_stack_malloc() and pcre_stack_free() to save data that would    from the heap via special functions pcre_stack_malloc() and pcre_stack_free()
156    otherwise be saved on the stack. To build PCRE like this, use    to save data that would otherwise be saved on the stack. To build PCRE like
157      this, use
158    
159    --disable-stack-for-recursion    --disable-stack-for-recursion
160    
161    on the "configure" command. PCRE runs more slowly in this mode, but it may be    on the "configure" command. PCRE runs more slowly in this mode, but it may be
162    necessary in environments with limited stack sizes.    necessary in environments with limited stack sizes. This applies only to the
163      pcre_exec() function; it does not apply to pcre_dfa_exec(), which does not
164      use deeply nested recursion.
165    
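The options described above can be combined on one "configure" command line. The following is a minimal dry-run sketch, not a recommended configuration: it only prints the commands, the directories are this README's own placeholder paths, and the particular selection of options is an assumption made for illustration.

```shell
#!/bin/sh
# Dry-run sketch: assemble a "configure" command line and print it.
# The paths are the README's placeholders, not real directories.
src_dir=/source/pcre/pcre-xxx
build_dir=/build/pcre/pcre-xxx

# Illustrative selection only; every option below is optional.
opts="--enable-utf8"                        # include the UTF-8 handling code
opts="$opts --disable-cpp"                  # do not build the C++ wrapper
opts="$opts --disable-stack-for-recursion"  # use the heap in match()

configure_cmd="$src_dir/configure $opts"

# Print the commands instead of running them.
echo "cd $build_dir"
echo "$configure_cmd"
```

To perform a real build, run the two printed commands by hand after replacing pcre-xxx with the actual directory name.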
166    The "configure" script builds eight files for the basic C library:
167    
168    . pcre.h is the header file for C programs that call PCRE
169    . Makefile is the makefile that builds the library
170    . config.h contains build-time configuration options for the library
171    . pcre-config is a script that shows the settings of "configure" options
172    . libpcre.pc is data for the pkg-config command
173    . libtool is a script that builds shared and/or static libraries
174    . RunTest is a script for running tests on the library
175    . RunGrepTest is a script for running tests on the pcregrep command
176    
177  The "configure" script builds seven files:  In addition, if a C++ compiler is found, the following are also built:
178    
179  . pcre.h is build by copying pcre.in and making substitutions  . pcrecpp.h is the header file for programs that call PCRE via the C++ wrapper
180  . Makefile is built by copying Makefile.in and making substitutions.  . pcre_stringpiece.h is the header for the C++ "stringpiece" functions
 . config.h is built by copying config.in and making substitutions.  
 . pcre-config is built by copying pcre-config.in and making substitutions.  
 . libpcre.pc is data for the pkg-config command, built from libpcre.pc.in  
 . libtool is a script that builds shared and/or static libraries  
 . RunTest is a script for running tests  
181    
182  Once "configure" has run, you can run "make". It builds two libraries called  The "configure" script also creates config.status, which is an executable
183    script that can be run to recreate the configuration, and config.log, which
184    contains compiler output from tests that "configure" runs.
185    
186    Once "configure" has run, you can run "make". It builds two libraries, called
187  libpcre and libpcreposix, a test program called pcretest, and the pcregrep  libpcre and libpcreposix, a test program called pcretest, and the pcregrep
188  command. You can use "make install" to copy these, the public header files  command. If a C++ compiler was found on your system, it also builds the C++
189  pcre.h and pcreposix.h, and the man pages to appropriate live directories on  wrapper library, which is called libpcrecpp, and some test programs called
190  your system, in the normal way.  pcrecpp_unittest, pcre_scanner_unittest, and pcre_stringpiece_unittest.
191    
192    The command "make test" runs all the appropriate tests. Details of the PCRE
193    tests are given in a separate section of this document, below.
194    
195    You can use "make install" to copy the libraries, the public header files
196    pcre.h, pcreposix.h, pcrecpp.h, and pcre_stringpiece.h (the last two only if
197    the C++ wrapper was built), and the man pages to appropriate live directories
198    on your system, in the normal way.
199    
200    If you want to remove PCRE from your system, you can run "make uninstall".
201    This removes all the files that "make install" installed. However, it does not
202    remove any directories, because these are often shared with other programs.
203    
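The build, test, and install steps described above can be sketched as a short script. This is a hedged illustration only: the run() helper, the DRY_RUN variable, and the log variable are hypothetical names introduced here, and by default the script prints each command rather than executing it.

```shell
#!/bin/sh
# Dry-run sketch of the build, test, and install sequence.
# run() prints each command; set DRY_RUN=0 to execute the commands
# from a directory where "configure" has already been run.
DRY_RUN=${DRY_RUN:-1}
log=""

run() {
    log="$log$*;"                 # remember what was requested
    if [ "$DRY_RUN" = 1 ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

run make || exit 1                # build the libraries, pcretest, pcregrep
run make test || exit 1           # run all the appropriate tests
run make install || exit 1        # copy libraries, headers, and man pages

# To remove the installed files again (directories are left in place):
# run make uninstall
```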
204    
205  Retrieving configuration information on Unix-like systems  Retrieving configuration information on Unix-like systems
# Line 187  pkgconfig. Line 232  pkgconfig.
232  Shared libraries on Unix-like systems  Shared libraries on Unix-like systems
233  -------------------------------------  -------------------------------------
234    
235  The default distribution builds PCRE as two shared libraries and two static  The default distribution builds PCRE as shared libraries and static libraries,
236  libraries, as long as the operating system supports shared libraries. Shared  as long as the operating system supports shared libraries. Shared library
237  library support relies on the "libtool" script which is built as part of the  support relies on the "libtool" script which is built as part of the
238  "configure" process.  "configure" process.
239    
240  The libtool script is used to compile and link both shared and static  The libtool script is used to compile and link both shared and static
# Line 218  order to cross-compile PCRE for some oth Line 263  order to cross-compile PCRE for some oth
263  process, the dftables.c source file is compiled *and run* on the local host, in  process, the dftables.c source file is compiled *and run* on the local host, in
264  order to generate the default character tables (the chartables.c file). It  order to generate the default character tables (the chartables.c file). It
265  therefore needs to be compiled with the local compiler, not the cross compiler.  therefore needs to be compiled with the local compiler, not the cross compiler.
266  You can do this by specifying CC_FOR_BUILD (and if necessary CFLAGS_FOR_BUILD)  You can do this by specifying CC_FOR_BUILD (and if necessary CFLAGS_FOR_BUILD;
267    there are also CXX_FOR_BUILD and CXXFLAGS_FOR_BUILD for the C++ wrapper)
268  when calling the "configure" command. If they are not specified, they default  when calling the "configure" command. If they are not specified, they default
269  to the values of CC and CFLAGS.  to the values of CC and CFLAGS.
270    
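A cross-compiling "configure" call might look like the following dry-run sketch. The compiler name arm-linux-gcc and the host name arm-linux are hypothetical values chosen for illustration; --host is the standard GNU "configure" option for naming the target system, and CC_FOR_BUILD is the variable described above. The script only prints the command.

```shell
#!/bin/sh
# Dry-run sketch of a cross-compiling "configure" call.
# Hypothetical values: adjust these to your own toolchain.
host=arm-linux            # system the library will run on (assumption)
cross_cc=arm-linux-gcc    # cross compiler (assumption)
build_cc=gcc              # local compiler, used to build and run dftables

cmd="CC=$cross_cc CC_FOR_BUILD=$build_cc ./configure --host=$host"

# Print the command instead of running it.
echo "$cmd"
```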
271    
272    Using HP's ANSI C++ compiler (aCC)
273    ----------------------------------
274    
275    Unless C++ support is disabled by specifying the "--disable-cpp" option of the
276    "configure" script, you *must* include the "-AA" option in the CXXFLAGS
277    environment variable in order for the C++ components to compile correctly.
278    
279    Also, note that the aCC compiler on PA-RISC platforms may have a defect whereby
280    needed libraries fail to get included when specifying the "-AA" compiler
281    option. If you experience unresolved symbols when linking the C++ programs,
282    use the workaround of specifying the following environment variable prior to
283    running the "configure" script:
284    
285      CXXLDFLAGS="-lstd_v2 -lCsup_v2"
286    
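The aCC settings above can be sketched as follows. This is a dry-run illustration that sets the two variables named in this section and prints the resulting command; it assumes the PA-RISC workaround is wanted, which is only the case if linking the C++ programs reports unresolved symbols.

```shell
#!/bin/sh
# Dry-run sketch: environment for building the C++ components with aCC.
CXXFLAGS="-AA"                     # required unless --disable-cpp is used
CXXLDFLAGS="-lstd_v2 -lCsup_v2"    # PA-RISC workaround, only if needed
export CXXFLAGS CXXLDFLAGS

# Print the command instead of running it.
echo "CXXFLAGS=\"$CXXFLAGS\" CXXLDFLAGS=\"$CXXLDFLAGS\" ./configure"
```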
287    
288  Building on non-Unix systems  Building on non-Unix systems
289  ----------------------------  ----------------------------
290    
# Line 240  Testing PCRE Line 302  Testing PCRE
302  ------------  ------------
303    
304  To test PCRE on a Unix system, run the RunTest script that is created by the  To test PCRE on a Unix system, run the RunTest script that is created by the
305  configuring process. (This can also be run by "make runtest", "make check", or  configuring process. There is also a script called RunGrepTest that tests the
306  "make test".) For other systems, see the instructions in NON-UNIX-USE.  options of the pcregrep command. If the C++ wrapper library is built, three
307    test programs called pcrecpp_unittest, pcre_scanner_unittest, and
308  The script runs the pcretest test program (which is documented in its own man  pcre_stringpiece_unittest are provided.
309  page) on each of the testinput files (in the testdata directory) in turn,  
310  and compares the output with the contents of the corresponding testoutput file.  Both the scripts and all the program tests are run if you obey "make runtest",
311  A file called testtry is used to hold the main output from pcretest  "make check", or "make test". For other systems, see the instructions in
312    NON-UNIX-USE.
313    
314    The RunTest script runs the pcretest test program (which is documented in its
315    own man page) on each of the testinput files (in the testdata directory) in
316    turn, and compares the output with the contents of the corresponding testoutput
317    file. A file called testtry is used to hold the main output from pcretest
318  (testsavedregex is also used as a working file). To run pcretest on just one of  (testsavedregex is also used as a working file). To run pcretest on just one of
319  the test files, give its number as an argument to RunTest, for example:  the test files, give its number as an argument to RunTest, for example:
320    
# Line 294  commented in the script, can be used. Line 362  commented in the script, can be used.
362  The fifth test checks error handling with UTF-8 encoding, and internal UTF-8  The fifth test checks error handling with UTF-8 encoding, and internal UTF-8
363  features of PCRE that are not relevant to Perl.  features of PCRE that are not relevant to Perl.
364    
365  The sixth and final test checks the support for Unicode character properties.  The sixth test checks the support for Unicode character properties. It is
366  It is not run automatically unless PCRE is built with Unicode property support.  not run automatically unless PCRE is built with Unicode property support. To do
367  To do this you must set --enable-unicode-properties when running "configure".  this you must set --enable-unicode-properties when running "configure".
368    
369    The seventh, eighth, and ninth tests check the pcre_dfa_exec() alternative
370    matching function, in non-UTF-8 mode, UTF-8 mode, and UTF-8 mode with Unicode
371    property support, respectively. The eighth and ninth tests are not run
372    automatically unless PCRE is built with the relevant support.
373    
374    
375  Character tables  Character tables
# Line 348  The distribution should contain the foll Line 421  The distribution should contain the foll
421    
422    dftables.c            auxiliary program for building chartables.c    dftables.c            auxiliary program for building chartables.c
423    
   get.c                 )  
   maketables.c          )  
   study.c               ) source of the functions  
   pcre.c                )   in the library  
424    pcreposix.c           )    pcreposix.c           )
425    printint.c            )    pcre_compile.c        )
426      pcre_config.c         )
427      pcre_dfa_exec.c       )
428      pcre_exec.c           )
429      pcre_fullinfo.c       )
430      pcre_get.c            ) sources for the functions in the library,
431      pcre_globals.c        )   and some internal functions that they use
432      pcre_info.c           )
433      pcre_maketables.c     )
434      pcre_ord2utf8.c       )
435      pcre_printint.c       )
436      pcre_study.c          )
437      pcre_tables.c         )
438      pcre_try_flipped.c    )
439      pcre_ucp_findchar.c   )
440      pcre_valid_utf8.c     )
441      pcre_version.c        )
442      pcre_xclass.c         )
443    
444    ucp.c                 )    ucp_findchar.c        )
445    ucp.h                 ) source for the code that is used for    ucp.h                 ) source for the code that is used for
446    ucpinternal.h         )   Unicode property handling    ucpinternal.h         )   Unicode property handling
447    ucptable.c            )    ucptable.c            )
# Line 364  The distribution should contain the foll Line 450  The distribution should contain the foll
450    pcre.in               "source" for the header for the external API; pcre.h    pcre.in               "source" for the header for the external API; pcre.h
451                            is built from this by "configure"                            is built from this by "configure"
452    pcreposix.h           header for the external POSIX wrapper API    pcreposix.h           header for the external POSIX wrapper API
453    internal.h            header for internal use    pcre_internal.h       header for internal use
454    config.in             template for config.h, which is built by configure    config.in             template for config.h, which is built by configure
455    
456      pcrecpp.h             the header file for the C++ wrapper
457      pcrecpparg.h.in       "source" for another C++ header file
458      pcrecpp.cc            )
459      pcre_scanner.cc       ) source for the C++ wrapper library
460    
461      pcre_stringpiece.h.in "source" for pcre_stringpiece.h, the header for the
462                              C++ stringpiece functions
463      pcre_stringpiece.cc   source for the C++ stringpiece functions
464    
465  (B) Auxiliary files:  (B) Auxiliary files:
466    
467    AUTHORS               information about the author of PCRE    AUTHORS               information about the author of PCRE
# Line 379  The distribution should contain the foll Line 474  The distribution should contain the foll
474    NON-UNIX-USE          notes on building PCRE on non-Unix systems    NON-UNIX-USE          notes on building PCRE on non-Unix systems
475    README                this file    README                this file
476    RunTest.in            template for a Unix shell script for running tests    RunTest.in            template for a Unix shell script for running tests
477      RunGrepTest.in        template for a Unix shell script for pcregrep tests
478    config.guess          ) files used by libtool,    config.guess          ) files used by libtool,
479    config.sub            )   used only when building a shared library    config.sub            )   used only when building a shared library
480    configure             a configuring shell script (built by autoconf)    configure             a configuring shell script (built by autoconf)
# Line 399  The distribution should contain the foll Line 495  The distribution should contain the foll
495    perltest              Perl test program    perltest              Perl test program
496    pcregrep.c            source of a grep utility that uses PCRE    pcregrep.c            source of a grep utility that uses PCRE
497    pcre-config.in        source of script which retains PCRE information    pcre-config.in        source of script which retains PCRE information
498    testdata/testinput1   test data, compatible with Perl    pcrecpp_unittest.c           )
499    testdata/testinput2   test data for error messages and non-Perl things    pcre_scanner_unittest.c      ) test programs for the C++ wrapper
500    testdata/testinput3   test data for locale-specific tests    pcre_stringpiece_unittest.c  )
501    testdata/testinput4   test data for UTF-8 tests compatible with Perl    testdata/testinput*   test data for main library tests
502    testdata/testinput5   test data for other UTF-8 tests    testdata/testoutput*  expected test results
503    testdata/testinput6   test data for Unicode property support tests    testdata/grep*        input and output for pcregrep tests
   testdata/testoutput1  test results corresponding to testinput1  
   testdata/testoutput2  test results corresponding to testinput2  
   testdata/testoutput3  test results corresponding to testinput3  
   testdata/testoutput4  test results corresponding to testinput4  
   testdata/testoutput5  test results corresponding to testinput5  
   testdata/testoutput6  test results corresponding to testinput6  
504    
505  (C) Auxiliary files for Win32 DLL  (C) Auxiliary files for Win32 DLL
506    
   dll.mk  
507    libpcre.def    libpcre.def
508    libpcreposix.def    libpcreposix.def
509    pcre.def    pcre.def
# Line 423  The distribution should contain the foll Line 512  The distribution should contain the foll
512    
513    makevp.bat    makevp.bat
514    
515  Philip Hazel <ph10@cam.ac.uk>  Philip Hazel
516  September 2004  Email local part: ph10
517    Email domain: cam.ac.uk
518    January 2006
