
Diff of /code/trunk/README


revision 77 by nigel, Sat Feb 24 21:40:45 2007 UTC revision 99 by ph10, Tue Mar 6 12:27:42 2007 UTC
# Line 34  Documentation for PCRE Line 34  Documentation for PCRE
34  ----------------------  ----------------------
35    
36  If you install PCRE in the normal way, you will end up with an installed set of  If you install PCRE in the normal way, you will end up with an installed set of
37  man pages whose names all start with "pcre". The one that is called "pcre"  man pages whose names all start with "pcre". The one that is just called "pcre"
38  lists all the others. In addition to these man pages, the PCRE documentation is  lists all the others. In addition to these man pages, the PCRE documentation is
39  supplied in two other forms; however, as there is no standard place to install  supplied in two other forms; however, as there is no standard place to install
40  them, they are left in the doc directory of the unpacked source distribution.  them, they are left in the doc directory of the unpacked source distribution.
# Line 68  others are pointers to URLs containing r Line 68  others are pointers to URLs containing r
68  Building PCRE on a Unix-like system  Building PCRE on a Unix-like system
69  -----------------------------------  -----------------------------------
70    
71    If you are using HP's ANSI C++ compiler (aCC), please see the special note
72    in the section entitled "Using HP's ANSI C++ compiler (aCC)" below.
73    
74  To build PCRE on a Unix-like system, first run the "configure" command from the  To build PCRE on a Unix-like system, first run the "configure" command from the
75  PCRE distribution directory, with your current directory set to the directory  PCRE distribution directory, with your current directory set to the directory
76  where you want the files to be created. This command is a standard GNU  where you want the files to be created. This command is a standard GNU
# Line 91  into /source/pcre/pcre-xxx, but you want Line 94  into /source/pcre/pcre-xxx, but you want
94  cd /build/pcre/pcre-xxx  cd /build/pcre/pcre-xxx
95  /source/pcre/pcre-xxx/configure  /source/pcre/pcre-xxx/configure
96    
97    PCRE is written in C and is normally compiled as a C library. However, it is
98    possible to build it as a C++ library, though the provided building apparatus
99    does not have any features to support this.
100    
101  There are some optional features that can be included or omitted from the PCRE  There are some optional features that can be included or omitted from the PCRE
102  library. You can read more about them in the pcrebuild man page.  library. You can read more about them in the pcrebuild man page.
103    
104    . If you want to suppress the building of the C++ wrapper library, you can add
105      --disable-cpp to the "configure" command. Otherwise, when "configure" is run,
106      it will try to find a C++ compiler and C++ header files, and if it succeeds, it
107      will try to build the C++ wrapper.
108    
109  . If you want to make use of the support for UTF-8 character strings in PCRE,  . If you want to make use of the support for UTF-8 character strings in PCRE,
110    you must add --enable-utf8 to the "configure" command. Without it, the code    you must add --enable-utf8 to the "configure" command. Without it, the code
111    for handling UTF-8 is not included in the library. (Even when included, it    for handling UTF-8 is not included in the library. (Even when included, it
# Line 102  library. You can read more about them in Line 114  library. You can read more about them in
114  . If, in addition to support for UTF-8 character strings, you want to include  . If, in addition to support for UTF-8 character strings, you want to include
115    support for the \P, \p, and \X sequences that recognize Unicode character    support for the \P, \p, and \X sequences that recognize Unicode character
116    properties, you must add --enable-unicode-properties to the "configure"    properties, you must add --enable-unicode-properties to the "configure"
117    command. This adds about 90K to the size of the library (in the form of a    command. This adds about 30K to the size of the library (in the form of a
118    property table); only the basic two-letter properties such as Lu are    property table); only the basic two-letter properties such as Lu are
119    supported.    supported.
120    
121  . You can build PCRE to recognized CR or NL as the newline character, instead  . You can build PCRE to recognize either CR or LF or the sequence CRLF or any
122    of whatever your compiler uses for "\n", by adding --newline-is-cr or    of the Unicode newline sequences as indicating the end of a line. Whatever
123    --newline-is-nl to the "configure" command, respectively. Only do this if you    you specify at build time is the default; the caller of PCRE can change the
124    really understand what you are doing. On traditional Unix-like systems, the    selection at run time. The default newline indicator is a single LF character
125    newline character is NL.    (the Unix standard). You can specify the default newline indicator by adding
126      --newline-is-cr or --newline-is-lf or --newline-is-crlf or --newline-is-any
127      to the "configure" command, respectively.
128    
129      If you specify --newline-is-cr or --newline-is-crlf, some of the standard
130      tests will fail, because the lines in the test files end with LF. Even if
131      the files are edited to change the line endings, there are likely to be some
132      failures. With --newline-is-any, many tests should succeed, but there may be
133      some failures.
134    
135  . When called via the POSIX interface, PCRE uses malloc() to get additional  . When called via the POSIX interface, PCRE uses malloc() to get additional
136    storage for processing capturing parentheses if there are more than 10 of    storage for processing capturing parentheses if there are more than 10 of
# Line 130  library. You can read more about them in Line 150  library. You can read more about them in
150    pcre_exec() can supply their own value. There is discussion on the pcreapi    pcre_exec() can supply their own value. There is discussion on the pcreapi
151    man page.    man page.
152    
153    . There is a separate counter that limits the depth of recursive function calls
154      during a matching process. This also has a default of ten million, which is
155      essentially "unlimited". You can change the default by setting, for example,
156    
157      --with-match-limit-recursion=500000
158    
159      Recursive function calls use up the runtime stack; running out of stack can
160      cause programs to crash in strange ways. There is a discussion about stack
161      sizes in the pcrestack man page.
162    
163  . The default maximum compiled pattern size is around 64K. You can increase  . The default maximum compiled pattern size is around 64K. You can increase
164    this by adding --with-link-size=3 to the "configure" command. You can    this by adding --with-link-size=3 to the "configure" command. You can
165    increase it even more by setting --with-link-size=4, but this is unlikely    increase it even more by setting --with-link-size=4, but this is unlikely
# Line 153  library. You can read more about them in Line 183  library. You can read more about them in
183    
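As a hedged illustration of how the optional features described above combine, the sketch below prints one possible "configure" command line. The source path reuses this README's /source/pcre/pcre-xxx example; the SRC variable and the pcre_configure_cmd function are names invented for this sketch, and the option values are examples only, not recommendations.

```shell
#!/bin/sh
# Sketch only. SRC is a hypothetical variable; its default is the example
# source path used elsewhere in this README.
SRC=${SRC:-/source/pcre/pcre-xxx}

# Print a "configure" command line built from options named in this README.
# The function name and the particular selection of options are assumptions.
pcre_configure_cmd() {
    printf '%s' "$SRC/configure"
    # printf reuses the format for every remaining argument, so each option
    # is emitted with one leading space.
    printf ' %s' \
        --enable-utf8 \
        --enable-unicode-properties \
        --newline-is-lf \
        --with-match-limit-recursion=500000 \
        --with-link-size=3
    printf '\n'
}

# Show the command instead of running it, so it can be reviewed first.
pcre_configure_cmd
```

To use it, change to the build directory (for example /build/pcre/pcre-xxx), check the printed line, and then run that line by hand.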
184  The "configure" script builds eight files for the basic C library:  The "configure" script builds eight files for the basic C library:
185    
 . pcre.h is the header file for C programs that call PCRE  
186  . Makefile is the makefile that builds the library  . Makefile is the makefile that builds the library
187  . config.h contains build-time configuration options for the library  . config.h contains build-time configuration options for the library
188  . pcre-config is a script that shows the settings of "configure" options  . pcre-config is a script that shows the settings of "configure" options
# Line 257  when calling the "configure" command. If Line 286  when calling the "configure" command. If
286  to the values of CC and CFLAGS.  to the values of CC and CFLAGS.
287    
288    
289    Using HP's ANSI C++ compiler (aCC)
290    ----------------------------------
291    
292    Unless C++ support is disabled by specifying the "--disable-cpp" option of the
293    "configure" script, you *must* include the "-AA" option in the CXXFLAGS
294    environment variable in order for the C++ components to compile correctly.
295    
296    Also, note that the aCC compiler on PA-RISC platforms may have a defect whereby
297    needed libraries fail to get included when specifying the "-AA" compiler
298    option. If you experience unresolved symbols when linking the C++ programs,
299    use the workaround of specifying the following environment variable prior to
300    running the "configure" script:
301    
302      CXXLDFLAGS="-lstd_v2 -lCsup_v2"
303    
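The environment settings described in this section might be prepared as in the sketch below. The "-AA" option and the CXXLDFLAGS value are taken from the text above; the NEED_WORKAROUND switch is a hypothetical name introduced only for this example, and the sketch assumes a POSIX shell.

```shell
#!/bin/sh
# Sketch only. NEED_WORKAROUND is a hypothetical switch: set it to "yes" if
# linking the C++ programs reports unresolved symbols.
NEED_WORKAROUND=${NEED_WORKAROUND:-no}

# Append "-AA" to any CXXFLAGS that are already set. This is required for the
# C++ components unless "configure" is run with --disable-cpp.
CXXFLAGS="${CXXFLAGS:+$CXXFLAGS }-AA"
export CXXFLAGS

# Workaround for the possible aCC defect on PA-RISC platforms.
if [ "$NEED_WORKAROUND" = yes ]; then
    CXXLDFLAGS="-lstd_v2 -lCsup_v2"
    export CXXLDFLAGS
fi

# Report the settings; "configure" would then be run from this same shell.
echo "CXXFLAGS=$CXXFLAGS"
echo "CXXLDFLAGS=${CXXLDFLAGS:-}"
```

Because the variables are exported, a "configure" run started from the same shell session inherits them.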
304    
305  Building on non-Unix systems  Building on non-Unix systems
306  ----------------------------  ----------------------------
307    
# Line 266  PCRE in the same way as for Unix systems Line 311  PCRE in the same way as for Unix systems
311    
312  PCRE has been compiled on Windows systems and on Macintoshes, but I don't know  PCRE has been compiled on Windows systems and on Macintoshes, but I don't know
313  the details because I don't use those systems. It should be straightforward to  the details because I don't use those systems. It should be straightforward to
314  build PCRE on any system that has a Standard C compiler, because it uses only  build PCRE on any system that has a Standard C compiler and library, because it
315  Standard C functions.  uses only Standard C functions.
316    
317    
318  Testing PCRE  Testing PCRE
# Line 286  NON-UNIX-USE. Line 331  NON-UNIX-USE.
331  The RunTest script runs the pcretest test program (which is documented in its  The RunTest script runs the pcretest test program (which is documented in its
332  own man page) on each of the testinput files (in the testdata directory) in  own man page) on each of the testinput files (in the testdata directory) in
333  turn, and compares the output with the contents of the corresponding testoutput  turn, and compares the output with the contents of the corresponding testoutput
334  file. A file called testtry is used to hold the main output from pcretest  files. A file called testtry is used to hold the main output from pcretest
335  (testsavedregex is also used as a working file). To run pcretest on just one of  (testsavedregex is also used as a working file). To run pcretest on just one of
336  the test files, give its number as an argument to RunTest, for example:  the test files, give its number as an argument to RunTest, for example:
337    
338    RunTest 2    RunTest 2
339    
340  The first file can also be fed directly into the perltest script to check that  The first test file can also be fed directly into the perltest script to check
341  Perl gives the same results. The only difference you should see is in the first  that Perl gives the same results. The only difference you should see is in the
342  few lines, where the Perl version is given instead of the PCRE version.  first few lines, where the Perl version is given instead of the PCRE version.
343    
344  The second set of tests check pcre_fullinfo(), pcre_info(), pcre_study(),  The second set of tests check pcre_fullinfo(), pcre_info(), pcre_study(),
345  pcre_copy_substring(), pcre_get_substring(), pcre_get_substring_list(), error  pcre_copy_substring(), pcre_get_substring(), pcre_get_substring_list(), error
# Line 403  The distribution should contain the foll Line 448  The distribution should contain the foll
448    pcre_globals.c        )   and some internal functions that they use    pcre_globals.c        )   and some internal functions that they use
449    pcre_info.c           )    pcre_info.c           )
450    pcre_maketables.c     )    pcre_maketables.c     )
451      pcre_newline.c        )
452    pcre_ord2utf8.c       )    pcre_ord2utf8.c       )
453    pcre_printint.c       )    pcre_refcount.c       )
454    pcre_study.c          )    pcre_study.c          )
455    pcre_tables.c         )    pcre_tables.c         )
456    pcre_try_flipped.c    )    pcre_try_flipped.c    )
457    pcre_ucp_findchar.c   )    pcre_ucp_searchfuncs.c)
458    pcre_valid_utf8.c     )    pcre_valid_utf8.c     )
459    pcre_version.c        )    pcre_version.c        )
460    pcre_xclass.c         )    pcre_xclass.c         )
461    
462    ucp_findchar.c        )    pcre_printint.src     ) debugging function that is #included in pcretest, and
463    ucp.h                 ) source for the code that is used for                          )   can also be #included in pcre_compile()
   ucpinternal.h         )   Unicode property handling  
   ucptable.c            )  
   ucptypetable.c        )  
464    
465    pcre.in               "source" for the header for the external API; pcre.h    pcre.h                the public PCRE header file
                           is built from this by "configure"  
466    pcreposix.h           header for the external POSIX wrapper API    pcreposix.h           header for the external POSIX wrapper API
467    pcre_internal.h       header for internal use    pcre_internal.h       header for internal use
468      ucp.h                 ) headers concerned with
469      ucpinternal.h         )   Unicode property handling
470      ucptable.h            ) (this one is the data table)
471    config.in             template for config.h, which is built by configure    config.in             template for config.h, which is built by configure
472    
473    pcrecpp.h.in          "source" for the header file for the C++ wrapper    pcrecpp.h             the header file for the C++ wrapper
474      pcrecpparg.h.in       "source" for another C++ header file
475    pcrecpp.cc            )    pcrecpp.cc            )
476    pcre_scanner.cc       ) source for the C++ wrapper library    pcre_scanner.cc       ) source for the C++ wrapper library
477    
# Line 448  The distribution should contain the foll Line 494  The distribution should contain the foll
494    RunGrepTest.in        template for a Unix shell script for pcregrep tests    RunGrepTest.in        template for a Unix shell script for pcregrep tests
495    config.guess          ) files used by libtool,    config.guess          ) files used by libtool,
496    config.sub            )   used only when building a shared library    config.sub            )   used only when building a shared library
497      config.h.in           "source" for the config.h header file
498    configure             a configuring shell script (built by autoconf)    configure             a configuring shell script (built by autoconf)
499    configure.in          the autoconf input used to build configure    configure.ac          the autoconf input used to build configure
500    doc/Tech.Notes        notes on the encoding    doc/Tech.Notes        notes on the encoding
501    doc/*.3               man page sources for the PCRE functions    doc/*.3               man page sources for the PCRE functions
502    doc/*.1               man page sources for pcregrep and pcretest    doc/*.1               man page sources for pcregrep and pcretest
# Line 463  The distribution should contain the foll Line 510  The distribution should contain the foll
510    mkinstalldirs         script for making install directories    mkinstalldirs         script for making install directories
511    pcretest.c            comprehensive test program    pcretest.c            comprehensive test program
512    pcredemo.c            simple demonstration of coding calls to PCRE    pcredemo.c            simple demonstration of coding calls to PCRE
513    perltest              Perl test program    perltest.pl           Perl test program
514    pcregrep.c            source of a grep utility that uses PCRE    pcregrep.c            source of a grep utility that uses PCRE
515    pcre-config.in        source of script which retains PCRE information    pcre-config.in        source of script which retains PCRE information
516    pcrecpp_unittest.c           )    pcrecpp_unittest.c           )
# Line 477  The distribution should contain the foll Line 524  The distribution should contain the foll
524    
525    libpcre.def    libpcre.def
526    libpcreposix.def    libpcreposix.def
   pcre.def  
527    
528  (D) Auxiliary file for VPASCAL  (D) Auxiliary file for VPASCAL
529    
# Line 486  The distribution should contain the foll Line 532  The distribution should contain the foll
532  Philip Hazel  Philip Hazel
533  Email local part: ph10  Email local part: ph10
534  Email domain: cam.ac.uk  Email domain: cam.ac.uk
535  June 2005  March 2007

