Diff of /code/trunk/README

--- revision 840 by ph10, Fri Dec 30 19:32:50 2011 UTC
+++ revision 842 by ph10, Sat Dec 31 15:19:04 2011 UTC
# Line 34  The contents of this README file are:
 The PCRE APIs
 -------------
 
 PCRE is written in C, and it has its own API. There are two sets of functions,
 one for the 8-bit library, which processes strings of bytes, and one for the
 16-bit library, which processes strings of 16-bit values. The distribution also
 includes a set of C++ wrapper functions (see the pcrecpp man page for details),
 courtesy of Google Inc., which can be used to call the 8-bit PCRE library from
 C++.
 
 In addition, there is a set of C wrapper functions (again, just for the 8-bit
 library) that are based on the POSIX regular expression API (see the pcreposix
 man page). These end up in the library called libpcreposix. Note that this just
 provides a POSIX calling interface to PCRE; the regular expressions themselves
# Line 171  library. They are also documented in the
   --disable-static
 
   (See also "Shared libraries on Unix-like systems" below.)
 
 . By default, only the 8-bit library is built. If you add --enable-pcre16 to
   the "configure" command, the 16-bit library is also built. If you want only
   the 16-bit library, use "./configure --enable-pcre16 --disable-pcre8".
 
 . If you are building the 8-bit library and want to suppress the building of
   the C++ wrapper library, you can add --disable-cpp to the "configure"
   command. Otherwise, when "configure" is run without --disable-pcre8, it will
   try to find a C++ compiler and C++ header files, and if it succeeds, it will
# Line 200  library. They are also documented in the
   can only either be ASCII or UTF-8/16, even when running on EBCDIC platforms.
   It is not possible to use both --enable-utf and --enable-ebcdic at the same
   time.
-
-. The option --enable-utf8 is retained for backwards compatibility with earlier
-  releases that did not support 16-bit character strings. It is synonymous with
-  --enable-utf. It is not possible to configure one library with UTF support
-  and the other without in the same configuration.
 
-. If, in addition to support for UTF-8/16 character strings, you want to
+. The option --enable-utf8 is retained for backwards compatibility with earlier
+  releases that did not support 16-bit character strings. It is synonymous with
+  --enable-utf. It is not possible to configure one library with UTF support
+  and the other without in the same configuration.
+
+. If, in addition to support for UTF-8/16 character strings, you want to
   include support for the \P, \p, and \X sequences that recognize Unicode
   character properties, you must add --enable-unicode-properties to the
   "configure" command. This adds about 30K to the size of the library (in the
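
As a reading aid for the "configure" options quoted in the two hunks above, the following is a minimal Python sketch of the constraints the README states: --enable-utf8 is a synonym for --enable-utf, --enable-utf cannot be combined with --enable-ebcdic, the 8-bit library is built unless --disable-pcre8 is given, and the 16-bit library is built only with --enable-pcre16. The function name and the shape of the returned dictionary are hypothetical; this is not how "configure" itself is implemented.

```python
def summarize_configure_flags(flags):
    """Summarize a list of PCRE "configure" flags (illustrative helper only)."""
    flags = set(flags)
    # --enable-utf8 is retained as a synonym for --enable-utf.
    utf = "--enable-utf" in flags or "--enable-utf8" in flags
    ebcdic = "--enable-ebcdic" in flags
    # The README says these two options cannot be used at the same time.
    if utf and ebcdic:
        raise ValueError("--enable-utf and --enable-ebcdic cannot be combined")
    return {
        "pcre8": "--disable-pcre8" not in flags,   # built by default
        "pcre16": "--enable-pcre16" in flags,      # opt-in
        "utf": utf,
        "ebcdic": ebcdic,
    }
```

For example, passing ["--enable-pcre16", "--disable-pcre8"] describes the 16-bit-only build mentioned in the README text.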
# Line 264  library. They are also documented in the
   sizes in the pcrestack man page.
 
 . The default maximum compiled pattern size is around 64K. You can increase
   this by adding --with-link-size=3 to the "configure" command. In the 8-bit
   library, PCRE then uses three bytes instead of two for offsets to different
   parts of the compiled pattern. In the 16-bit library, --with-link-size=3 is
   the same as --with-link-size=4, which (in both libraries) uses four-byte
   offsets. Increasing the internal link size reduces performance.
 
 . You can build PCRE so that its internal match() function that is called from
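
The link-size arithmetic in the hunk above can be sketched as follows. This is only an illustration under stated assumptions: both function names are hypothetical, and the second function simply counts the distinct values an n-byte offset can hold in the 8-bit library, which is an upper bound rather than PCRE's exact pattern-size limit.

```python
def effective_link_size(requested, library_bits=8):
    """Link size actually used, per the README text (hypothetical helper)."""
    if requested not in (2, 3, 4):
        raise ValueError("link size must be 2, 3, or 4")
    # In the 16-bit library, --with-link-size=3 is the same as 4.
    if library_bits == 16 and requested == 3:
        return 4
    return requested


def offset_values_8bit(link_size):
    """Number of distinct values an offset of link_size bytes can hold."""
    return 256 ** link_size
```

With the default two-byte offsets, offset_values_8bit(2) gives 65536, which matches the "around 64K" figure quoted in the README.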
# Line 305  library. They are also documented in the
   when PCRE is built this way, it always operates in EBCDIC. It cannot support
   both EBCDIC and UTF-8/16.
 
 . The pcregrep program currently supports only 8-bit data files, and so
   requires the 8-bit PCRE library. It is possible to compile pcregrep to use
   libz and/or libbz2, in order to read .gz and .bz2 files (respectively), by
   specifying one or both of
# Line 397  system. The following are installed (fil
     pcre-config
 
   Libraries (lib):
     libpcre16     (if 16-bit support is enabled)
     libpcre       (if 8-bit support is enabled)
     libpcreposix  (if 8-bit support is enabled)
     libpcrecpp    (if 8-bit and C++ support is enabled)
 
   Configuration information (lib/pkgconfig):
     libpcre16.pc
     libpcre.pc
     libpcreposix.pc
     libpcrecpp.pc (if C++ support is enabled)
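
The conditions attached to the library list in the hunk above can be sketched in Python as below. The function name and its keyword arguments are hypothetical, and the defaults are an assumption (8-bit and C++ support on, 16-bit support off) based on the README's description of the default build.

```python
def installed_libraries(pcre8=True, pcre16=False, cpp=True):
    """List the libraries the README says are installed (illustrative only)."""
    libs = []
    if pcre16:
        libs.append("libpcre16")
    if pcre8:
        libs.append("libpcre")
        libs.append("libpcreposix")
        # The C++ wrapper needs both 8-bit and C++ support.
        if cpp:
            libs.append("libpcrecpp")
    return libs
```

Note that a 16-bit-only build yields just libpcre16, because the POSIX and C++ wrappers are tied to the 8-bit library.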
# Line 592  tests that are marked "never study" (see
 done). If JIT support is available, the non-DFA tests are run a third time,
 this time with a forced pcre_study() with the PCRE_STUDY_JIT_COMPILE option.
 
 When both 8-bit and 16-bit support is enabled, the entire set of tests is run
 twice, once for each library. If you want to run just one set of tests, call
 RunTest with either the -8 or -16 option.
 
-RunTest uses a file called testtry to hold the main output from pcretest
-(testsavedregex is also used as a working file). To run pcretest on just one or
-more specific test files, give their numbers as arguments to RunTest, for
-example:
+RunTest uses a file called testtry to hold the main output from pcretest.
+Other files whose names begin with "test" are used as working files in some
+tests. To run pcretest on just one or more specific test files, give their
+numbers as arguments to RunTest, for example:
 
   RunTest 2 7 11
 
 The first test file can be fed directly into the perltest.pl script to check
 that Perl gives the same results. The only difference you should see is in the
 first few lines, where the Perl version is given instead of the PCRE version.
# Line 658  The twelfth test is run only when JIT su
 test is run only when JIT support is not available. They test some JIT-specific
 features such as information output from pcretest about JIT compilation.
 
 The fourteenth, fifteenth, and sixteenth tests are run only in 8-bit mode, and
 the seventeenth, eighteenth, and nineteenth tests are run only in 16-bit mode.
 These are tests that generate different output in the two modes. They are for
 general cases, UTF-8/16 support, and Unicode property support, respectively.
 
 The twentieth test is run only in 16-bit mode. It tests some specific 16-bit
 features of the DFA matching engine.
 
 
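
The mode-specific test numbers listed in the hunk above can be captured in a small lookup. This Python sketch is illustrative only: the names are hypothetical, it covers just the tests this excerpt names as mode-specific (14 to 20), and it says nothing about how RunTest itself selects tests.

```python
# Tests that the README excerpt says run in only one mode.
MODE_SPECIFIC_TESTS = {
    8: [14, 15, 16],       # general, UTF-8, Unicode properties
    16: [17, 18, 19, 20],  # general, UTF-16, Unicode properties, 16-bit DFA
}


def runs_in_mode(test_number, bits):
    """True/False for a mode-specific test, None if the excerpt does not say."""
    for mode, tests in MODE_SPECIFIC_TESTS.items():
        if test_number in tests:
            return mode == bits
    return None
```

A result of None means the test is not described as mode-specific in this excerpt, not that it is skipped.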
# Line 724  will cause PCRE to malfunction.
 File manifest
 -------------
 
 The distribution should contain the files listed below. Where a file name is
 given as pcre[16]_xxx it means that there are two files, one with the name
 pcre_xxx and the other with the name pcre16_xxx.
 
 (A) Source files of the PCRE library functions and their headers:
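
The pcre[16]_xxx naming convention described in the hunk above expands mechanically into two file names. A minimal Python sketch, with a hypothetical function name:

```python
def expand_manifest_name(name):
    """Expand a pcre[16]_xxx manifest entry into its two file names."""
    prefix = "pcre[16]_"
    if name.startswith(prefix):
        rest = name[len(prefix):]
        # One file for the 8-bit library, one for the 16-bit library.
        return ["pcre_" + rest, "pcre16_" + rest]
    # Entries without the bracketed form name a single file.
    return [name]
```

Entries that do not use the bracketed form, such as pcreposix.h, are returned unchanged as a single-element list.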
# Line 761  pcre_xxx and the other with the name pcr
   pcre16_ord2utf16.c      )
   pcre16_utf16_utils.c    )
   pcre16_valid_utf16.c    )
 
   pcre[16]_printint.c     ) debugging function that is used by pcretest,
                           )   and can also be #included in pcre_compile()
 
   pcre.h.in               template for pcre.h when built by "configure"
   pcreposix.h             header for the external POSIX wrapper API
   pcre_internal.h         header for internal use
# Line 843  pcre_xxx and the other with the name pcr
   testdata/testinput*     test data for main library tests
   testdata/testoutput*    expected test results
   testdata/grep*          input and output for pcregrep tests
   testdata/*              other supporting test files
 
 (D) Auxiliary files for cmake support
 
