Diff of /code/trunk/README

revision 139 by ph10, Fri Mar 30 13:41:47 2007 UTC | revision 391 by ph10, Tue Mar 17 21:16:01 2009 UTC
# Line 1  Line 1 
1  README file for PCRE (Perl-compatible regular expression library)  README file for PCRE (Perl-compatible regular expression library)
2  -----------------------------------------------------------------  -----------------------------------------------------------------
3    
4  The latest release of PCRE is always available from  The latest release of PCRE is always available in three alternative formats
5    from:
6    
7    ftp://ftp.csx.cam.ac.uk/pub/software/programming/pcre/pcre-xxx.tar.gz    ftp://ftp.csx.cam.ac.uk/pub/software/programming/pcre/pcre-xxx.tar.gz
8      ftp://ftp.csx.cam.ac.uk/pub/software/programming/pcre/pcre-xxx.tar.bz2
9      ftp://ftp.csx.cam.ac.uk/pub/software/programming/pcre/pcre-xxx.zip
10    
11  There is a mailing list for discussion about the development of PCRE at  There is a mailing list for discussion about the development of PCRE at
12    
# Line 103  Building PCRE on non-Unix systems Line 106  Building PCRE on non-Unix systems
106    
107  For a non-Unix system, please read the comments in the file NON-UNIX-USE,  For a non-Unix system, please read the comments in the file NON-UNIX-USE,
108  though if your system supports the use of "configure" and "make" you may be  though if your system supports the use of "configure" and "make" you may be
109  able to build PCRE in the same way as for Unix-like systems.  able to build PCRE in the same way as for Unix-like systems. PCRE can also be
110    configured in many platform environments using the GUI facility of CMake's
111    CMakeSetup. It creates Makefiles, solution files, etc.
112    
113  PCRE has been compiled on many different operating systems. It should be  PCRE has been compiled on many different operating systems. It should be
114  straightforward to build PCRE on any system that has a Standard C compiler and  straightforward to build PCRE on any system that has a Standard C compiler and
# Line 116  Building PCRE on Unix-like systems Line 121  Building PCRE on Unix-like systems
121  If you are using HP's ANSI C++ compiler (aCC), please see the special note  If you are using HP's ANSI C++ compiler (aCC), please see the special note
122  in the section entitled "Using HP's ANSI C++ compiler (aCC)" below.  in the section entitled "Using HP's ANSI C++ compiler (aCC)" below.
123    
124    The following instructions assume the use of the widely used "configure, make,
125    make install" process. There is also support for CMake in the PCRE
126    distribution; there are some comments about using CMake in the NON-UNIX-USE
127    file, though it can also be used in Unix-like systems.
128    
129  To build PCRE on a Unix-like system, first run the "configure" command from the  To build PCRE on a Unix-like system, first run the "configure" command from the
130  PCRE distribution directory, with your current directory set to the directory  PCRE distribution directory, with your current directory set to the directory
131  where you want the files to be created. This command is a standard GNU  where you want the files to be created. This command is a standard GNU
# Line 151  library. You can read more about them in Line 161  library. You can read more about them in
161    it will try to find a C++ compiler and C++ header files, and if it succeeds,    it will try to find a C++ compiler and C++ header files, and if it succeeds,
162    it will try to build the C++ wrapper.    it will try to build the C++ wrapper.
163    
164  . If you want to make use of the support for UTF-8 character strings in PCRE,  . If you want to make use of the support for UTF-8 Unicode character strings in
165    you must add --enable-utf8 to the "configure" command. Without it, the code    PCRE, you must add --enable-utf8 to the "configure" command. Without it, the
166    for handling UTF-8 is not included in the library. (Even when included, it    code for handling UTF-8 is not included in the library. Even when included,
167    still has to be enabled by an option at run time.)    it still has to be enabled by an option at run time. When PCRE is compiled
168      with this option, its input can only be either ASCII or UTF-8, even when
169      running on EBCDIC platforms. It is not possible to use both --enable-utf8 and
170      --enable-ebcdic at the same time.
171    
172  . If, in addition to support for UTF-8 character strings, you want to include  . If, in addition to support for UTF-8 character strings, you want to include
173    support for the \P, \p, and \X sequences that recognize Unicode character    support for the \P, \p, and \X sequences that recognize Unicode character
# Line 164  library. You can read more about them in Line 177  library. You can read more about them in
177    supported.    supported.
178    
179  . You can build PCRE to recognize either CR or LF or the sequence CRLF or any  . You can build PCRE to recognize either CR or LF or the sequence CRLF or any
180    of the Unicode newline sequences as indicating the end of a line. Whatever    of the preceding, or any of the Unicode newline sequences as indicating the
181    you specify at build time is the default; the caller of PCRE can change the    end of a line. Whatever you specify at build time is the default; the caller
182    selection at run time. The default newline indicator is a single LF character    of PCRE can change the selection at run time. The default newline indicator
183    (the Unix standard). You can specify the default newline indicator by adding    is a single LF character (the Unix standard). You can specify the default
184    --newline-is-cr or --newline-is-lf or --newline-is-crlf or --newline-is-any    newline indicator by adding --enable-newline-is-cr or --enable-newline-is-lf
185    to the "configure" command, respectively.    or --enable-newline-is-crlf or --enable-newline-is-anycrlf or
186      --enable-newline-is-any to the "configure" command, respectively.
187    If you specify --newline-is-cr or --newline-is-crlf, some of the standard  
188    tests will fail, because the lines in the test files end with LF. Even if    If you specify --enable-newline-is-cr or --enable-newline-is-crlf, some of
189    the files are edited to change the line endings, there are likely to be some    the standard tests will fail, because the lines in the test files end with
190    failures. With --newline-is-any, many tests should succeed, but there may be    LF. Even if the files are edited to change the line endings, there are likely
191    some failures.    to be some failures. With --enable-newline-is-anycrlf or
192      --enable-newline-is-any, many tests should succeed, but there may be some
193      failures.
194    
195    . By default, the sequence \R in a pattern matches any Unicode line ending
196      sequence. This is independent of the option specifying what PCRE considers to
197      be the end of a line (see above). However, the caller of PCRE can restrict \R
198      to match only CR, LF, or CRLF. You can make this the default by adding
199      --enable-bsr-anycrlf to the "configure" command (bsr = "backslash R").
200    
201  . When called via the POSIX interface, PCRE uses malloc() to get additional  . When called via the POSIX interface, PCRE uses malloc() to get additional
202    storage for processing capturing parentheses if there are more than 10 of    storage for processing capturing parentheses if there are more than 10 of
# Line 237  library. You can read more about them in Line 258  library. You can read more about them in
258    pcre_chartables.c.dist. See "Character tables" below for further information.    pcre_chartables.c.dist. See "Character tables" below for further information.
259    
260  . It is possible to compile PCRE for use on systems that use EBCDIC as their  . It is possible to compile PCRE for use on systems that use EBCDIC as their
261    default character code (as opposed to ASCII) by specifying    character code (as opposed to ASCII) by specifying
262    
263    --enable-ebcdic    --enable-ebcdic
264    
265    This automatically implies --enable-rebuild-chartables (see above).    This automatically implies --enable-rebuild-chartables (see above). However,
266      when PCRE is built this way, it always operates in EBCDIC. It cannot support
267      both EBCDIC and UTF-8.
268    
269    . It is possible to compile pcregrep to use libz and/or libbz2, in order to
270      read .gz and .bz2 files (respectively), by specifying one or both of
271    
272      --enable-pcregrep-libz
273      --enable-pcregrep-libbz2
274    
275      Of course, the relevant libraries must be installed on your system.
276    
277    . It is possible to compile pcretest so that it links with the libreadline
278      library, by specifying
279    
280      --enable-pcretest-libreadline
281    
282      If this is done, when pcretest's input is from a terminal, it reads it using
283      the readline() function. This provides line-editing and history facilities.
284      Note that libreadline is GPL-licenced, so if you distribute a binary of
285      pcretest linked in this way, there may be licensing issues.
286    
287      Setting this option causes the -lreadline option to be added to the pcretest
288      build. In many operating environments with a system-installed readline
289      library this is sufficient. However, in some environments (e.g. if an
290      unmodified distribution version of readline is in use), it may be necessary
291      to specify something like LIBS="-lncurses" as well. This is because, to quote
292      the readline INSTALL, "Readline uses the termcap functions, but does not link
293      with the termcap or curses library itself, allowing applications which link
294      with readline the to choose an appropriate library." If you get error
295      messages about missing functions tgetstr, tgetent, tputs, tgetflag, or tgoto,
296      this is the problem, and linking with the ncurses library should fix it.
297    
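As a rough illustration of how the options above combine, the following sketch gathers several of them into one "configure" invocation. It assumes the current directory is an unpacked PCRE distribution; the variable name CONFIGURE_OPTS is only illustrative, and the particular selection of options is an example, not a recommendation. If no "configure" script is present, the sketch only prints the command it would have run.

```shell
#!/bin/sh
# Options taken from the text above. --enable-utf8 and --enable-ebcdic
# cannot be used together, so only one of them appears here.
CONFIGURE_OPTS="--enable-utf8 --enable-newline-is-anycrlf --enable-bsr-anycrlf"
CONFIGURE_OPTS="$CONFIGURE_OPTS --enable-pcregrep-libz --enable-pcregrep-libbz2"
CONFIGURE_OPTS="$CONFIGURE_OPTS --enable-pcretest-libreadline"

# Assumption: we are in the PCRE distribution directory.
if [ -x ./configure ]; then
    # LIBS="-lncurses" is only needed if linking pcretest reports missing
    # tgetstr, tgetent, tputs, tgetflag, or tgoto.
    # $CONFIGURE_OPTS is left unquoted on purpose so it splits into words.
    ./configure $CONFIGURE_OPTS LIBS="-lncurses"
else
    echo "would run: ./configure $CONFIGURE_OPTS LIBS=\"-lncurses\""
fi
```

Dropping the LIBS setting is the simpler starting point; add it only when the readline link errors described above actually appear.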
298  The "configure" script builds the following files for the basic C library:  The "configure" script builds the following files for the basic C library:
299    
# Line 270  script that can be run to recreate the c Line 322  script that can be run to recreate the c
322  contains compiler output from tests that "configure" runs.  contains compiler output from tests that "configure" runs.
323    
324  Once "configure" has run, you can run "make". It builds two libraries, called  Once "configure" has run, you can run "make". It builds two libraries, called
325  libpcre and libpcreposix, a test program called pcretest, a demonstration  libpcre and libpcreposix, a test program called pcretest, and the pcregrep
326  program called pcredemo, and the pcregrep command. If a C++ compiler was found  command. If a C++ compiler was found on your system, "make" also builds the C++
327  on your system, "make" also builds the C++ wrapper library, which is called  wrapper library, which is called libpcrecpp, and some test programs called
328  libpcrecpp, and some test programs called pcrecpp_unittest,  pcrecpp_unittest, pcre_scanner_unittest, and pcre_stringpiece_unittest.
329  pcre_scanner_unittest, and pcre_stringpiece_unittest. Building the C++ wrapper  Building the C++ wrapper can be disabled by adding --disable-cpp to the
330  can be disabled by adding --disable-cpp to the "configure" command.  "configure" command.
331    
332  The command "make check" runs all the appropriate tests. Details of the PCRE  The command "make check" runs all the appropriate tests. Details of the PCRE
333  tests are given below in a separate section of this document.  tests are given below in a separate section of this document.
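The build and test steps just described might be run as in the sketch below. It assumes "configure" has already been run in the current directory, which it checks by looking for the Makefile and config.h that "configure" creates. The variable name STATUS is illustrative only.

```shell
#!/bin/sh
# Assumption: "configure" has already produced Makefile and config.h here.
if [ -f Makefile ] && [ -f config.h ]; then
    # Build the libraries and programs, then run the test suite.
    make && make check
    STATUS=$?
else
    echo "not configured here; run configure first, then: make && make check"
    STATUS=0
fi
```

A non-zero STATUS after the first branch means that either the build or one of the tests failed.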
# Line 327  system. The following are installed (fil Line 379  system. The following are installed (fil
379      pcretest.txt   the pcretest man page      pcretest.txt   the pcretest man page
380      pcregrep.txt   the pcregrep man page      pcregrep.txt   the pcregrep man page
381    
 Note that the pcredemo program that is built by "configure" is *not* installed  
 anywhere. It is a demonstration for programmers wanting to use PCRE.  
   
382  If you want to remove PCRE from your system, you can run "make uninstall".  If you want to remove PCRE from your system, you can run "make uninstall".
383  This removes all the files that "make install" installed. However, it does not  This removes all the files that "make install" installed. However, it does not
384  remove any directories, because these are often shared with other programs.  remove any directories, because these are often shared with other programs.
# Line 429  Making new tarballs Line 478  Making new tarballs
478  -------------------  -------------------
479    
480  The command "make dist" creates three PCRE tarballs, in tar.gz, tar.bz2, and  The command "make dist" creates three PCRE tarballs, in tar.gz, tar.bz2, and
481  zip formats. However, if you have modified any of the man page sources in the  zip formats. The command "make distcheck" does the same, but then does a trial
482  doc directory, you should first run the PrepareRelease script. This re-creates  build of the new distribution to ensure that it works.
483  the .txt and HTML forms of the documentation from the man pages.  
484    If you have modified any of the man page sources in the doc directory, you
485    should first run the PrepareRelease script before making a distribution. This
486    script creates the .txt and HTML forms of the documentation from the man pages.
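A possible sequence for producing a release after editing the man pages is sketched below. It assumes a configured source tree with the PrepareRelease script in the current directory; running the script with no arguments is an assumption, so check the script's own comments first. The variable name DIST_STATUS is illustrative only.

```shell
#!/bin/sh
# Assumption: configured PCRE source tree, PrepareRelease present here.
if [ -f Makefile ] && [ -f ./PrepareRelease ]; then
    # Regenerate the .txt and HTML documentation, then build the tarballs
    # and do a trial build of the new distribution.
    ./PrepareRelease && make distcheck
    DIST_STATUS=$?
else
    echo "would run: ./PrepareRelease && make distcheck"
    DIST_STATUS=0
fi
```

Using "make dist" in place of "make distcheck" creates the same three tarballs without the trial build.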
487    
488    
489  Testing PCRE  Testing PCRE
# Line 489  is output to say why. If running this te Line 541  is output to say why. If running this te
541  in the comparison output, it means that locale is not available on your system,  in the comparison output, it means that locale is not available on your system,
542  despite being listed by "locale". This does not mean that PCRE is broken.  despite being listed by "locale". This does not mean that PCRE is broken.
543    
544  [If you are trying to run this test on Windows, you may be able to get it to  [If you are trying to run this test on Windows, you may be able to get it to
545  work by changing "fr_FR" to "french" everywhere it occurs.]  work by changing "fr_FR" to "french" everywhere it occurs. Alternatively, use
546    RunTest.bat. The version of RunTest.bat included with PCRE 7.4 and above uses
547    Windows versions of test 2. More info on using RunTest.bat is included in the
548    document entitled NON-UNIX-USE.]
549    
550  The fourth test checks the UTF-8 support. It is not run automatically unless  The fourth test checks the UTF-8 support. It is not run automatically unless
551  PCRE is built with UTF-8 support. To do this you must set --enable-utf8 when  PCRE is built with UTF-8 support. To do this you must set --enable-utf8 when
# Line 595  The distribution should contain the foll Line 650  The distribution should contain the foll
650    pcre_study.c            )    pcre_study.c            )
651    pcre_tables.c           )    pcre_tables.c           )
652    pcre_try_flipped.c      )    pcre_try_flipped.c      )
653    pcre_ucp_searchfuncs.c  )    pcre_ucd.c              )
654    pcre_valid_utf8.c       )    pcre_valid_utf8.c       )
655    pcre_version.c          )    pcre_version.c          )
656    pcre_xclass.c           )    pcre_xclass.c           )
# Line 604  The distribution should contain the foll Line 659  The distribution should contain the foll
659    pcre.h.in               template for pcre.h when built by "configure"    pcre.h.in               template for pcre.h when built by "configure"
660    pcreposix.h             header for the external POSIX wrapper API    pcreposix.h             header for the external POSIX wrapper API
661    pcre_internal.h         header for internal use    pcre_internal.h         header for internal use
662    ucp.h                   ) headers concerned with    ucp.h                   header for Unicode property handling
   ucpinternal.h           )   Unicode property handling  
   ucptable.h              ) (this one is the data table)  
663    
664    config.h.in             template for config.h, which is built by "configure"    config.h.in             template for config.h, which is built by "configure"
665    
# Line 680  The distribution should contain the foll Line 733  The distribution should contain the foll
733    
734  (D) Auxiliary files for cmake support  (D) Auxiliary files for cmake support
735    
736      cmake/COPYING-CMAKE-SCRIPTS
737      cmake/FindPackageHandleStandardArgs.cmake
738      cmake/FindReadline.cmake
739    CMakeLists.txt    CMakeLists.txt
740    config-cmake.h.in    config-cmake.h.in
741    
# Line 704  The distribution should contain the foll Line 760  The distribution should contain the foll
760  Philip Hazel  Philip Hazel
761  Email local part: ph10  Email local part: ph10
762  Email domain: cam.ac.uk  Email domain: cam.ac.uk
763  Last updated: 29 March 2007  Last updated: 17 March 2009
