--- code/trunk/doc/html/pcre.html 2007/02/24 21:40:37 75
+++ code/trunk/doc/html/pcre.html 2007/02/24 21:40:45 77
@@ -23,16 +23,27 @@
The PCRE library is a set of functions that implement regular expression pattern matching using the same syntax and semantics as Perl, with just a few
-differences. The current implementation of PCRE (release 5.x) corresponds
+differences. The current implementation of PCRE (release 6.x) corresponds
approximately with Perl 5.8, including support for UTF-8 encoded strings and Unicode general category properties. However, this support has to be explicitly enabled; it is not the default.
+In addition to the Perl-compatible matching function, PCRE also contains an
+alternative matching function that matches the same compiled patterns in a
+different way. In certain circumstances, the alternative function has some
+advantages. For a discussion of the two matching algorithms, see the
+pcrematching
+page.
+
+
PCRE is written in C and released as a C library. A number of people have
-written wrappers and interfaces of various kinds. A C++ class is included in
-these contributions, which can be found in the Contrib directory at the
-primary FTP site, which is:
+written wrappers and interfaces of various kinds. In particular, Google Inc.
+have provided a comprehensive C++ wrapper. This is now included as part of the
+PCRE distribution. The
+pcrecpp
+page has details of this interface. Other people's contributions can be found
+in the Contrib directory at the primary FTP site, which is:
ftp://ftp.csx.cam.ac.uk/pub/software/programming/pcre
@@ -53,6 +64,12 @@
page. Documentation about building PCRE for various operating systems can be found in the README file in the source distribution.
+
+The library contains a number of undocumented internal functions and data
+tables that are used by more than one of the exported external functions, but
+which are not intended for use by external callers. Their names all begin with
+"_pcre_", which hopefully will not provoke any name clashes.
+
The user documentation for PCRE comprises a number of different sections. In
@@ -62,21 +79,23 @@
follows:
  pcre              this document
- pcreapi           details of PCRE's native API
+ pcreapi           details of PCRE's native C API
  pcrebuild         options for building PCRE
  pcrecallout       details of the callout feature
  pcrecompat        discussion of Perl compatibility
+ pcrecpp           details of the C++ wrapper
  pcregrep          description of the pcregrep command
+ pcrematching      discussion of the two matching algorithms
  pcrepartial       details of the partial matching facility
  pcrepattern       syntax and semantics of supported regular expressions
  pcreperform       discussion of performance issues
- pcreposix         the POSIX-compatible API
+ pcreposix         the POSIX-compatible C API
  pcreprecompile    details of saving and re-using precompiled patterns
  pcresample        discussion of the sample program
  pcretest          description of the pcretest testing command
In addition, in the "man" and HTML formats, there is a short page for each
-library function, listing its arguments and results.
+C library function, listing its arguments and results.
@@ -104,9 +123,10 @@
The maximum length of a subject string is the largest positive number that an
-integer variable can hold. However, PCRE uses recursion to handle subpatterns
-and indefinite repetition. This means that the available stack space may limit
-the size of a subject string that can be processed by certain patterns.
+integer variable can hold. However, when using the traditional matching
+function, PCRE uses recursion to handle subpatterns and indefinite repetition.
+This means that the available stack space may limit the size of a subject
+string that can be processed by certain patterns.
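As an illustration of why recursion ties subject length to stack usage, here is a minimal sketch in C. It is a toy matcher written for this note and not PCRE's code: the function name match_a_star_b and its depth bookkeeping are hypothetical. It matches the pattern a*b against a whole string by recursing once per repeated 'a', so the call depth grows linearly with the run of 'a' characters.

```c
#include <assert.h>
#include <stddef.h>

/* Toy matcher (hypothetical, not part of PCRE): matches the whole of s
   against the pattern a*b. Each repeated 'a' costs one nested call, and
   *max_depth records the deepest call reached. */
static int match_a_star_b(const char *s, size_t depth, size_t *max_depth)
{
    assert(s != NULL && max_depth != NULL);
    if (depth > *max_depth)
        *max_depth = depth;

    /* Greedy branch: consume one 'a' and recurse for the rest. */
    if (*s == 'a' && match_a_star_b(s + 1, depth + 1, max_depth))
        return 1;

    /* Fallback: stop repeating and require a final 'b'. */
    return s[0] == 'b' && s[1] == '\0';
}
```

Calling match_a_star_b("aaab", 0, &d) with d initialised to 0 returns 1 and leaves d at 3, one level per 'a'. A subject with a very long run of 'a' characters would need a correspondingly deep stack, which is the kind of limit the paragraph above describes.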
@@ -174,7 +194,8 @@
6. The escape sequence \C can be used to match a single byte in UTF-8 mode,
-but its use can lead to some strange effects.
+but its use can lead to some strange effects. This facility is not available in
+the alternative matching function, pcre_dfa_exec().
7. The character escapes \b, \B, \d, \D, \s, \S, \w, and \W correctly
@@ -199,16 +220,19 @@
-Philip Hazel <email@example.com>
University Computing Service,
Cambridge CB2 3QG, England.
+
+Putting an actual email address here seems to have been a spam magnet, so I've
+taken it away. If you want to email me, use my initial and surname, separated
+by a dot, at the domain ucs.cam.ac.uk.
+Last updated: 07 March 2005
-Phone: +44 1223 334714
-Last updated: 09 September 2004
-
-Copyright © 1997-2004 University of Cambridge.
+Copyright © 1997-2005 University of Cambridge.
Return to the PCRE index page.