--- code/trunk/doc/html/pcrecompat.html 2007/02/24 21:40:37 75
+++ code/trunk/doc/html/pcrecompat.html 2007/02/24 21:41:42 93
@@ -17,12 +17,13 @@

 This document describes the differences in the ways that PCRE and Perl handle
-regular expressions. The differences described here are with respect to Perl
-5.8.
+regular expressions. The differences described here are mainly with respect to
+Perl 5.8, though PCRE version 7.0 contains some features that are expected to
+be in the forthcoming Perl 5.10.

-1. PCRE does not have full UTF-8 support. Details of what it does have are
-given in the
+1. PCRE has only a subset of Perl's UTF-8 and Unicode support. Details of what
+it does have are given in the
 section on UTF-8 support in the main pcre
@@ -57,7 +58,8 @@
 6. The Perl escape sequences \p, \P, and \X are supported only if PCRE is built with Unicode character property support. The properties that can be tested with \p and \P are limited to the general category properties such as
-Lu and Nd.
+Lu and Nd, script names such as Greek or Han, and the derived properties Any
+and L&.

 7. PCRE does support the \Q...\E escape for quoting substrings. Characters in
@@ -75,20 +77,28 @@
 The \Q...\E sequence is recognized both inside and outside character classes.

-8. Fairly obviously, PCRE does not support the (?{code}) and (?p{code})
-constructions. However, there is support for recursive patterns using the
-non-Perl items (?R), (?number), and (?P>name). Also, the PCRE "callout" feature
-allows an external function to be called during pattern matching. See the
+8. Fairly obviously, PCRE does not support the (?{code}) and (??{code})
+constructions. However, there is support for recursive patterns. This is not
+available in Perl 5.8, but will be in Perl 5.10. Also, the PCRE "callout"
+feature allows an external function to be called during pattern matching. See
+the
 pcrecallout documentation for details.

-9. There are some differences that are concerned with the settings of captured
+9. Subpatterns that are called recursively or as "subroutines" are always
+treated as atomic groups in PCRE. This is like Python, but unlike Perl.
+
+
+10. There are some differences that are concerned with the settings of captured
 strings when part of a pattern is repeated. For example, matching "aba" against the pattern /^(a(b)?)+$/ in Perl leaves $2 unset, but in PCRE it is set to "b".
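
The captured-string example in item 10 can be tried in any engine that exposes numbered groups. The sketch below is only an illustration and uses Python's standard re module as a stand-in; it is not PCRE and not Perl, and the variable names are made up for this example. Python's engine does not clear an inner group when a later iteration of the outer repeat skips it, so the second group should keep the "b" captured on the first iteration, which is the outcome the text attributes to PCRE.

```python
import re

# Assumption: Python's re module stands in for a regex engine here.
# It is neither PCRE nor Perl; it only illustrates the capture question.
pattern = re.compile(r"^(a(b)?)+$")
match = pattern.match("aba")

# Group 1 holds the last iteration of the outer repeat.
# Group 2 took part only in the first iteration ("ab").
groups = match.groups()
print("group 1:", groups[0])
print("group 2:", groups[1])  # None would correspond to Perl's "unset"
```

A result of None for group 2 would correspond to the Perl behaviour described above, and a retained "b" to the PCRE behaviour.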

-10. PCRE provides some extensions to the Perl regular expression facilities:
+11. PCRE provides some extensions to the Perl regular expression facilities.
+Perl 5.10 will include new features that are not in earlier versions, some of
+which (such as named parentheses) have been in PCRE for some time. This list is
+with respect to Perl 5.10:

 (a) Although lookbehind assertions must match fixed length strings, each
@@ -101,7 +111,8 @@

 (c) If PCRE_EXTRA is set, a backslash followed by a letter with no special
-meaning is faulted.
+meaning is faulted. Otherwise, like Perl, the backslash is ignored. (Perl can
+be made to issue a warning.)

 (d) If PCRE_UNGREEDY is set, the greediness of the repetition quantifiers is
@@ -117,34 +128,23 @@
 options for pcre_exec() have no Perl equivalents.

-(g) The (?R), (?number), and (?P>name) constructs allows for recursive pattern
-matching (Perl can do this using the (?p{code}) construct, which PCRE cannot
-support.)
-
-
-(h) PCRE supports named capturing substrings, using the Python syntax.
-
+(g) The callout facility is PCRE-specific.
-(i) PCRE supports the possessive quantifier "++" syntax, taken from Sun's Java
-package.
+(h) The partial matching facility is PCRE-specific.
-(j) The (R) condition, for testing recursion, is a PCRE extension.
-
-(k) The callout facility is PCRE-specific.
-
-
-(l) The partial matching facility is PCRE-specific.
+(i) Patterns compiled by PCRE can be saved and re-used at a later time, even on
+different hosts that have the other endianness.

-(m) Patterns compiled by PCRE can be saved and re-used at a later time, even on
-different hosts that have the other endianness.
+(j) The alternative matching function (pcre_dfa_exec()) matches in a
+different way and is not Perl-compatible.

-Last updated: 09 September 2004
+Last updated: 28 November 2006
-Copyright © 1997-2004 University of Cambridge.
+Copyright © 1997-2006 University of Cambridge.

Return to the PCRE index page.