--- code/trunk/doc/html/pcrecompat.html 2007/02/24 21:41:21 87
+++ code/trunk/doc/html/pcrecompat.html 2007/02/24 21:41:42 93
@@ -17,8 +17,9 @@

This document describes the differences in the ways that PCRE and Perl handle
-regular expressions. The differences described here are with respect to Perl
-5.8.
+regular expressions. The differences described here are mainly with respect to
+Perl 5.8, though PCRE version 7.0 contains some features that are expected to
+be in the forthcoming Perl 5.10.

1. PCRE has only a subset of Perl's UTF-8 and Unicode support. Details of what
@@ -76,20 +77,28 @@
The \Q...\E sequence is recognized both inside and outside character classes.

-8. Fairly obviously, PCRE does not support the (?{code}) and (?p{code})
-constructions. However, there is support for recursive patterns using the
-non-Perl items (?R), (?number), and (?P>name). Also, the PCRE "callout" feature
-allows an external function to be called during pattern matching. See the
+8. Fairly obviously, PCRE does not support the (?{code}) and (??{code})
+constructions. However, there is support for recursive patterns. This is not
+available in Perl 5.8, but will be in Perl 5.10. Also, the PCRE "callout"
+feature allows an external function to be called during pattern matching. See
+the
pcrecallout documentation for details.

-9. There are some differences that are concerned with the settings of captured
+9. Subpatterns that are called recursively or as "subroutines" are always
+treated as atomic groups in PCRE. This is like Python, but unlike Perl.
+

+

+10. There are some differences that are concerned with the settings of captured
strings when part of a pattern is repeated. For example, matching "aba" against the pattern /^(a(b)?)+$/ in Perl leaves $2 unset, but in PCRE it is set to "b".

-10. PCRE provides some extensions to the Perl regular expression facilities:
+11. PCRE provides some extensions to the Perl regular expression facilities.
+Perl 5.10 will include new features that are not in earlier versions, some of
+which (such as named parentheses) have been in PCRE for some time. This list is
+with respect to Perl 5.10:

(a) Although lookbehind assertions must match fixed length strings, each
@@ -102,7 +111,8 @@

(c) If PCRE_EXTRA is set, a backslash followed by a letter with no special
-meaning is faulted.
+meaning is faulted. Otherwise, like Perl, the backslash is ignored. (Perl can
+be made to issue a warning.)

(d) If PCRE_UNGREEDY is set, the greediness of the repetition quantifiers is
@@ -118,36 +128,21 @@
options for pcre_exec() have no Perl equivalents.

-(g) The (?R), (?number), and (?P>name) constructs allows for recursive pattern
-matching (Perl can do this using the (?p{code}) construct, which PCRE cannot
-support.)
-
-
-(h) PCRE supports named capturing substrings, using the Python syntax.
-
-
-(i) PCRE supports the possessive quantifier "++" syntax, taken from Sun's Java
-package.
-
-
-(j) The (R) condition, for testing recursion, is a PCRE extension.
-
-
-(k) The callout facility is PCRE-specific.
+(g) The callout facility is PCRE-specific.

-(l) The partial matching facility is PCRE-specific.
+(h) The partial matching facility is PCRE-specific.

-(m) Patterns compiled by PCRE can be saved and re-used at a later time, even on
+(i) Patterns compiled by PCRE can be saved and re-used at a later time, even on
different hosts that have the other endianness.

-(n) The alternative matching function (pcre_dfa_exec()) matches in a
+(j) The alternative matching function (pcre_dfa_exec()) matches in a
different way and is not Perl-compatible.

-Last updated: 24 January 2006
+Last updated: 28 November 2006
Copyright © 1997-2006 University of Cambridge.
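
Note on the renumbered item 10 (captured strings under repetition): the pattern /^(a(b)?)+$/ against "aba" is a convenient probe for checking how any given engine treats a capture left over from an earlier iteration. The sketch below is only an illustration and makes some assumptions: it uses Python's standard `re` module as a stand-in engine (it is neither PCRE nor Perl), and the helper name `captured_groups` is hypothetical. It prints the groups so that the value of the second group can be compared against the Perl and PCRE behaviour described in the text.

```python
import re

# Pattern and subject are the ones quoted in item 10 of the document.
PATTERN = r"^(a(b)?)+$"
SUBJECT = "aba"


def captured_groups(pattern, subject):
    """Return the tuple of captured groups, or None if there is no match.

    Hypothetical helper for illustration only. Python's re engine is used
    here as a stand-in; it is not PCRE and it is not Perl.
    """
    match = re.match(pattern, subject)
    if match is None:
        return None
    return match.groups()


if __name__ == "__main__":
    groups = captured_groups(PATTERN, SUBJECT)
    # Group 1 comes from the final iteration of the outer repeat.
    # Group 2 is the interesting one: "b" was captured on the first
    # iteration only, so an engine either keeps it or reports it unset.
    print("group 1:", groups[0])
    print("group 2:", groups[1])
```

Running the same pattern through `pcretest` and through `perl -e` would be the direct way to confirm the difference the document describes; the Python run only shows how such a probe can be set up.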