--- code/trunk/doc/html/pcrepattern.html 2007/09/10 13:23:56 230
+++ code/trunk/doc/html/pcrepattern.html 2007/09/11 11:15:33 231
@@ -105,7 +105,15 @@
 changes the convention to CR. That pattern matches "a\nb" because LF is no
 longer a newline. Note that these special settings, which are not
 Perl-compatible, are recognized only at the very start of a pattern, and that
-they must be in upper case.
+they must be in upper case. If more than one of them is present, the last one
+is used.
+
+
+The newline convention does not affect what the \R escape sequence matches. By
+default, this is any Unicode newline sequence, for Perl compatibility. However,
+this can be changed; see the description of \R in the section entitled
+"Newline sequences"
+below.
@@ -391,14 +399,14 @@
 or "french" in Windows, some character codes greater than 128 are used for
 accented letters, and these are matched by \w. The use of locales with Unicode
 is discouraged.
-
+
-Outside a character class, the escape sequence \R matches any Unicode newline
-sequence. This is a Perl 5.10 feature. In non-UTF-8 mode \R is equivalent to
-the following:
+Outside a character class, by default, the escape sequence \R matches any
+Unicode newline sequence. This is a Perl 5.10 feature. In non-UTF-8 mode \R is
+equivalent to the following:
 (?>\r\n|\n|\x0b|\f|\r|\x85)
@@ -417,6 +425,23 @@
 recognized.
+It is possible to restrict \R to match only CR, LF, or CRLF (instead of the
+complete set of Unicode line endings) by setting the option PCRE_BSR_ANYCRLF
+either at compile time or when the pattern is matched. This can be made the
+default when PCRE is built; if this is the case, the other behaviour can be
+requested via the PCRE_BSR_UNICODE option. It is also possible to specify these
+settings by starting a pattern string with one of the following sequences:
+
+ (*BSR_ANYCRLF) CR, LF, or CRLF only
+ (*BSR_UNICODE) any Unicode newline sequence
+
+These override the default and the options given to pcre_compile(), but
+they can be overridden by options given to pcre_exec(). Note that these
+special settings, which are not Perl-compatible, are recognized only at the
+very start of a pattern, and that they must be in upper case. If more than one
+of them is present, the last one is used.
+
+
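As an illustration only, the rule stated in the added text (settings are recognized only at the very start of a pattern, only in upper case, and the last one wins) can be sketched in Python. The helper name `leading_bsr` and its default value are assumptions made for this example and are not part of PCRE. The sketch also ignores the newline-convention settings and any options passed to pcre_compile() or pcre_exec().

```python
import re

# Hypothetical helper, not PCRE code: it only models the documented rule for
# the two \R settings that may appear at the very start of a pattern string.
_LEADING_BSR = re.compile(r"\(\*(BSR_ANYCRLF|BSR_UNICODE)\)")


def leading_bsr(pattern, default="BSR_UNICODE"):
    """Return (effective_setting, remaining_pattern).

    Only upper-case settings at the very start are consumed. If several
    are present, the last one is the one that is kept.
    """
    setting = default
    pos = 0
    while True:
        match = _LEADING_BSR.match(pattern, pos)
        if match is None:
            break
        setting = match.group(1)  # a later setting replaces an earlier one
        pos = match.end()
    return setting, pattern[pos:]


# Two settings at the start: the last one is used.
both = leading_bsr(r"(*BSR_ANYCRLF)(*BSR_UNICODE)a\Rb")
# Lower case is not recognized, so the text stays part of the pattern.
lower = leading_bsr(r"(*bsr_anycrlf)a\Rb")
```

The default of "BSR_UNICODE" mirrors the documented out-of-the-box behaviour; a PCRE build configured otherwise would correspond to passing a different `default`.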
Inside a character class, \R matches the letter "R".
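To make the difference between the two \R settings concrete, here is a rough Python sketch for non-UTF-8 style subjects. Python's re module has no \R, so both sets are written out by hand. The names `BSR_UNICODE_SET`, `BSR_ANYCRLF_SET`, and `split_on_bsr` are assumptions for this example, and a plain non-capturing group is used in place of the atomic group shown in the documentation, which is adequate for splitting but is not an exact equivalent.

```python
import re

# Hand-written approximations of what \R matches; not produced by PCRE.
# "\r\n" comes first so that CRLF is treated as a single newline sequence.
BSR_UNICODE_SET = r"(?:\r\n|\n|\x0b|\f|\r|\x85)"  # default (non-UTF-8 mode)
BSR_ANYCRLF_SET = r"(?:\r\n|\n|\r)"               # PCRE_BSR_ANYCRLF behaviour


def split_on_bsr(text, bsr_set):
    """Split text at every newline sequence in the given set."""
    return re.split(bsr_set, text)


sample = "a\r\nb\x0bc\x85d\ne"

# Every separator in the sample counts as a newline.
unicode_parts = split_on_bsr(sample, BSR_UNICODE_SET)

# Only CRLF and LF count; VT (\x0b) and NEL (\x85) stay inside the field.
anycrlf_parts = split_on_bsr(sample, BSR_ANYCRLF_SET)
```

With the sample above, the first call yields five fields and the second yields three, because the vertical tab and next-line characters are no longer treated as newlines.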
-Last updated: 21 August 2007
+Last updated: 11 September 2007
Copyright © 1997-2007 University of Cambridge.