79
changes the convention to CR. That pattern matches "a\enb" because LF is no
changes the convention to CR. That pattern matches "a\enb" because LF is no
80
longer a newline. Note that these special settings, which are not
longer a newline. Note that these special settings, which are not
81
Perl-compatible, are recognized only at the very start of a pattern, and that
Perl-compatible, are recognized only at the very start of a pattern, and that
82
they must be in upper case.
they must be in upper case. If more than one of them is present, the last one
83

is used.
84

.P
85

The newline convention does not affect what the \eR escape sequence matches. By
86

default, this is any Unicode newline sequence, for Perl compatibility. However,
87

this can be changed; see the description of \eR in the section entitled
88

.\" HTML <a href="#newlineseq">
89

.\" </a>
90

"Newline sequences"
91

.\"
92

below.
93 |
. |
. |
94 |
. |
. |
95 |
.SH "CHARACTERS AND METACHARACTERS" |
.SH "CHARACTERS AND METACHARACTERS" |
398 |
is discouraged. |
is discouraged. |
399 |
. |
. |
400 |
. |
. |
401 |
|
.\" HTML <a name="newlineseq"></a> |
402 |
.SS "Newline sequences" |
.SS "Newline sequences" |
403 |
.rs |
.rs |
404 |
.sp |
.sp |
405 |
Outside a character class, the escape sequence \eR matches any Unicode newline |
Outside a character class, by default, the escape sequence \eR matches any |
406 |
sequence. This is a Perl 5.10 feature. In non-UTF-8 mode \eR is equivalent to |
Unicode newline sequence. This is a Perl 5.10 feature. In non-UTF-8 mode \eR is |
407 |
the following: |
equivalent to the following: |
408 |
.sp |
.sp |
409 |
(?>\er\en|\en|\ex0b|\ef|\er|\ex85) |
(?>\er\en|\en|\ex0b|\ef|\er|\ex85) |
410 |
.sp |
.sp |
424 |
Unicode character property support is not needed for these characters to be |
Unicode character property support is not needed for these characters to be |
425 |
recognized. |
recognized. |
426 |
.P |
.P |
427 |
|
It is possible to restrict \eR to match only CR, LF, or CRLF (instead of the |
428 |
|
complete set of Unicode line endings) by setting the option PCRE_BSR_ANYCRLF |
429 |
|
either at compile time or when the pattern is matched. This can be made the |
430 |
|
default when PCRE is built; if this is the case, the other behaviour can be |
431 |
|
requested via the PCRE_BSR_UNICODE option. It is also possible to specify these |
432 |
|
settings by starting a pattern string with one of the following sequences: |
433 |
|
.sp |
434 |
|
(*BSR_ANYCRLF) CR, LF, or CRLF only |
435 |
|
(*BSR_UNICODE) any Unicode newline sequence |
436 |
|
.sp |
437 |
|
These override the default and the options given to \fBpcre_compile()\fP, but |
438 |
|
they can be overridden by options given to \fBpcre_exec()\fP. Note that these |
439 |
|
special settings, which are not Perl-compatible, are recognized only at the |
440 |
|
very start of a pattern, and that they must be in upper case. If more than one |
441 |
|
of them is present, the last one is used. |
442 |
|
.P |
443 |
Inside a character class, \eR matches the letter "R". |
Inside a character class, \eR matches the letter "R". |
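The two \eR settings described in the added text can be illustrated with a rough emulation. This is only a sketch, not PCRE itself: Python's re module has no \eR escape and does not recognize (*BSR_ANYCRLF) or (*BSR_UNICODE), so the constant names and the helper function below are hypothetical. The alternations also use ordinary non-capturing groups rather than the atomic group shown above, so they can backtrack where PCRE's \eR would not; the whole-string check used here is not affected by that difference.

```python
import re

# Hypothetical names: emulations of what \R accepts under each setting.
# BSR_ANYCRLF restricts \R to CR, LF, or CRLF.
BSR_ANYCRLF = r"(?:\r\n|\n|\r)"
# Non-UTF-8 equivalent of the default (any Unicode newline sequence).
BSR_UNICODE_8BIT = r"(?:\r\n|\n|\x0b|\f|\r|\x85)"


def is_newline_sequence(text, alternation):
    """Return True if text consists of exactly one newline sequence."""
    return re.fullmatch(alternation, text) is not None


# Compare the two settings on each candidate sequence.
for candidate in ["\r\n", "\n", "\r", "\x0b", "\f", "\x85"]:
    print(repr(candidate),
          is_newline_sequence(candidate, BSR_ANYCRLF),
          is_newline_sequence(candidate, BSR_UNICODE_8BIT))
```

Under these assumptions, VT, FF, and NEL are accepted only by the second alternation, which mirrors the restriction that PCRE_BSR_ANYCRLF is described as imposing.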
444 |
. |
. |
445 |
. |
. |
987 |
.rs |
.rs |
988 |
.sp |
.sp |
989 |
The settings of the PCRE_CASELESS, PCRE_MULTILINE, PCRE_DOTALL, and |
The settings of the PCRE_CASELESS, PCRE_MULTILINE, PCRE_DOTALL, and |
990 |
PCRE_EXTENDED options can be changed from within the pattern by a sequence of |
PCRE_EXTENDED options (which are Perl-compatible) can be changed from within |
991 |
Perl option letters enclosed between "(?" and ")". The option letters are |
the pattern by a sequence of Perl option letters enclosed between "(?" and ")". |
992 |
|
The option letters are |
993 |
.sp |
.sp |
994 |
i for PCRE_CASELESS |
i for PCRE_CASELESS |
995 |
m for PCRE_MULTILINE |
m for PCRE_MULTILINE |
1003 |
permitted. If a letter appears both before and after the hyphen, the option is |
permitted. If a letter appears both before and after the hyphen, the option is |
1004 |
unset. |
unset. |
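As a loose illustration of the Perl option letters, the sketch below uses Python's re module, which accepts the same i and m letters between "(?" and ")". It is an analogy rather than PCRE behaviour: Python requires global option letters at the very start of the pattern and only allows an option to be unset in the scoped form "(?-i:...)", and it does not support the PCRE-specific letters J, U and X. The variable names are illustrative.

```python
import re

# Global option letters must come first in a Python pattern.
caseless = re.compile(r"(?i)pcre")
multiline = re.compile(r"(?m)^second$")

# Scoped form: caseless matching is switched off for the "b" only.
mixed = re.compile(r"(?i)a(?-i:b)c")

print(bool(caseless.fullmatch("PcRe")))
print(bool(multiline.search("first\nsecond")))
print(bool(mixed.fullmatch("AbC")))
print(bool(mixed.fullmatch("ABC")))
```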
1005 |
.P |
.P |
1006 |
|
The PCRE-specific options PCRE_DUPNAMES, PCRE_UNGREEDY, and PCRE_EXTRA can be |
1007 |
|
changed in the same way as the Perl-compatible options by using the characters |
1008 |
|
J, U and X respectively. |
1009 |
|
.P |
1010 |
When an option change occurs at top level (that is, not inside subpattern |
When an option change occurs at top level (that is, not inside subpattern |
1011 |
parentheses), the change applies to the remainder of the pattern that follows. |
parentheses), the change applies to the remainder of the pattern that follows. |
1012 |
If the change is placed right at the start of a pattern, PCRE extracts it into |
If the change is placed right at the start of a pattern, PCRE extracts it into |
1029 |
branch is abandoned before the option setting. This is because the effects of |
branch is abandoned before the option setting. This is because the effects of |
1030 |
option settings happen at compile time. There would be some very weird |
option settings happen at compile time. There would be some very weird |
1031 |
behaviour otherwise. |
behaviour otherwise. |
|
.P |
|
|
The PCRE-specific options PCRE_DUPNAMES, PCRE_UNGREEDY, and PCRE_EXTRA can be |
|
|
changed in the same way as the Perl-compatible options by using the characters |
|
|
J, U and X respectively. |
|
1032 |
. |
. |
1033 |
. |
. |
1034 |
.\" HTML <a name="subpattern"></a> |
.\" HTML <a name="subpattern"></a> |
2177 |
.rs |
.rs |
2178 |
.sp |
.sp |
2179 |
.nf |
.nf |
2180 |
Last updated: 21 August 2007 |
Last updated: 11 September 2007 |
2181 |
Copyright (c) 1997-2007 University of Cambridge. |
Copyright (c) 1997-2007 University of Cambridge. |
2182 |
.fi |
.fi |