3047 |
Unicode character properties |
Unicode character properties |
3048 |
|
|
3049 |
When PCRE is built with Unicode character property support, three addi- |
When PCRE is built with Unicode character property support, three addi- |
3050 |
tional escape sequences to match character properties are available |
tional escape sequences that match characters with specific properties |
3051 |
when UTF-8 mode is selected. They are: |
are available. When not in UTF-8 mode, these sequences are of course |
3052 |
|
limited to testing characters whose codepoints are less than 256, but |
3053 |
|
they do work in this mode. The extra escape sequences are: |
3054 |
|
|
3055 |
\p{xx} a character with the xx property |
\p{xx} a character with the xx property |
3056 |
\P{xx} a character without the xx property |
\P{xx} a character without the xx property |
3164 |
That is, it matches a character without the "mark" property, followed |
That is, it matches a character without the "mark" property, followed |
3165 |
by zero or more characters with the "mark" property, and treats the |
by zero or more characters with the "mark" property, and treats the |
3166 |
sequence as an atomic group (see below). Characters with the "mark" |
sequence as an atomic group (see below). Characters with the "mark" |
3167 |
property are typically accents that affect the preceding character. |
property are typically accents that affect the preceding character. |
3168 |
|
None of them have codepoints less than 256, so in non-UTF-8 mode \X |
3169 |
|
matches any one character. |
3170 |
|
|
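The \X behaviour described above (one character without the "mark" property, then as many characters with it as follow, with nothing given back) can be sketched outside PCRE. The following is only an illustration, not PCRE's implementation: it uses Python's standard unicodedata module, and the helper names is_mark and match_extended are hypothetical. It assumes that the "mark" property corresponds to the Unicode general categories Mn, Mc and Me.

```python
import unicodedata


def is_mark(ch):
    # Assumption: "mark" means general category Mn, Mc or Me,
    # all of which begin with the letter "M".
    return unicodedata.category(ch).startswith("M")


def match_extended(text, pos=0):
    # Hypothetical helper mimicking \X at a fixed position.
    # Returns the end index of the match, or None when no match is possible.
    if pos >= len(text) or is_mark(text[pos]):
        # \X needs a leading character without the "mark" property.
        return None
    end = pos + 1
    # Consume every following mark; like an atomic group, none are given back.
    while end < len(text) and is_mark(text[end]):
        end += 1
    return end


# "e" followed by U+0301 (combining acute accent) is taken as one unit.
sample = "e\u0301x"
first_end = match_extended(sample, 0)
```

In this sketch no character with a codepoint below 256 is classified as a mark, so on such input every match is exactly one character long, which is consistent with the remark about non-UTF-8 mode.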
3171 |
Matching characters by Unicode property is not fast, because PCRE has |
Matching characters by Unicode property is not fast, because PCRE has |
3172 |
to search a structure that contains data for over fifteen thousand |
to search a structure that contains data for over fifteen thousand |
4543 |
|
|
4544 |
REVISION |
REVISION |
4545 |
|
|
4546 |
Last updated: 13 June 2007 |
Last updated: 19 June 2007 |
4547 |
Copyright (c) 1997-2007 University of Cambridge. |
Copyright (c) 1997-2007 University of Cambridge. |
4548 |
------------------------------------------------------------------------------ |
------------------------------------------------------------------------------ |
4549 |
|
|