24 |
<P> |
<P> |
25 |
The PCRE library is a set of functions that implement regular expression |
The PCRE library is a set of functions that implement regular expression |
26 |
pattern matching using the same syntax and semantics as Perl, with just a few |
pattern matching using the same syntax and semantics as Perl, with just a few |
27 |
differences. (Certain features that appeared in Python and PCRE before they |
differences. Certain features that appeared in Python and PCRE before they |
28 |
appeared in Perl are also available using the Python syntax.) |
appeared in Perl are also available using the Python syntax. There is also some |
29 |
|
support for certain .NET and Oniguruma syntax items, and there is an option for |
30 |
|
requesting some minor changes that give better JavaScript compatibility. |
31 |
</P> |
</P> |
32 |
<P> |
<P> |
33 |
The current implementation of PCRE (release 7.x) corresponds approximately with |
The current implementation of PCRE (release 7.x) corresponds approximately with |
34 |
Perl 5.10, including support for UTF-8 encoded strings and Unicode general |
Perl 5.10, including support for UTF-8 encoded strings and Unicode general |
35 |
category properties. However, UTF-8 and Unicode support has to be explicitly |
category properties. However, UTF-8 and Unicode support has to be explicitly |
36 |
enabled; it is not the default. The Unicode tables correspond to Unicode |
enabled; it is not the default. The Unicode tables correspond to Unicode |
37 |
release 5.0.0. |
release 5.1. |
38 |
</P> |
</P> |
39 |
<P> |
<P> |
40 |
In addition to the Perl-compatible matching function, PCRE contains an |
In addition to the Perl-compatible matching function, PCRE contains an |
160 |
In order to process UTF-8 strings, you must build PCRE to include UTF-8 support in |
In order to process UTF-8 strings, you must build PCRE to include UTF-8 support in |
161 |
the code, and, in addition, you must call |
the code, and, in addition, you must call |
162 |
<a href="pcre_compile.html"><b>pcre_compile()</b></a> |
<a href="pcre_compile.html"><b>pcre_compile()</b></a> |
163 |
with the PCRE_UTF8 option flag. When you do this, both the pattern and any |
with the PCRE_UTF8 option flag, or the pattern must start with the sequence |
164 |
subject strings that are matched against it are treated as UTF-8 strings |
(*UTF8). When either of these is the case, both the pattern and any subject |
165 |
instead of just strings of bytes. |
strings that are matched against it are treated as UTF-8 strings instead of |
166 |
|
just strings of bytes. |
167 |
</P> |
</P> |
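The rule in the paragraph above can be sketched in plain C. This is only an illustration and does not call PCRE: the names `DEMO_OPT_UTF8`, `demo_utf8_requested`, and `demo_count_units` are hypothetical, and the flag value is not the real value of PCRE_UTF8. The first function models the two ways of requesting UTF-8 mode (an option flag, or a pattern that starts with (*UTF8)). The second shows what changes once a string is treated as UTF-8 instead of as bytes, assuming the input is well-formed UTF-8.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for an option bit; NOT the real PCRE_UTF8 value. */
#define DEMO_OPT_UTF8 0x1

/* Returns 1 if UTF-8 mode is requested, either by the option flag or by
   a pattern that begins with the sequence (*UTF8); returns 0 otherwise. */
int demo_utf8_requested(const char *pattern, int options)
{
    static const char prefix[] = "(*UTF8)";

    if (options & DEMO_OPT_UTF8)
        return 1;
    return strncmp(pattern, prefix, sizeof prefix - 1) == 0;
}

/* Counts the units in a string: every byte in byte mode, or every
   character (code point) in UTF-8 mode. Assumes well-formed UTF-8, in
   which continuation bytes always have the bit pattern 10xxxxxx. */
size_t demo_count_units(const char *s, int utf8)
{
    size_t n = 0;

    for (; *s != '\0'; s++) {
        /* In UTF-8 mode, skip continuation bytes so that each
           multi-byte character is counted once. */
        if (!utf8 || ((unsigned char)*s & 0xC0) != 0x80)
            n++;
    }
    return n;
}
```

For the subject "caf\xC3\xA9" (the word "café" encoded as UTF-8), `demo_count_units` reports 5 units in byte mode and 4 in UTF-8 mode, which is the difference between matching against bytes and matching against characters.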
168 |
<P> |
<P> |
169 |
If you compile PCRE with UTF-8 support, but do not use it at run time, the |
If you compile PCRE with UTF-8 support, but do not use it at run time, the |
259 |
values less than 256. This remains true even when PCRE includes Unicode |
values less than 256. This remains true even when PCRE includes Unicode |
260 |
property support, because to do otherwise would slow down PCRE in many common |
property support, because to do otherwise would slow down PCRE in many common |
261 |
cases. If you really want to test for a wider sense of, say, "digit", you |
cases. If you really want to test for a wider sense of, say, "digit", you |
262 |
must use Unicode property tests such as \p{Nd}. |
must use Unicode property tests such as \p{Nd}. Note that this also applies to |
263 |
|
\b, because it is defined in terms of \w and \W. |
264 |
</P> |
</P> |
265 |
<P> |
<P> |
266 |
7. Similarly, characters that match the POSIX named character classes are all |
7. Similarly, characters that match the POSIX named character classes are all |
297 |
</P> |
</P> |
298 |
<br><a name="SEC6" href="#TOC1">REVISION</a><br> |
<br><a name="SEC6" href="#TOC1">REVISION</a><br> |
299 |
<P> |
<P> |
300 |
Last updated: 09 August 2007 |
Last updated: 11 April 2009 |
301 |
<br> |
<br> |
302 |
Copyright © 1997-2007 University of Cambridge. |
Copyright © 1997-2009 University of Cambridge. |
303 |
<br> |
<br> |
304 |
<p> |
<p> |
305 |
Return to the <a href="index.html">PCRE index page</a>. |
Return to the <a href="index.html">PCRE index page</a>. |