55 |
page. |
page. |
56 |
</P> |
</P> |
57 |
<P> |
<P> |
58 |
|
The remainder of this document discusses the patterns that are supported by |
59 |
|
PCRE when its main matching function, <b>pcre_exec()</b>, is used. |
60 |
|
From release 6.0, PCRE offers a second matching function, |
61 |
|
<b>pcre_dfa_exec()</b>, which matches using a different algorithm that is not |
62 |
|
Perl-compatible. The advantages and disadvantages of the alternative function, |
63 |
|
and how it differs from the normal function, are discussed in the |
64 |
|
<a href="pcrematching.html"><b>pcrematching</b></a> |
65 |
|
page. |
66 |
|
</P> |
67 |
|
<P> |
68 |
A regular expression is a pattern that is matched against a subject string from |
A regular expression is a pattern that is matched against a subject string from |
69 |
left to right. Most characters stand for themselves in a pattern, and match the |
left to right. Most characters stand for themselves in a pattern, and match the |
70 |
corresponding characters in the subject. As a trivial example, the pattern |
corresponding characters in the subject. As a trivial example, the pattern |
71 |
<pre> |
<pre> |
72 |
The quick brown fox |
The quick brown fox |
73 |
</pre> |
</pre> |
74 |
matches a portion of a subject string that is identical to itself. The power of |
matches a portion of a subject string that is identical to itself. When |
75 |
regular expressions comes from the ability to include alternatives and |
caseless matching is specified (the PCRE_CASELESS option), letters are matched |
76 |
repetitions in the pattern. These are encoded in the pattern by the use of |
independently of case. In UTF-8 mode, PCRE always understands the concept of |
77 |
|
case for characters whose values are less than 128, so caseless matching is |
78 |
|
always possible. For characters with higher values, the concept of case is |
79 |
|
supported if PCRE is compiled with Unicode property support, but not otherwise. |
80 |
|
If you want to use caseless matching for characters 128 and above, you must |
81 |
|
ensure that PCRE is compiled with Unicode property support as well as with |
82 |
|
UTF-8 support. |
83 |
|
</P> |
84 |
|
<P> |
85 |
|
The power of regular expressions comes from the ability to include alternatives |
86 |
|
and repetitions in the pattern. These are encoded in the pattern by the use of |
87 |
<i>metacharacters</i>, which do not stand for themselves but instead are |
<i>metacharacters</i>, which do not stand for themselves but instead are |
88 |
interpreted in some special way. |
interpreted in some special way. |
89 |
</P> |
</P> |
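<P>
As a rough illustration of the paragraphs above (literal matching, caseless
matching, and the alternation and repetition metacharacters), the following
sketch uses Python's <b>re</b> module rather than PCRE itself. Python's syntax
for these basic constructs is Perl-like, but it is a different engine, so treat
this as an analogy only. The helper name <b>find</b> is hypothetical, and
<b>re.IGNORECASE</b> merely plays the role that PCRE_CASELESS plays for
<b>pcre_exec()</b>.
</P>

```python
import re


def find(pattern, subject, caseless=False):
    """Return the first portion of subject that matches pattern, or None.

    Hypothetical helper for illustration; it is not part of PCRE.
    """
    flags = re.IGNORECASE if caseless else 0
    match = re.search(pattern, subject, flags)
    return match.group(0) if match else None


subject = "The quick brown fox"

# A pattern without metacharacters matches an identical portion of the subject.
print(find("quick brown fox", subject))

# Caseful matching fails on a case mismatch; caseless matching succeeds.
print(find("QUICK", subject))
print(find("QUICK", subject, caseless=True))

# Metacharacters: "|" separates alternatives, "+" repeats the previous item.
print(find("cat|fox", subject))
print(find("o+", "brooown"))
```

<P>
The first three calls show that ordinary characters stand for themselves and
that case only matters when caseless matching is off. The last two show
characters that are interpreted specially instead of being matched literally.
</P>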
556 |
When caseless matching is set, any letters in a class represent both their |
When caseless matching is set, any letters in a class represent both their |
557 |
upper case and lower case versions, so for example, a caseless [aeiou] matches |
upper case and lower case versions, so for example, a caseless [aeiou] matches |
558 |
"A" as well as "a", and a caseless [^aeiou] does not match "A", whereas a |
"A" as well as "a", and a caseless [^aeiou] does not match "A", whereas a |
559 |
caseful version would. When running in UTF-8 mode, PCRE supports the concept of |
caseful version would. In UTF-8 mode, PCRE always understands the concept of |
560 |
case for characters with values greater than 128 only when it is compiled with |
case for characters whose values are less than 128, so caseless matching is |
561 |
Unicode property support. |
always possible. For characters with higher values, the concept of case is |
562 |
|
supported if PCRE is compiled with Unicode property support, but not otherwise. |
563 |
|
If you want to use caseless matching for characters 128 and above, you must |
564 |
|
ensure that PCRE is compiled with Unicode property support as well as with |
565 |
|
UTF-8 support. |
566 |
</P> |
</P> |
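<P>
The behaviour of caseless character classes described above can be sketched as
follows. This again uses Python's <b>re</b> module as a stand-in, not PCRE, and
the helper <b>class_matches</b> is hypothetical. The <b>ascii_only</b> switch
maps to Python's <b>re.ASCII</b> flag, which is assumed here to be only a loose
analogue of a PCRE build without Unicode property support: case is understood
for characters below 128 but not for higher values.
</P>

```python
import re


def class_matches(char_class, ch, caseless=False, ascii_only=False):
    """Return True if the single character ch matches char_class.

    Hypothetical helper for illustration; it is not part of PCRE.
    """
    flags = 0
    if caseless:
        flags |= re.IGNORECASE
    if ascii_only:
        # Assumption: re.ASCII only approximates an engine that knows about
        # case for characters below 128; it is not a PCRE option.
        flags |= re.ASCII
    return re.fullmatch(char_class, ch, flags) is not None


# A caseless [aeiou] matches "A" as well as "a".
print(class_matches("[aeiou]", "A", caseless=True))   # True
# A caseless [^aeiou] does not match "A", whereas a caseful version would.
print(class_matches("[^aeiou]", "A", caseless=True))  # False
print(class_matches("[^aeiou]", "A"))                 # True

# U+00E9 and U+00C9 are the lower and upper case forms of e-acute (above 127).
print(class_matches("[\u00e9]", "\u00c9", caseless=True))
print(class_matches("[\u00e9]", "\u00c9", caseless=True, ascii_only=True))
```

<P>
The last two calls differ only in whether case is understood for characters
with values of 128 and above, which is the distinction the paragraph above
draws for UTF-8 mode.
</P>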
567 |
<P> |
<P> |
568 |
The newline character is never treated in any special way in character classes, |
The newline character is never treated in any special way in character classes, |
1486 |
documentation. |
documentation. |
1487 |
</P> |
</P> |
1488 |
<P> |
<P> |
1489 |
Last updated: 09 September 2004 |
Last updated: 28 February 2005 |
1490 |
<br> |
<br> |
1491 |
Copyright © 1997-2004 University of Cambridge. |
Copyright © 1997-2005 University of Cambridge. |
1492 |
<p> |
<p> |
1493 |
Return to the <a href="index.html">PCRE index page</a>. |
Return to the <a href="index.html">PCRE index page</a>. |
1494 |
</p> |
</p> |