765 |
50 [this code is not in use] |
50 [this code is not in use] |
766 |
51 octal value is greater than \e377 (not in UTF-8 mode) |
51 octal value is greater than \e377 (not in UTF-8 mode) |
767 |
52 internal error: overran compiling workspace |
52 internal error: overran compiling workspace |
768 |
53 internal error: previously-checked referenced subpattern not found |
53 internal error: previously-checked referenced subpattern |
769 |
|
not found |
770 |
54 DEFINE group contains more than one branch |
54 DEFINE group contains more than one branch |
771 |
55 repeating a DEFINE group is not allowed |
55 repeating a DEFINE group is not allowed |
772 |
56 inconsistent NEWLINE options |
56 inconsistent NEWLINE options |
779 |
62 subpattern name expected |
62 subpattern name expected |
780 |
63 digit expected after (?+ |
63 digit expected after (?+ |
781 |
64 ] is an invalid data character in JavaScript compatibility mode |
64 ] is an invalid data character in JavaScript compatibility mode |
782 |
65 different names for subpatterns of the same number are not allowed |
65 different names for subpatterns of the same number are |
783 |
|
not allowed |
784 |
66 (*MARK) must have an argument |
66 (*MARK) must have an argument |
785 |
67 this version of PCRE is not compiled with PCRE_UCP support |
67 this version of PCRE is not compiled with PCRE_UCP support |
786 |
.sp |
.sp |
1450 |
for that character, and fails immediately if it cannot find it, without |
for that character, and fails immediately if it cannot find it, without |
1451 |
actually running the main matching function. This means that a special item |
actually running the main matching function. This means that a special item |
1452 |
such as (*COMMIT) at the start of a pattern is not considered until after a |
such as (*COMMIT) at the start of a pattern is not considered until after a |
1453 |
suitable starting point for the match has been found. When callouts are in use, |
suitable starting point for the match has been found. When callouts or (*MARK) |
1454 |
these "start-up" optimizations can cause them to be skipped if the pattern is |
items are in use, these "start-up" optimizations can cause them to be skipped |
1455 |
never actually used. The PCRE_NO_START_OPTIMIZE option disables the start-up |
if the pattern is never actually used. The start-up optimizations are in effect |
1456 |
optimizations, causing performance to suffer, but ensuring that the callouts do |
a pre-scan of the subject that takes place before the pattern is run. |
1457 |
occur, and that items such as (*COMMIT) are considered at every possible |
.P |
1458 |
starting position in the subject string. |
The PCRE_NO_START_OPTIMIZE option disables the start-up optimizations, possibly |
1459 |
|
causing performance to suffer, but ensuring that in cases where the result is |
1460 |
|
"no match", the callouts do occur, and that items such as (*COMMIT) and (*MARK) |
1461 |
|
are considered at every possible starting position in the subject string. |
1462 |
|
Setting PCRE_NO_START_OPTIMIZE can change the outcome of a matching operation. |
1463 |
|
Consider the pattern |
1464 |
|
.sp |
1465 |
|
(*COMMIT)ABC |
1466 |
|
.sp |
1467 |
|
When this is compiled, PCRE records the fact that a match must start with the |
1468 |
|
character "A". Suppose the subject string is "DEFABC". The start-up |
1469 |
|
optimization scans along the subject, finds "A" and runs the first match |
1470 |
|
attempt from there. The (*COMMIT) item means that the pattern must match at the |
1471 |
|
current starting position, which in this case, it does. However, if the same |
1472 |
|
match is run with PCRE_NO_START_OPTIMIZE set, the initial scan along the |
1473 |
|
subject string does not happen. The first match attempt is run starting from |
1474 |
|
"D" and when this fails, (*COMMIT) prevents any further matches being tried, so |
1475 |
|
the overall result is "no match". If the pattern is studied, more start-up |
1476 |
|
optimizations may be used. For example, a minimum length for the subject may be |
1477 |
|
recorded. Consider the pattern |
1478 |
|
.sp |
1479 |
|
(*MARK:A)(X|Y) |
1480 |
|
.sp |
1481 |
|
The minimum length for a match is one character. If the subject is "ABC", there |
1482 |
|
will be attempts to match "ABC", "BC", "C", and then finally an empty string. |
1483 |
|
If the pattern is studied, the final attempt does not take place, because PCRE |
1484 |
|
knows that the subject is too short, and so the (*MARK) is never encountered. |
1485 |
|
In this case, studying the pattern does not affect the overall match result, |
1486 |
|
which is still "no match", but it does affect the auxiliary information that is |
1487 |
|
returned. |
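The (*COMMIT)ABC case described above can be reproduced with a small stand-alone simulation. This is only a sketch and not PCRE's implementation: the function name match_commit_abc and its prescan flag are hypothetical, and the matcher is hard-coded for that single pattern. It models just two behaviours taken from the text: an optional pre-scan for the first character "A", and (*COMMIT) preventing any retry at a later starting position once the first attempt has failed.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical model of matching (*COMMIT)ABC; not part of the PCRE API.
 * Returns the offset of the match in subject, or -1 for "no match".
 * prescan != 0 models the start-up optimization being active;
 * prescan == 0 models PCRE_NO_START_OPTIMIZE being set. */
static int match_commit_abc(const char *subject, int prescan)
{
    size_t start = 0;

    assert(subject != NULL);

    if (prescan) {
        /* Pre-scan: skip ahead to the first "A" without running the matcher. */
        const char *p = strchr(subject, 'A');
        if (p == NULL)
            return -1;          /* no possible starting point at all */
        start = (size_t)(p - subject);
    }

    /* Single match attempt. (*COMMIT) is passed immediately, so a failure
     * here cannot be followed by an attempt at a later starting position. */
    if (strncmp(subject + start, "ABC", 3) == 0)
        return (int)start;
    return -1;
}
```

Under these assumptions, calling match_commit_abc("DEFABC", 1) finds the match at offset 3, whereas match_commit_abc("DEFABC", 0) starts at "D", fails, and reports "no match", mirroring the difference that PCRE_NO_START_OPTIMIZE makes in the text.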
1488 |
.sp |
.sp |
1489 |
PCRE_NO_UTF8_CHECK |
PCRE_NO_UTF8_CHECK |
1490 |
.sp |
.sp |
2168 |
.rs |
.rs |
2169 |
.sp |
.sp |
2170 |
.nf |
.nf |
2171 |
Last updated: 15 June 2010 |
Last updated: 20 June 2010 |
2172 |
Copyright (c) 1997-2010 University of Cambridge. |
Copyright (c) 1997-2010 University of Cambridge. |
2173 |
.fi |
.fi |