3 |
It also checks the non-Perl syntax that PCRE supports (Python, .NET, |
It also checks the non-Perl syntax that PCRE supports (Python, .NET, |
4 |
Oniguruma). Finally, there are some tests where PCRE and Perl differ, |
Oniguruma). Finally, there are some tests where PCRE and Perl differ, |
5 |
either because PCRE can't be compatible, or there is a possible Perl |
either because PCRE can't be compatible, or there is a possible Perl |
6 |
bug. --/ |
bug. |
7 |
|
|
8 |
|
NOTE: This is a non-UTF-8 set of tests. When UTF-8 is needed, use test |
9 |
|
5, and if Unicode Property Support is needed, use test 13. --/ |
10 |
|
|
11 |
/-- Originally, the Perl >= 5.10 things were in here too, but now I have |
/-- Originally, the Perl >= 5.10 things were in here too, but now I have |
12 |
separated many (most?) of them out into test 11. However, there may still |
separated many (most?) of them out into test 11. However, there may still |
10992 |
AC |
AC |
10993 |
No match |
No match |
10994 |
|
|
|
/--- A whole lot of tests of verbs with arguments are here rather than in test |
|
|
11 because Perl doesn't seem to follow its specification entirely |
|
|
correctly. ---/ |
|
|
|
|
|
/--- Perl 5.11 sets $REGERROR on the AC failure case here; PCRE does not. It is |
|
|
not clear how Perl defines "involved in the failure of the match". ---/ |
|
|
|
|
|
/^(A(*THEN:A)B|C(*THEN:B)D)/K |
|
|
AB |
|
|
0: AB |
|
|
1: AB |
|
|
CD |
|
|
0: CD |
|
|
1: CD |
|
|
** Failers |
|
|
No match |
|
|
AC |
|
|
No match |
|
|
CB |
|
|
No match, mark = B |
|
|
|
|
|
/--- Check the use of names for success and failure. PCRE doesn't show these |
|
|
names for success, though Perl does, contrary to its spec. ---/ |
|
|
|
|
|
/^(A(*PRUNE:A)B|C(*PRUNE:B)D)/K |
|
|
AB |
|
|
0: AB |
|
|
1: AB |
|
|
CD |
|
|
0: CD |
|
|
1: CD |
|
|
** Failers |
|
|
No match |
|
|
AC |
|
|
No match, mark = A |
|
|
CB |
|
|
No match, mark = B |
|
|
|
|
|
/--- An empty name does not pass back an empty string. It is the same as if no |
|
|
name were given. ---/ |
|
|
|
|
|
/^(A(*PRUNE:)B|C(*PRUNE:B)D)/K |
|
|
AB |
|
|
0: AB |
|
|
1: AB |
|
|
CD |
|
|
0: CD |
|
|
1: CD |
|
|
|
|
|
/--- PRUNE goes to next bumpalong; COMMIT does not. ---/ |
|
|
|
|
|
/A(*PRUNE:A)B/K |
|
|
ACAB |
|
|
0: AB |
|
|
|
|
|
/(*MARK:A)(*PRUNE:B)(C|X)/KS |
|
|
C |
|
|
0: C |
|
|
1: C |
|
|
MK: A |
|
|
D |
|
|
No match |
|
|
|
|
|
/(*MARK:A)(*PRUNE:B)(C|X)/KSS |
|
|
C |
|
|
0: C |
|
|
1: C |
|
|
MK: A |
|
|
D |
|
|
No match, mark = B |
|
|
|
|
|
/(*MARK:A)(*THEN:B)(C|X)/KS |
|
|
C |
|
|
0: C |
|
|
1: C |
|
|
MK: A |
|
|
D |
|
|
No match |
|
|
|
|
|
/(*MARK:A)(*THEN:B)(C|X)/KSY |
|
|
C |
|
|
0: C |
|
|
1: C |
|
|
MK: A |
|
|
D |
|
|
No match, mark = B |
|
|
|
|
|
/(*MARK:A)(*THEN:B)(C|X)/KSS |
|
|
C |
|
|
0: C |
|
|
1: C |
|
|
MK: A |
|
|
D |
|
|
No match, mark = B |
|
|
|
|
|
/--- This should fail, as the skip causes a bump to offset 3 (the skip) ---/ |
|
|
|
|
|
/A(*MARK:A)A+(*SKIP)(B|Z) | AC/xK |
|
|
AAAC |
|
|
No match |
|
|
|
|
|
/--- Same --/ |
|
|
|
|
|
/A(*MARK:A)A+(*MARK:B)(*SKIP:B)(B|Z) | AC/xK |
|
|
AAAC |
|
|
No match |
|
|
|
|
10995 |
/--- This should fail; the SKIP advances by one, but when we get to AC, the |
/--- This should fail; the SKIP advances by one, but when we get to AC, the |
10996 |
PRUNE kills it. ---/ |
PRUNE kills it. Perl behaves differently. ---/ |
10997 |
|
|
10998 |
/A(*PRUNE:A)A+(*SKIP:A)(B|Z) | AC/xK |
/A(*PRUNE:A)A+(*SKIP:A)(B|Z) | AC/xK |
10999 |
AAAC |
AAAC |
11000 |
No match |
No match, mark = A |
|
|
|
|
/A(*:A)A+(*SKIP)(B|Z) | AC/xK |
|
|
AAAC |
|
|
No match |
|
|
|
|
|
/--- This should fail, as a null name is the same as no name ---/ |
|
|
|
|
|
/A(*MARK:A)A+(*SKIP:)(B|Z) | AC/xK |
|
|
AAAC |
|
|
No match |
|
|
|
|
|
/--- This fails in PCRE, and I think that is in accordance with Perl's |
|
|
documentation, though in Perl it succeeds. ---/ |
|
|
|
|
|
/A(*MARK:A)A+(*SKIP:B)(B|Z) | AAC/xK |
|
|
AAAC |
|
|
No match |
|
11001 |
|
|
11002 |
/--- Mark names can be duplicated ---/ |
/--- Mark names can be duplicated. Perl doesn't give a mark for this one, |
11003 |
|
though PCRE does. ---/ |
11004 |
|
|
|
/A(*:A)B|X(*:A)Y/K |
|
|
AABC |
|
|
0: AB |
|
|
MK: A |
|
|
XXYZ |
|
|
0: XY |
|
|
MK: A |
|
|
|
|
11005 |
/^A(*:A)B|^X(*:A)Y/K |
/^A(*:A)B|^X(*:A)Y/K |
11006 |
** Failers |
** Failers |
11007 |
No match |
No match |
11008 |
XAQQ |
XAQQ |
11009 |
No match, mark = A |
No match, mark = A |
11010 |
|
|
|
/--- A check on what happens after hitting a mark and then bumping along to |
|
|
something that does not even start. Perl reports tags after the failures here, |
|
|
though it does not when the individual letters are made into something |
|
|
more complicated. ---/ |
|
|
|
|
|
/A(*:A)B|XX(*:B)Y/K |
|
|
AABC |
|
|
0: AB |
|
|
MK: A |
|
|
XXYZ |
|
|
0: XXY |
|
|
MK: B |
|
|
** Failers |
|
|
No match |
|
|
XAQQ |
|
|
No match |
|
|
XAQQXZZ |
|
|
No match |
|
|
AXQQQ |
|
|
No match |
|
|
AXXQQQ |
|
|
No match |
|
|
|
|
11011 |
/--- COMMIT at the start of a pattern should be the same as an anchor. Perl |
/--- COMMIT at the start of a pattern should be the same as an anchor. Perl |
11012 |
optimizations defeat this. So does the PCRE optimization unless we disable it |
optimizations defeat this. So does the PCRE optimization unless we disable it |
11013 |
with \Y. ---/ |
with \Y. ---/ |
11020 |
DEFGABC\Y |
DEFGABC\Y |
11021 |
No match |
No match |
11022 |
|
|
|
/--- Repeat some tests with added studying. ---/ |
|
|
|
|
|
/A(*COMMIT)B/+KS |
|
|
ACABX |
|
|
No match |
|
|
|
|
|
/A(*THEN)B|A(*THEN)C/KS |
|
|
AC |
|
|
0: AC |
|
|
|
|
|
/A(*PRUNE)B|A(*PRUNE)C/KS |
|
|
AC |
|
|
No match |
|
|
|
|
|
/^(A(*THEN:A)B|C(*THEN:B)D)/KS |
|
|
AB |
|
|
0: AB |
|
|
1: AB |
|
|
CD |
|
|
0: CD |
|
|
1: CD |
|
|
** Failers |
|
|
No match |
|
|
AC |
|
|
No match |
|
|
CB |
|
|
No match, mark = B |
|
|
|
|
|
/^(A(*PRUNE:A)B|C(*PRUNE:B)D)/KS |
|
|
AB |
|
|
0: AB |
|
|
1: AB |
|
|
CD |
|
|
0: CD |
|
|
1: CD |
|
|
** Failers |
|
|
No match |
|
|
AC |
|
|
No match, mark = A |
|
|
CB |
|
|
No match, mark = B |
|
|
|
|
|
/^(A(*PRUNE:)B|C(*PRUNE:B)D)/KS |
|
|
AB |
|
|
0: AB |
|
|
1: AB |
|
|
CD |
|
|
0: CD |
|
|
1: CD |
|
|
|
|
|
/A(*PRUNE:A)B/KS |
|
|
ACAB |
|
|
0: AB |
|
|
|
|
|
/(*MARK:A)(*PRUNE:B)(C|X)/KS |
|
|
C |
|
|
0: C |
|
|
1: C |
|
|
MK: A |
|
|
D |
|
|
No match |
|
|
|
|
|
/(*MARK:A)(*THEN:B)(C|X)/KS |
|
|
C |
|
|
0: C |
|
|
1: C |
|
|
MK: A |
|
|
D |
|
|
No match |
|
|
|
|
|
/A(*MARK:A)A+(*SKIP)(B|Z) | AC/xKS |
|
|
AAAC |
|
|
No match |
|
|
|
|
|
/A(*MARK:A)A+(*MARK:B)(*SKIP:B)(B|Z) | AC/xKS |
|
|
AAAC |
|
|
No match |
|
|
|
|
|
/A(*PRUNE:A)A+(*SKIP:A)(B|Z) | AC/xKS |
|
|
AAAC |
|
|
No match |
|
|
|
|
|
/A(*:A)A+(*SKIP)(B|Z) | AC/xKS |
|
|
AAAC |
|
|
No match |
|
|
|
|
|
/A(*MARK:A)A+(*SKIP:)(B|Z) | AC/xKS |
|
|
AAAC |
|
|
No match |
|
|
|
|
|
/A(*MARK:A)A+(*SKIP:B)(B|Z) | AAC/xKS |
|
|
AAAC |
|
|
No match |
|
|
|
|
|
/A(*:A)B|XX(*:B)Y/KS |
|
|
AABC |
|
|
0: AB |
|
|
MK: A |
|
|
XXYZ |
|
|
0: XXY |
|
|
MK: B |
|
|
** Failers |
|
|
No match |
|
|
XAQQ |
|
|
No match |
|
|
XAQQXZZ |
|
|
No match |
|
|
AXQQQ |
|
|
No match |
|
|
AXXQQQ |
|
|
No match |
|
|
|
|
|
/(*COMMIT)ABC/ |
|
|
ABCDEFG |
|
|
0: ABC |
|
|
** Failers |
|
|
No match |
|
|
DEFGABC\Y |
|
|
No match |
|
|
|
|
11023 |
/^(ab (c+(*THEN)cd) | xyz)/x |
/^(ab (c+(*THEN)cd) | xyz)/x |
11024 |
abcccd |
abcccd |
11025 |
No match |
No match |
11604 |
1: C |
1: C |
11605 |
MK: A |
MK: A |
11606 |
D |
D |
11607 |
No match |
No match, mark = A |
11608 |
|
|
11609 |
/(*:A)A+(*SKIP:A)(B|Z)/KS |
/(*:A)A+(*SKIP:A)(B|Z)/KS |
11610 |
AAAC |
AAAC |
11611 |
No match |
No match, mark = A |
11612 |
|
|
11613 |
/-- --/ |
/-- --/ |
11614 |
|
|
11986 |
Latest Mark: B |
Latest Mark: B |
11987 |
+18 ^ ^ z |
+18 ^ ^ z |
11988 |
+20 ^ a |
+20 ^ a |
|
Latest Mark: <unset> |
|
11989 |
+21 ^^ e |
+21 ^^ e |
11990 |
+22 ^ ^ q |
+22 ^ ^ q |
11991 |
+23 ^ ^ ) |
+23 ^ ^ ) |
12246 |
ax1z |
ax1z |
12247 |
0: ax1z |
0: ax1z |
12248 |
|
|
|
/^a\X41z/<JS> |
|
|
aX41z |
|
|
0: aX41z |
|
|
*** Failers |
|
|
No match |
|
|
aAz |
|
|
No match |
|
|
|
|
12249 |
/^a\u0041z/<JS> |
/^a\u0041z/<JS> |
12250 |
aAz |
aAz |
12251 |
0: aAz |
0: aAz |
12311 |
End |
End |
12312 |
------------------------------------------------------------------ |
------------------------------------------------------------------ |
12313 |
|
|
12314 |
/(?<=ab\Cde)X/8 |
/a[\NB]c/ |
12315 |
Failed: \C not allowed in lookbehind assertion at offset 10 |
Failed: \N is not supported in a class at offset 3 |
12316 |
|
|
12317 |
|
/a[B-\Nc]/ |
12318 |
|
Failed: \N is not supported in a class at offset 5 |
12319 |
|
|
12320 |
|
/(a)(?2){0,1999}?(b)/ |
12321 |
|
|
12322 |
|
/(a)(?(DEFINE)(b))(?2){0,1999}?(?2)/ |
12323 |
|
|
12324 |
|
/--- This test, with something more complicated than individual letters, causes |
12325 |
|
different behaviour in Perl. Perhaps it disables some optimization; no tag is |
12326 |
|
passed back for the failures, whereas in PCRE there is a tag. ---/ |
12327 |
|
|
12328 |
|
/(A|P)(*:A)(B|P) | (X|P)(X|P)(*:B)(Y|P)/xK |
12329 |
|
AABC |
12330 |
|
0: AB |
12331 |
|
1: A |
12332 |
|
2: B |
12333 |
|
MK: A |
12334 |
|
XXYZ |
12335 |
|
0: XXY |
12336 |
|
1: <unset> |
12337 |
|
2: <unset> |
12338 |
|
3: X |
12339 |
|
4: X |
12340 |
|
5: Y |
12341 |
|
MK: B |
12342 |
|
** Failers |
12343 |
|
No match |
12344 |
|
XAQQ |
12345 |
|
No match, mark = A |
12346 |
|
XAQQXZZ |
12347 |
|
No match, mark = A |
12348 |
|
AXQQQ |
12349 |
|
No match, mark = A |
12350 |
|
AXXQQQ |
12351 |
|
No match, mark = B |
12352 |
|
|
12353 |
|
/-- Perl doesn't give marks for these, though it does if the alternatives are |
12354 |
|
replaced by single letters. --/ |
12355 |
|
|
12356 |
|
/(b|q)(*:m)f|a(*:n)w/K |
12357 |
|
aw |
12358 |
|
0: aw |
12359 |
|
MK: n |
12360 |
|
** Failers |
12361 |
|
No match, mark = n |
12362 |
|
abc |
12363 |
|
No match, mark = m |
12364 |
|
|
12365 |
|
/(q|b)(*:m)f|a(*:n)w/K |
12366 |
|
aw |
12367 |
|
0: aw |
12368 |
|
MK: n |
12369 |
|
** Failers |
12370 |
|
No match, mark = n |
12371 |
|
abc |
12372 |
|
No match, mark = m |
12373 |
|
|
12374 |
|
/-- After a partial match, the behaviour is as for a failure. --/ |
12375 |
|
|
12376 |
|
/^a(*:X)bcde/K |
12377 |
|
abc\P |
12378 |
|
Partial match, mark=X: abc |
12379 |
|
|
12380 |
/-- End of testinput2 --/ |
/-- End of testinput2 --/ |