ChangeLog for PCRE
------------------

Version 8.33 28-April-2013
--------------------------

1.  Added 'U' to some constants that are compared to unsigned integers, to
    avoid compiler signed/unsigned warnings. Added (int) casts to unsigned
    [...]

3.  Revise the creation of config.h.generic so that all boolean macros are
    #undefined, whereas non-boolean macros are #ifndef/#endif-ed. This makes
    overriding via -D on the command line possible.
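
As a rough illustration of item 3, the fragment below sketches the two macro
styles. The names DEMO_LINK_SIZE and DEMO_SUPPORT_UTF are hypothetical and are
not taken from config.h.generic. The boolean macro is shown commented out here,
on the assumption that a live #undef would cancel a -D given on the command
line.

```c
/* Non-boolean macro: the #ifndef guard means that a compiler option such as
   -DDEMO_LINK_SIZE=3 takes precedence over the default below. */
#ifndef DEMO_LINK_SIZE
#define DEMO_LINK_SIZE 2
#endif

/* Boolean macro: left undefined by default. Building with -DDEMO_SUPPORT_UTF
   switches it on. (Hypothetical name, for illustration only.) */
/* #undef DEMO_SUPPORT_UTF */

/* Report the effective settings so that the result can be inspected. */
int demo_link_size(void)
{
return DEMO_LINK_SIZE;
}

int demo_utf_enabled(void)
{
#ifdef DEMO_SUPPORT_UTF
return 1;
#else
return 0;
#endif
}
```

Compiled without any -D options, demo_link_size() gives the default value and
demo_utf_enabled() reports that the feature is off.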

4.  Changing the definition of the variable "op" in pcre_exec.c from pcre_uchar
    to unsigned int is reported to make a quite noticeable speed difference in
    a specific Windows environment. Testing on Linux did also appear to show
    some benefit (and it is clearly not harmful). Also fixed the definition of
    Xop which should be unsigned.

5.  Related to (4), changing the definition of the intermediate variable cc
    in repeated character loops from pcre_uchar to pcre_uint32 also gave speed
    [...]

6.  Fix forward search in JIT when link size is 3 or greater. Also removed some
    unnecessary spaces.

7.  Adjust autogen.sh and configure.ac to lose warnings given by automake 1.12
    and later.

8.  Fix two buffer over-read issues in 16-bit and 32-bit modes. Affects JIT
    only.

[...]

    (b) Minimum length was not checked before the matching is started.

11. The value of capture_last that is passed to callouts was incorrect in some
    cases when there was a capture on one path that was subsequently abandoned
    after a backtrack. Also, the capture_last value is now reset after a
    recursion, since all captures are also reset in this case.

12. The interpreter no longer returns the "too many substrings" error in the
    case when an overflowing capture is in a branch that is subsequently
    abandoned after a backtrack.

13. In the pathological case when an offset vector of size 2 is used, pcretest
    now prints out the matched string after a yield of 0 or 1.

14. Inlining subpatterns in recursions, when certain conditions are fulfilled.
    [...]

15. The JIT compiler now supports 32-bit Macs, thanks to Lawrence Velazquez.

16. Partial matches now set offsets[2] to the "bumpalong" value, that is, the
    offset of the starting point of the matching process, provided the offsets
    vector is large enough.

17. The \A escape now records a lookbehind value of 1, though its execution
    does not actually inspect the previous character. This is to ensure that,
    in partial multi-segment matching, at least one character from the old
    segment is retained when a new segment is processed. Otherwise, if there
    are no lookbehinds in the pattern, \A might match incorrectly at the start
    of a new segment.

18. Added some #ifdef __VMS code into pcretest.c to help VMS implementations.

19. Redefined some pcre_uchar variables in pcre_exec.c as pcre_uint32; this
    gives some modest performance improvement in 8-bit mode.

20. Added the PCRE-specific property \p{Xuc} for matching characters that can
    be expressed in certain programming languages using Universal Character
    Names.

21. Unicode validation has been updated in the light of Unicode Corrigendum #9,
    which points out that "non characters" are not "characters that may not
    appear in Unicode strings" but rather "characters that are reserved for
    internal use and have only local meaning".
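
To make item 21 more concrete, here is a minimal sketch of a code point check
that follows the reading given above: non-characters are recognised but not
rejected. The function names are hypothetical and this is not the validation
code used by PCRE. The ranges are the standard Unicode ones (U+FDD0 to U+FDEF,
plus the last two code points of every plane).

```c
#include <stdint.h>

/* True for the 66 Unicode non-characters: U+FDD0..U+FDEF, and every code
   point whose low 16 bits are FFFE or FFFF, up to U+10FFFF. */
int demo_is_noncharacter(uint32_t cp)
{
if (cp >= 0xFDD0u && cp <= 0xFDEFu) return 1;
return cp <= 0x10FFFFu && (cp & 0xFFFEu) == 0xFFFEu;
}

/* A validity test in the spirit of Corrigendum #9: values beyond U+10FFFF
   and surrogates are refused, but non-characters are allowed through. */
int demo_is_valid_code_point(uint32_t cp)
{
if (cp > 0x10FFFFu) return 0;                  /* beyond the Unicode range */
if (cp >= 0xD800u && cp <= 0xDFFFu) return 0;  /* surrogate */
return 1;                                      /* includes non-characters */
}
```

With these definitions U+FFFE is classed as a non-character and is still
accepted as valid, whereas U+D800 is refused.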

22. When a pattern was compiled with automatic callouts (PCRE_AUTO_CALLOUT) and
    there was a conditional group that depended on an assertion, if the
    assertion was false, the callout that immediately followed the alternation
    in the condition was skipped when pcre_exec() was used for matching.

23. Allow an explicit callout to be inserted before an assertion that is the
    condition for a conditional group, for compatibility with automatic
    callouts, which always insert a callout at this point.

24. In 8.31, (*COMMIT) was confined to within a recursive subpattern. Perl also
    confines (*SKIP) and (*PRUNE) in the same way, and this has now been done.

25. (*PRUNE) is now supported by the JIT compiler.

26. Fix infinite loop when /(?<=(*SKIP)ac)a/ is matched against aa.

27. Fix the case where there are two or more SKIPs with arguments that may be
    ignored.

28. (*SKIP) is now supported by the JIT compiler.

29. (*THEN) is now supported by the JIT compiler.

30. Update RunTest with additional test selector options.

31. The way PCRE handles backtracking verbs has been changed in two ways.

    (1) Previously, in something like (*COMMIT)(*SKIP), COMMIT would override
    SKIP. Now, PCRE acts on whichever backtracking verb is reached first by
    backtracking. In some cases this makes it more Perl-compatible, but Perl's
    rather obscure rules do not always do the same thing.

    (2) Previously, backtracking verbs were confined within assertions. This is
    no longer the case for positive assertions, except for (*ACCEPT). Again,
    this sometimes improves Perl compatibility, and sometimes does not.

32. A number of tests that were in test 2 because Perl did things differently
    have been moved to test 1, because either Perl or PCRE has changed, and
    these tests are now compatible.

32. Control verbs are handled in the same way in JIT and interpreter.

33. An opening parenthesis in a MARK/PRUNE/SKIP/THEN name in a pattern that
    contained a forward subroutine reference caused a compile error.

34. Auto-detect and optimize limited repetitions in JIT.

35. Implement PCRE_NEVER_UTF to lock out the use of UTF, in particular,
    blocking (*UTF) etc.

36. In the interpreter, maximizing pattern repetitions for characters and
    character types now use tail recursion, which reduces stack usage.
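
The idea behind item 36 can be sketched with a toy matcher. This is not the
code in pcre_exec.c; demo_match() is a hypothetical function that handles only
literal characters and "x*", and it must match the whole subject. After a
greedy repeat has consumed as much as it can, each shorter alternative costs
one recursive call, but the final alternative (no repeats at all) is reached
by going round the loop again instead of recursing. C does not guarantee
tail-call elimination, so the loop stands in for the tail call.

```c
/* Toy anchored matcher: literals and "x*" only. Returns 1 on a full match. */
int demo_match(const char *pat, const char *s)
{
for (;;)                            /* re-entering here replaces a tail call */
  {
  if (*pat == '\0') return *s == '\0';

  if (pat[1] == '*')
    {
    const char *start = s;
    while (*s == pat[0]) s++;       /* maximize: take every repeat available */
    pat += 2;                       /* step over "x*" */
    while (s > start)               /* back off one character at a time */
      {
      if (demo_match(pat, s)) return 1;   /* a real recursive call */
      s--;
      }
    continue;                       /* last alternative: no new stack frame */
    }

  if (*s == '\0' || *s != *pat) return 0;
  pat++;
  s++;
  }
}
```

In this sketch a pattern such as "a*b" applied to a subject with no leading
"a" never recurses at all, because the only alternative left after maximizing
is the final one.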

37. The value of the max lookbehind was not correctly preserved if a compiled
    and saved regex was reloaded on a host of different endianness.
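
The kind of problem described in item 37 can be illustrated as follows. The
header layout, the field names and DEMO_MAGIC are all hypothetical and do not
reproduce PCRE's compiled pattern block. The sketch only shows the general
technique: detect a byte-reversed magic number on reload, then flip every
multi-byte field, including small ones that are easy to forget.

```c
#include <stdint.h>

#define DEMO_MAGIC 0x50435245u      /* arbitrary tag used by this sketch */

typedef struct {
  uint32_t magic;                   /* identifies the block and its byte order */
  uint32_t size;                    /* some 32-bit field */
  uint16_t max_lookbehind;          /* a 16-bit field that must be flipped too */
} demo_header;

uint32_t demo_swap32(uint32_t v)
{
return (v >> 24) | ((v >> 8) & 0x0000ff00u) |
       ((v << 8) & 0x00ff0000u) | (v << 24);
}

uint16_t demo_swap16(uint16_t v)
{
return (uint16_t)((v >> 8) | (v << 8));
}

/* Returns 0 when the header is usable (flipping it first if it was written
   on a host of the opposite endianness), or -1 if the magic is unknown. */
int demo_fix_byte_order(demo_header *h)
{
if (h->magic == DEMO_MAGIC) return 0;             /* same byte order */
if (demo_swap32(h->magic) != DEMO_MAGIC) return -1;
h->magic = demo_swap32(h->magic);
h->size = demo_swap32(h->size);
h->max_lookbehind = demo_swap16(h->max_lookbehind);
return 0;
}
```

Leaving out the last swap in demo_fix_byte_order() would reproduce the kind of
bug the entry describes: the block loads, but one field holds a byte-reversed
value.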

38. Implemented (*LIMIT_MATCH) and (*LIMIT_RECURSION). As part of the extension
    of the compiled pattern block, expand the flags field from 16 to 32 bits
    because it was almost full.

39. Try madvise first before posix_madvise.


Version 8.32 30-November-2012