ChangeLog for PCRE
------------------

Version 7.3 17-Aug-07 (previously 05-Jul-07)
--------------------------------------------

 1. In the rejigging of the build system that eventually resulted in 7.1, the
[...]
    length has been removed - we now have only the limit on the total length of
    the compiled pattern, which depends on the LINK_SIZE setting.

10. Fixed a bug in the documentation for get/copy named substring when
    duplicate names are permitted. If none of the named substrings are set, the
    functions return PCRE_ERROR_NOSUBSTRING (7); the doc said they returned an
    empty string.

11. Because Perl interprets \Q...\E at a high level, and ignores orphan \E
    instances, patterns such as [\Q\E] or [\E] or even [^\E] cause an error,
    because the ] is interpreted as the first data character and the
    terminating ] is not found. PCRE has been made compatible with Perl in this
    regard. Previously, it interpreted [\Q\E] as an empty class, and [\E] could
    cause memory overwriting.

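The Perl-compatible reading of \Q...\E can be illustrated with a small Python sketch. It is not PCRE's code: expand_quotes is a hypothetical helper that removes \Q...\E "at a high level", before any character class is parsed, which is the behaviour the entry attributes to Perl. The escaping rule for quoted text is an assumption made for the sketch.

```python
def expand_quotes(pattern: str) -> str:
    """Remove \\Q...\\E from a pattern before any further parsing.

    Hypothetical helper for illustration only. Quoted text is emitted with
    non-alphanumeric characters backslash-escaped (an assumed rule); an
    orphan \\E outside a quoted run is dropped.
    """
    out = []
    i = 0
    n = len(pattern)
    quoting = False
    while i < n:
        two = pattern[i:i + 2]
        if quoting:
            if two == "\\E":
                quoting = False          # end of the quoted run
                i += 2
            else:
                ch = pattern[i]
                out.append(ch if ch.isalnum() or ch == "_" else "\\" + ch)
                i += 1
        elif two == "\\Q":
            quoting = True               # start of a quoted run
            i += 2
        elif two == "\\E":
            i += 2                       # orphan \E: ignored
        elif pattern[i] == "\\" and i + 1 < n:
            out.append(two)              # any other escape is copied as-is
            i += 2
        else:
            out.append(pattern[i])
            i += 1
    return "".join(out)


print(expand_quotes(r"[\Q\E]"))   # what the class parser would see
print(expand_quotes(r"[^\E]"))
```

After expansion, [\Q\E] and [\E] both read as "[]" and [^\E] reads as "[^]", so the ] becomes the first data character of the class and no terminating ] remains.
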
10. Like Perl, PCRE automatically breaks an unlimited repeat after an empty
    string has been matched (to stop an infinite loop). It was not recognizing
    a conditional subpattern that could match an empty string if that
    subpattern was within another subpattern. For example, it looped when
    trying to match (((?(1)X|))*) but it was OK with ((?(1)X|)*) where the
    condition was not nested. This bug has been fixed.

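The loop-breaking rule for unlimited repeats can be sketched as follows, in Python rather than PCRE's C. The names match_star and optional_x are hypothetical, and match_once stands in for one attempt at the repeated group; this shows the general idea, not how PCRE detects the empty-match case.

```python
def match_star(match_once, subject, pos):
    """Apply match_once repeatedly, as for an unlimited repeat (group)*.

    match_once(subject, pos) returns the end position of one iteration,
    or None if the group does not match at pos.
    """
    while True:
        end = match_once(subject, pos)
        if end is None:
            return pos        # group failed: the repeat ends here
        if end == pos:
            return pos        # iteration matched the empty string: stop,
                              # otherwise the loop would never advance
        pos = end


def optional_x(subject, pos):
    """Hypothetical group that matches "X" or, failing that, nothing."""
    return pos + 1 if subject.startswith("X", pos) else pos


print(match_star(optional_x, "XXa", 0))
```

Without the end == pos test, optional_x would keep succeeding at the same position once the X characters run out, and the repeat would never terminate.
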
12. A pattern like \X?\d or \P{L}?\d in non-UTF-8 mode could cause a backtrack
    past the start of the subject in the presence of bytes with the top bit
    set, for example "\x8aBCD".

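The entry does not describe the cause, so the following Python sketch only illustrates the general hazard under an assumption: stepping back one character in UTF-8 mode skips continuation bytes (0x80 to 0xBF), and the same skip applied in non-UTF-8 mode, or applied without a lower bound, can move before the start of the subject. step_back is a hypothetical helper, not a PCRE function.

```python
def step_back(subject: bytes, pos: int, start: int, utf8: bool) -> int:
    """Move back one character from pos, never going before start."""
    if pos <= start:
        return start                  # already at the start: do not move
    pos -= 1
    if utf8:
        # In UTF-8 mode a character may span several bytes; keep stepping
        # back while we are on a continuation byte (10xxxxxx).
        while pos > start and (subject[pos] & 0xC0) == 0x80:
            pos -= 1
    return pos


# Non-UTF-8 mode: every byte is one character, even with the top bit set.
print(step_back(b"\x8aBCD", 1, 0, utf8=False))
```

In non-UTF-8 mode the byte 0x8a is an ordinary one-byte character, so one step back from offset 1 lands on offset 0 and no further.
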
13. Added Perl 5.10 experimental backtracking controls (*FAIL), (*F), (*PRUNE),
    (*SKIP), (*THEN), (*COMMIT), and (*ACCEPT).

14. Optimized (?!) to (*FAIL).

15. Updated the test for a valid UTF-8 string to conform to the later RFC 3629.
    This restricts code points to be within the range 0 to 0x10FFFF, excluding
    the surrogate code points 0xD800 to 0xDFFF. Previously, PCRE allowed the
    full range 0 to 0x7FFFFFFF, as defined by RFC 2279. Internally, it still
    does: it's just the validity check that is more restrictive.

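The RFC 3629 restrictions can be sketched as a byte-level check. This is a minimal Python illustration, not PCRE's validity test: is_valid_utf8 is a hypothetical name, and the overlong-encoding check is included because RFC 3629 requires shortest-form encoding, although the entry above does not mention it.

```python
def is_valid_utf8(data: bytes) -> bool:
    """Check a byte string against the RFC 3629 rules described above."""
    i = 0
    n = len(data)
    while i < n:
        b0 = data[i]
        if b0 < 0x80:                      # single-byte (ASCII) character
            i += 1
            continue
        if 0xC0 <= b0 <= 0xDF:
            extra, cp, min_cp = 1, b0 & 0x1F, 0x80
        elif 0xE0 <= b0 <= 0xEF:
            extra, cp, min_cp = 2, b0 & 0x0F, 0x800
        elif 0xF0 <= b0 <= 0xF7:
            extra, cp, min_cp = 3, b0 & 0x07, 0x10000
        else:
            # Stray continuation byte, or a 5/6-byte lead byte that only
            # the older RFC 2279 permitted.
            return False
        if i + extra >= n:
            return False                   # truncated sequence
        for k in range(1, extra + 1):
            c = data[i + k]
            if (c & 0xC0) != 0x80:
                return False               # not a continuation byte
            cp = (cp << 6) | (c & 0x3F)
        if cp < min_cp:
            return False                   # overlong encoding
        if cp > 0x10FFFF:
            return False                   # beyond the RFC 3629 range
        if 0xD800 <= cp <= 0xDFFF:
            return False                   # surrogate code point
        i += extra + 1
    return True


print(is_valid_utf8(b"\xf4\x8f\xbf\xbf"))   # U+10FFFF, the highest allowed
print(is_valid_utf8(b"\xed\xa0\x80"))       # U+D800, a surrogate
```
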
16. Inserted checks for integer overflows during escape sequence (backslash)
    processing, and also fixed erroneous offset values for syntax errors during
    backslash processing.

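The kind of overflow check described can be sketched as below. Python integers do not overflow, so the sketch imposes an assumed 32-bit signed limit; read_decimal_escape and PatternError are hypothetical names, and the entry does not say which escapes PCRE checks or what limit it applies. The error carries the offset of the offending digit.

```python
INT_MAX = 2**31 - 1   # assumed limit, standing in for a C int


class PatternError(ValueError):
    """Syntax error that records where in the pattern it occurred."""

    def __init__(self, message: str, offset: int):
        super().__init__(f"{message} at offset {offset}")
        self.offset = offset


def read_decimal_escape(pattern: str, pos: int):
    """Read the decimal number starting at pos (just after a backslash).

    Returns (value, next_pos). The bound is tested before multiplying,
    so the running value never exceeds INT_MAX.
    """
    value = 0
    i = pos
    while i < len(pattern) and pattern[i] in "0123456789":
        digit = ord(pattern[i]) - ord("0")
        if value > (INT_MAX - digit) // 10:
            raise PatternError("number too big in escape", i)
        value = value * 10 + digit
        i += 1
    if i == pos:
        raise PatternError("digit expected", pos)
    return value, i


print(read_decimal_escape(r"(a)\12", 4))
```

Testing value against (INT_MAX - digit) // 10 before the multiplication is the usual way to write this check in C as well, where performing the multiplication first would already be the overflow.
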
17. Fixed another case of looking too far back in non-UTF-8 mode (cf. 12 above)
    for patterns like [\PPP\x8a]{1,}\x80 with the subject "A\x80".

18. An unterminated class in a pattern like (?1)\c[ with a "forward reference"
    caused an overrun.

19. A pattern like (?:[\PPa*]*){8,} which had an "extended class" (one with
    something other than just ASCII characters) inside a group that had an
    unlimited repeat caused a loop at compile time (while checking to see
    whether the group could match an empty string).

20. Debugging a pattern containing \p or \P could cause a crash. For example,
    [\P{Any}] did so. (Error in the code for printing property names.)

21. An orphan \E inside a character class could cause a crash.

22. A repeated capturing bracket such as (A)? could cause a wild memory
    reference during compilation.

23. There are several functions in pcre_compile() that scan along a compiled
    expression for various reasons (e.g. to see if it's fixed length for
    lookbehind). There were bugs in these functions when a repeated \p or \P
    was present in the pattern. These operators have additional parameters
    compared with \d, etc., and these were not being taken into account when
    moving along the compiled data. Specifically:

    (a) An item such as \p{Yi}{3} in a lookbehind was not treated as fixed
        length.

    (b) An item such as \pL+ within a repeated group could cause crashes or
        loops.

    (c) A pattern such as \p{Yi}+(\P{Yi}+)(?1) could give an incorrect
        "reference to non-existent subpattern" error.

    (d) A pattern like (\P{Yi}{2}\277)? could loop at compile time.

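The class of bug in item 23 can be illustrated with a toy scanner. Everything here is hypothetical: the opcode names, their numeric values, and the layout are invented for the sketch and do not match PCRE's compiled format. The point is only that a scanner must advance by each item's full length, including any operands.

```python
# Hypothetical opcode numbers; PCRE's real names and values differ.
OP_END, OP_DIGIT, OP_PROP, OP_NOTPROP = 0, 1, 2, 3

# Total length of each item in code units, including the opcode itself.
# The property opcodes carry two operands (assumed here for illustration),
# which is what a scanner must not forget.
OP_LENGTHS = {OP_END: 1, OP_DIGIT: 1, OP_PROP: 3, OP_NOTPROP: 3}


def opcode_offsets(code):
    """Return the offset of every opcode in a compiled sequence."""
    offsets = []
    i = 0
    while i < len(code):
        offsets.append(i)
        if code[i] == OP_END:
            break
        i += OP_LENGTHS[code[i]]      # skip the opcode and its operands
    return offsets


# \d, then a property item with operands 1 and 25, then \d, then the end.
code = [OP_DIGIT, OP_PROP, 1, 25, OP_DIGIT, OP_END]
print(opcode_offsets(code))
```

A scanner that advanced by one unit after OP_PROP would treat the operand 1 as another OP_DIGIT and lose its place in the data, which is the kind of misreading that can produce wrong lengths, loops, or crashes.
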
24. A repeated \S or \W in UTF-8 mode could give wrong answers when multibyte
    characters were involved (for example /\S{2}/8g with "A\x{a3}BC").

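The expected character-level result for the \S{2} example can be checked with Python's re module, used here purely as a stand-in: it matches on code points rather than bytes, and it is not PCRE. The subject contains one two-byte character when encoded as UTF-8.

```python
import re

subject = "A\u00a3BC"                  # "A", POUND SIGN (U+00A3), "B", "C"
utf8_bytes = subject.encode("utf-8")   # b"A\xc2\xa3BC": 5 bytes, 4 characters

# Global match of two non-space characters, counted per character.
matches = re.findall(r"\S{2}", subject)

print(len(utf8_bytes), matches)
```

Counting per character gives two matches, "A\x{a3}" and "BC". Counting per byte would instead split the two-byte character.
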
25. Using pcregrep in multiline, inverted mode (-Mv) caused it to loop.


Version 7.2 19-Jun-07
---------------------