ChangeLog for PCRE
------------------

Note that the PCRE 8.xx series (PCRE1) is now in a bugfix-only state. All
development is happening in the PCRE2 10.xx series.

Version 8.38 27-October-2015
----------------------------

1. If a group that contained a recursive back reference also contained a
forward reference subroutine call followed by a non-forward-reference
subroutine call, for example /.((?2)(?R)\1)()/, pcre_compile() failed to
compile correct code, leading to undefined behaviour or an internally
detected error. This bug was discovered by the LLVM fuzzer.

2. Quantification of certain items (e.g. atomic back references) could cause
incorrect code to be compiled when recursive forward references were
involved. For example, in this pattern: /(?1)()((((((\1++))\x85)+)|))/.
This bug was discovered by the LLVM fuzzer.

3. A repeated conditional group whose condition was a reference by name caused
a buffer overflow if there was more than one group with the given name.
This bug was discovered by the LLVM fuzzer.

4. A recursive back reference by name within a group that had the same name as
another group caused a buffer overflow. For example:
/(?J)(?'d'(?'d'\g{d}))/. This bug was discovered by the LLVM fuzzer.

5. A forward reference by name to a group whose number is the same as the
current group, for example in this pattern: /(?|(\k'Pm')|(?'Pm'))/, caused
a buffer overflow at compile time. This bug was discovered by the LLVM
fuzzer.

6. A lookbehind assertion within a set of mutually recursive subpatterns could
provoke a buffer overflow. This bug was discovered by the LLVM fuzzer.

7. Another buffer overflow bug involved duplicate named groups with a
reference between their definitions, with a group that reset capture
numbers, for example: /(?J:(?|(?'R')(\k'R')|((?'R'))))/. This has been
fixed by always allowing for more memory, even if not needed. (A proper fix
is implemented in PCRE2, but it involves more refactoring.)

8. There was no check for integer overflow in subroutine calls such as (?123).

9. The table entry for \l in EBCDIC environments was incorrect, leading to its
being treated as a literal 'l' instead of causing an error.

10. There was a buffer overflow if pcre_exec() was called with an ovector of
size 1. This bug was found by american fuzzy lop.

11. If a non-capturing group containing a conditional group that could match
an empty string was repeated, it was not identified as matching an empty
string itself. For example: /^(?:(?(1)x|)+)+$()/.

12. In an EBCDIC environment, pcretest was mishandling the escape sequences
\a and \e in test subject lines.

13. In an EBCDIC environment, \a in a pattern was converted to the ASCII
value instead of the EBCDIC value.

14. The handling of \c in an EBCDIC environment has been revised so that it is
now compatible with the specification in Perl's perlebcdic page.

15. The EBCDIC character 0x41 is a non-breaking space, equivalent to 0xa0 in
ASCII/Unicode. This has now been added to the list of characters that are
recognized as white space in EBCDIC.

16. When PCRE was compiled without UCP support, the use of \p and \P gave an
error (correctly) when used outside a class, but did not give an error
within a class.

17. \h within a class was incorrectly compiled in EBCDIC environments.

18. A pattern with an unmatched closing parenthesis that contained a backward
assertion which itself contained a forward reference caused a buffer
overflow. An example pattern is: /(?=di(?<=(?1))|(?=(.))))/.

19. JIT should return with an error when the compiled pattern requires more
stack space than the maximum.

20. A possessively repeated conditional group that could match an empty string,
for example, /(?(R))*+/, was incorrectly compiled.

21. Fix infinite recursion in the JIT compiler when certain patterns such as
/(?:|a|){100}x/ are analysed.

22. Some patterns with character classes involving [: and \\ were incorrectly
compiled and could cause reading from uninitialized memory or an incorrect
error diagnosis.

23. Pathological patterns containing many nested occurrences of [: caused
pcre_compile() to run for a very long time.

24. A conditional group with only one branch has an implicit empty alternative
branch and must therefore be treated as potentially matching an empty
string.

25. If (?R was followed by - or +, incorrect behaviour happened instead of a
diagnostic.

26. Arrange to give up on finding the minimum matching length for overly
complex patterns.

27. Similar to (4) above: in a pattern with duplicated named groups and an
occurrence of (?| it is possible for an apparently non-recursive back
reference to become recursive if a later named group with the relevant
number is encountered. This could lead to a buffer overflow. Wen Guanxing
from Venustech ADLAB discovered this bug.

28. If pcregrep was given the -q option with -c or -l, or when handling a
binary file, it incorrectly wrote output to stdout.

29. The JIT compiler did not restore the control verb head in case of *THEN
control verbs. This issue was found by Karl Skomski with a custom LLVM
fuzzer.

30. Error messages for syntax errors following \g and \k were giving inaccurate
offsets in the pattern.

31. Added a check for integer overflow in conditions (?(<digits>) and
(?(R<digits>). This omission was discovered by Karl Skomski with the LLVM
fuzzer.

32. Handling recursive references such as (?2) when the reference is to a group
later in the pattern uses code that is very hacked about and error-prone.
It has been re-written for PCRE2. Here in PCRE1, a check has been added to
give an internal error if it is obvious that compiling has gone wrong.

33. The JIT compiler should not check repeats after a {0,1} repeat byte code.
This issue was found by Karl Skomski with a custom LLVM fuzzer.

34. The JIT compiler should restore the control chain for empty possessive
repeats. This issue was found by Karl Skomski with a custom LLVM fuzzer.

35. Match limit check added to JIT recursion. This issue was found by Karl
Skomski with a custom LLVM fuzzer.

36. Yet another case similar to 27 above has been circumvented by an
unconditional allocation of extra memory. This issue is fixed "properly" in
PCRE2 by refactoring the way references are handled. Wen Guanxing
from Venustech ADLAB discovered this bug.

37. Fix two assertion failures in JIT. These issues were found by Karl Skomski
with a custom LLVM fuzzer.

38. Fixed a corner case of range optimization in JIT.

39. An incorrect error "overran compiling workspace" was given if there were
exactly enough group forward references such that the last one extended
into the workspace safety margin. The next one would have expanded the
workspace. The test for overflow was not including the safety margin.

40. A match limit issue in JIT, found by Karl Skomski with a custom LLVM
fuzzer, has been fixed.

41. Remove the use of /dev/null in testdata/testinput2, because it doesn't
work under Windows. (Why has it taken so long for anyone to notice?)

42. In a character class such as [\W\p{Any}] where both a negative-type escape
("not a word character") and a property escape were present, the property
escape was being ignored.

43. Fix crash caused by very long (*MARK) or (*THEN) names.

44. A sequence such as [[:punct:]b], that is, a POSIX character class followed
by a single ASCII character in a class item, was incorrectly compiled in
UCP mode. The POSIX class got lost, but only if the single character
followed it.

45. [:punct:] in UCP mode was matching some characters in the range 128-255
that should not have been matched.

46. If [:^ascii:] or [:^xdigit:] or [:^cntrl:] are present in a non-negated
class, all characters with code points greater than 255 are in the class.
When a Unicode property was also in the class (if PCRE_UCP is set, escapes
such as \w are turned into Unicode properties), wide characters were not
correctly handled, and could fail to match.


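Items 8 and 31 above each add a missing integer-overflow check where a decimal
number, such as the 123 in (?123), is read from the pattern. The following is
a minimal sketch in C of that kind of check. It is not PCRE's actual code: the
function read_group_number() and its signature are hypothetical, chosen only
for illustration.

```c
#include <limits.h>
#include <stddef.h>

/* Hypothetical helper, not part of PCRE: read decimal digits starting at
   p[*pos] and store their value in *out. Returns 0 on success, or -1 if
   there are no digits or the value would not fit in an int. On failure,
   *pos and *out are left unchanged. */
static int read_group_number(const char *p, size_t *pos, int *out)
{
    int n = 0;
    size_t i = *pos;

    if (p[i] < '0' || p[i] > '9')
        return -1;                      /* no digits at all */

    while (p[i] >= '0' && p[i] <= '9') {
        int d = p[i] - '0';
        /* Test before multiplying: n * 10 + d must not exceed INT_MAX. */
        if (n > (INT_MAX - d) / 10)
            return -1;                  /* would overflow: report an error */
        n = n * 10 + d;
        i++;
    }

    *pos = i;                           /* first character after the digits */
    *out = n;
    return 0;
}
```

Given the pattern text "(?123)" with *pos set to 2, the sketch stores 123 and
advances *pos to the closing parenthesis. For a digit string too long to fit
in an int it returns -1 and leaves *pos unchanged, so a caller can give a
diagnostic instead of using a wrapped value.
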
Version 8.37 28-April-2015
--------------------------
