Diff of /code/trunk/ChangeLog

revision 1414 by zherczeg, Sun Dec 22 16:27:35 2013 UTC revision 1689 by ph10, Wed Mar 22 15:17:45 2017 UTC
# Line 1  Line 1 
1    ChangeLog for PCRE
2    ------------------
3    
4  Version 8.35-RC1 xx-xxxx-201x  Note that the PCRE 8.xx series (PCRE1) is now in a bugfix-only state. All
5    development is happening in the PCRE2 10.xx series.
6    
7    Version 8.41
8    ------------
9    
10    1.  Fixed typo in CMakeLists.txt (wrong number of arguments for
11    PCRE_STATIC_RUNTIME). This affects MSVC only.
12    
13    2.  Issue 1 for 8.40 below was not correctly fixed. If pcregrep in multiline
14    mode with --only-matching matched several lines, it restarted scanning at the
15    next line instead of moving on to the end of the matched string, which can be
16    several lines after the start.
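The scanning rule described in item 2 can be sketched as follows. This is a minimal Python illustration, not pcregrep's actual code: the function name only_matching and the restart_at_next_line flag are hypothetical, and Python's re module stands in for PCRE.

```python
import re


def only_matching(pattern, text, restart_at_next_line=False):
    """Collect every matched string, in the spirit of --only-matching.

    With restart_at_next_line=True the scan resumes at the line after the
    one where the match started (the behaviour the entry describes as wrong).
    Otherwise it resumes at the end of the matched string.
    """
    found = []
    pos = 0
    while pos <= len(text):
        m = pattern.search(text, pos)
        if m is None:
            break
        found.append(m.group(0))
        if restart_at_next_line:
            newline = text.find("\n", m.start())
            pos = len(text) + 1 if newline < 0 else newline + 1
        else:
            # Step past an empty match so the loop always advances.
            pos = m.end() if m.end() > m.start() else m.end() + 1
    return found


# Hypothetical sample data: one match that spans three lines.
SAMPLE_TEXT = "start\nA1\nA2\nA3\nend\n"
SAMPLE_PATTERN = re.compile(r"(?:A\d\n)+")
```

With the sample data, resuming at the end of the match reports the three-line match once, while resuming at the next line reports overlapping tails of the same match again.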
17    
18    3.  Fix a missing else in the JIT compiler reported by 'idaifish'.
19    
20    4.  A (?# style comment is now ignored between a basic quantifier and a
21    following '+' or '?' (example: /X+(?#comment)?Y/).
22    
23    5.  Avoid use of a potentially overflowing buffer in pcregrep (patch by Petr
24    Pisar).
25    
26    6.  Fuzzers have reported issues in pcretest. These are NOT serious (it is,
27    after all, just a test program). However, to stop the reports, some easy ones
28    are fixed:
29    
30        (a) Check for values < 256 when calling isprint() in pcretest.
31        (b) Give an error for too big a number after \O.
32    
33    7.  In the 32-bit library in non-UTF mode, an attempt to find a Unicode
34    property for a character with a code point greater than 0x10ffff (the Unicode
35    maximum) caused a crash.
36    
37    8.  The alternative matching function, pcre_dfa_exec(), misbehaved if it
38    encountered a character class with a possessive repeat, for example [a-f]{3}+.
39    
40    
41    Version 8.40 11-January-2017
42    ----------------------------
43    
44    1.  Using -o with -M in pcregrep could cause unnecessary repeated output when
45        the match extended over a line boundary.
46    
47    2.  Applied Chris Wilson's second patch (Bugzilla #1681) to CMakeLists.txt for
48        MSVC static compilation, putting the first patch under a new option.
49    
50    3.  Fix register overwrite in JIT when SSE2 acceleration is enabled.
51    
52    4.  Ignore "show all captures" (/=) for DFA matching.
53    
54    5.  Fix JIT unaligned accesses on x86. Patch by Marc Mutz.
55    
56    6.  In any wide-character mode (8-bit UTF or any 16-bit or 32-bit mode),
57        without PCRE_UCP set, a negative character type such as \D in a positive
58        class should cause all characters greater than 255 to match, whatever else
59        is in the class. There was a bug that caused this not to happen if a
60        Unicode property item was added to such a class, for example [\D\P{Nd}] or
61        [\W\pL].
62    
63    7.  When pcretest was outputting information from a callout, the caret indicator
64        for the current position in the subject line was incorrect if it was after
65        an escape sequence for a character whose code point was greater than
66        \x{ff}.
67    
68    8.  A pattern such as (?<RA>abc)(?(R)xyz) was incorrectly compiled such that
69        the conditional was interpreted as a reference to capturing group 1 instead
70        of a test for recursion. Any group whose name began with R was
71        misinterpreted in this way. (The reference interpretation should only
72        happen if the group's name is precisely "R".)
73    
74    9.  A number of bugs have been mended relating to match start-up optimizations
75        when the first thing in a pattern is a positive lookahead. These all
76        applied only when PCRE_NO_START_OPTIMIZE was *not* set:
77    
78        (a) A pattern such as (?=.*X)X$ was incorrectly optimized as if it needed
79            both an initial 'X' and a following 'X'.
80        (b) Some patterns starting with an assertion that started with .* were
81            incorrectly optimized as having to match at the start of the subject or
82            after a newline. There are cases where this is not true, for example,
83            (?=.*[A-Z])(?=.{8,16})(?!.*[\s]) matches after the start in lines that
84            start with spaces. Starting .* in an assertion is no longer taken as an
85            indication of matching at the start (or after a newline).
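The two expectations in 9(a) and 9(b) can be checked with any backtracking engine that supports lookaheads. The sketch below uses Python's re module purely as a stand-in; it is not PCRE, and the subject strings and variable names are made up for illustration.

```python
import re

# (a) One 'X' is enough: the lookahead and the literal X may match the same
# character, so a subject containing a single X must still match.
match_a = re.search(r"(?=.*X)X$", "aX")
single_x_span = match_a.span() if match_a else None

# (b) The leading assertion begins with .*, yet the match does not have to
# start at the beginning of the line: the negative lookahead rejects the
# positions that still have white space ahead of them.
pattern_b = re.compile(r"(?=.*[A-Z])(?=.{8,16})(?!.*[\s])")
match_b = pattern_b.search("  Abcdefgh")
first_match_offset = match_b.start() if match_b else None
```

Here the second search succeeds only after the two leading spaces, which is why a start-up optimization that assumes "start of subject or after a newline" would be wrong for such patterns.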
86    
87    
88    Version 8.39 14-June-2016
89    -------------------------
90    
91    1.  If PCRE_AUTO_CALLOUT was set on a pattern that had a (?# comment between
92        an item and its quantifier (for example, A(?#comment)?B), pcre_compile()
93        misbehaved. This bug was found by the LLVM fuzzer.
94    
95    2.  Similar to the above, if an isolated \E was present between an item and its
96        quantifier when PCRE_AUTO_CALLOUT was set, pcre_compile() misbehaved. This
97        bug was found by the LLVM fuzzer.
98    
99    3.  Further to 8.38/46, negated classes such as [^[:^ascii:]\d] were also not
100        working correctly in UCP mode.
101    
102    4.  The POSIX wrapper function regexec() crashed if the option REG_STARTEND
103        was set when the pmatch argument was NULL. It now returns REG_INVARG.
104    
105    5.  Allow for up to 32-bit numbers in the ordin() function in pcregrep.
106    
107    6.  An empty \Q\E sequence between an item and its quantifier caused
108        pcre_compile() to misbehave when auto callouts were enabled. This bug was
109        found by the LLVM fuzzer.
110    
111    7.  If a pattern that was compiled with PCRE_EXTENDED started with white
112        space or a #-type comment that was followed by (?-x), which turns off
113        PCRE_EXTENDED, and there was no subsequent (?x) to turn it on again,
114        pcre_compile() assumed that (?-x) applied to the whole pattern and
115        consequently mis-compiled it. This bug was found by the LLVM fuzzer.
116    
117    8.  A call of pcre_copy_named_substring() for a named substring whose number
118        was greater than the space in the ovector could cause a crash.
119    
120    9.  Yet another buffer overflow bug involved duplicate named groups with a
121        group that reset capture numbers (compare 8.38/7 below). Once again, I have
122        just allowed for more memory, even if not needed. (A proper fix is
123        implemented in PCRE2, but it involves a lot of refactoring.)
124    
125    10. pcre_get_substring_list() crashed if the use of \K in a match caused the
126        start of the match to be earlier than the end.
127    
128    11. Migrating appropriate PCRE2 JIT improvements to PCRE.
129    
130    12. A pattern such as /(?<=((?C)0))/, which has a callout inside a lookbehind
131        assertion, caused pcretest to generate incorrect output, and also to read
132        uninitialized memory (detected by ASAN or valgrind).
133    
134    13. A pattern that included (*ACCEPT) in the middle of a sufficiently deeply
135        nested set of parentheses of sufficient size caused an overflow of the
136        compiling workspace (which was diagnosed, but of course is not desirable).
137    
138    14. And yet another buffer overflow bug involving duplicate named groups, this
139        time nested, with a nested back reference. Yet again, I have just allowed
140        for more memory, because anything more needs all the refactoring that has
141        been done for PCRE2. An example pattern that provoked this bug is:
142        /((?J)(?'R'(?'R'(?'R'(?'R'(?'R'(?|(\k'R'))))))))/ and the bug was
143        registered as CVE-2016-1283.
144    
145    15. pcretest went into a loop if global matching was requested with an ovector
146        size less than 2. It now gives an error message. This bug was found by
147        afl-fuzz.
148    
149    16. An invalid pattern fragment such as (?(?C)0 was not diagnosing an error
150        ("assertion expected") when (?(?C) was not followed by an opening
151        parenthesis.
152    
153    17. Fixed typo ("&&" for "&") in pcre_study(). Fortunately, this could not
154        actually affect anything, by sheer luck.
155    
156    18. Applied Chris Wilson's patch (Bugzilla #1681) to CMakeLists.txt for MSVC
157        static compilation.
158    
159    19. Modified the RunTest script to incorporate a valgrind suppressions file so
160        that certain errors, provoked by the SSE2 instruction set when JIT is used,
161        are ignored.
162    
163    20. A race condition in JIT, reported by Mozilla, is fixed.
164    
165    21. Minor code refactor to avoid "array subscript is below array bounds"
166        compiler warning.
167    
168    22. Minor code refactor to avoid "left shift of negative number" warning.
169    
170    23. Fix typo causing compile error when 16- or 32-bit JIT is compiled without
171        UCP support.
172    
173    24. Refactor to avoid compiler warnings in pcrecpp.cc.
174    
175    25. Refactor to fix a typo in pcre_jit_test.c.
176    
177    26. Patch to support compiling pcrecpp.cc with Intel compiler.
178    
179    
180    Version 8.38 23-November-2015
181    -----------------------------
182    
183    1.  If a group that contained a recursive back reference also contained a
184        forward reference subroutine call followed by a non-forward-reference
185        subroutine call, for example /.((?2)(?R)\1)()/, pcre_compile() failed to
186        compile correct code, leading to undefined behaviour or an internally
187        detected error. This bug was discovered by the LLVM fuzzer.
188    
189    2.  Quantification of certain items (e.g. atomic back references) could cause
190        incorrect code to be compiled when recursive forward references were
191        involved. For example, in this pattern: /(?1)()((((((\1++))\x85)+)|))/.
192        This bug was discovered by the LLVM fuzzer.
193    
194    3.  A repeated conditional group whose condition was a reference by name caused
195        a buffer overflow if there was more than one group with the given name.
196        This bug was discovered by the LLVM fuzzer.
197    
198    4.  A recursive back reference by name within a group that had the same name as
199        another group caused a buffer overflow. For example:
200        /(?J)(?'d'(?'d'\g{d}))/. This bug was discovered by the LLVM fuzzer.
201    
202    5.  A forward reference by name to a group whose number is the same as the
203        current group, for example in this pattern: /(?|(\k'Pm')|(?'Pm'))/, caused
204        a buffer overflow at compile time. This bug was discovered by the LLVM
205        fuzzer.
206    
207    6.  A lookbehind assertion within a set of mutually recursive subpatterns could
208        provoke a buffer overflow. This bug was discovered by the LLVM fuzzer.
209    
210    7.  Another buffer overflow bug involved duplicate named groups with a
211        reference between their definition, with a group that reset capture
212        numbers, for example: /(?J:(?|(?'R')(\k'R')|((?'R'))))/. This has been
213        fixed by always allowing for more memory, even if not needed. (A proper fix
214        is implemented in PCRE2, but it involves more refactoring.)
215    
216    8.  There was no check for integer overflow in subroutine calls such as (?123).
217    
218    9.  The table entry for \l in EBCDIC environments was incorrect, leading to its
219        being treated as a literal 'l' instead of causing an error.
220    
221    10. There was a buffer overflow if pcre_exec() was called with an ovector of
222        size 1. This bug was found by american fuzzy lop.
223    
224    11. If a non-capturing group containing a conditional group that could match
225        an empty string was repeated, it was not identified as matching an empty
226        string itself. For example: /^(?:(?(1)x|)+)+$()/.
227    
228    12. In an EBCDIC environment, pcretest was mishandling the escape sequences
229        \a and \e in test subject lines.
230    
231    13. In an EBCDIC environment, \a in a pattern was converted to the ASCII
232        instead of the EBCDIC value.
233    
234    14. The handling of \c in an EBCDIC environment has been revised so that it is
235        now compatible with the specification in Perl's perlebcdic page.
236    
237    15. The EBCDIC character 0x41 is a non-breaking space, equivalent to 0xa0 in
238        ASCII/Unicode. This has now been added to the list of characters that are
239        recognized as white space in EBCDIC.
240    
241    16. When PCRE was compiled without UCP support, the use of \p and \P gave an
242        error (correctly) when used outside a class, but did not give an error
243        within a class.
244    
245    17. \h within a class was incorrectly compiled in EBCDIC environments.
246    
247    18. A pattern with an unmatched closing parenthesis that contained a backward
248        assertion which itself contained a forward reference caused buffer
249        overflow. An example pattern is: /(?=di(?<=(?1))|(?=(.))))/.
250    
251    19. JIT should return with error when the compiled pattern requires more stack
252        space than the maximum.
253    
254    20. A possessively repeated conditional group that could match an empty string,
255        for example, /(?(R))*+/, was incorrectly compiled.
256    
257    21. Fix infinite recursion in the JIT compiler when certain patterns such as
258        /(?:|a|){100}x/ are analysed.
259    
260    22. Some patterns with character classes involving [: and \\ were incorrectly
261        compiled and could cause reading from uninitialized memory or an incorrect
262        error diagnosis.
263    
264    23. Pathological patterns containing many nested occurrences of [: caused
265        pcre_compile() to run for a very long time.
266    
267    24. A conditional group with only one branch has an implicit empty alternative
268        branch and must therefore be treated as potentially matching an empty
269        string.
270    
271    25. If (?R was followed by - or + incorrect behaviour happened instead of a
272        diagnostic.
273    
274    26. Arrange to give up on finding the minimum matching length for overly
275        complex patterns.
276    
277    27. Similar to (4) above: in a pattern with duplicated named groups and an
278        occurrence of (?| it is possible for an apparently non-recursive back
279        reference to become recursive if a later named group with the relevant
280        number is encountered. This could lead to a buffer overflow. Wen Guanxing
281        from Venustech ADLAB discovered this bug.
282    
283    28. If pcregrep was given the -q option with -c or -l, or when handling a
284        binary file, it incorrectly wrote output to stdout.
285    
286    29. The JIT compiler did not restore the control verb head in case of *THEN
287        control verbs. This issue was found by Karl Skomski with a custom LLVM
288        fuzzer.
289    
290    30. Error messages for syntax errors following \g and \k were giving inaccurate
291        offsets in the pattern.
292    
293    31. Added a check for integer overflow in conditions (?(<digits>) and
294        (?(R<digits>). This omission was discovered by Karl Skomski with the LLVM
295        fuzzer.
296    
297    32. Handling recursive references such as (?2) when the reference is to a group
298        later in the pattern uses code that is very hacked about and error-prone.
299        It has been re-written for PCRE2. Here in PCRE1, a check has been added to
300        give an internal error if it is obvious that compiling has gone wrong.
301    
302    33. The JIT compiler should not check repeats after a {0,1} repeat byte code.
303        This issue was found by Karl Skomski with a custom LLVM fuzzer.
304    
305    34. The JIT compiler should restore the control chain for empty possessive
306        repeats. This issue was found by Karl Skomski with a custom LLVM fuzzer.
307    
308    35. Match limit check added to JIT recursion. This issue was found by Karl
309        Skomski with a custom LLVM fuzzer.
310    
311    36. Yet another case similar to 27 above has been circumvented by an
312        unconditional allocation of extra memory. This issue is fixed "properly" in
313        PCRE2 by refactoring the way references are handled. Wen Guanxing
314        from Venustech ADLAB discovered this bug.
315    
316    37. Fix two assertion failures in JIT. These issues were found by Karl Skomski
317        with a custom LLVM fuzzer.
318    
319    38. Fixed a corner case of range optimization in JIT.
320    
321    39. An incorrect error "overran compiling workspace" was given if there were
322        exactly enough group forward references such that the last one extended
323        into the workspace safety margin. The next one would have expanded the
324        workspace. The test for overflow was not including the safety margin.
325    
326    40. A match limit issue is fixed in JIT which was found by Karl Skomski
327        with a custom LLVM fuzzer.
328    
329    41. Remove the use of /dev/null in testdata/testinput2, because it doesn't
330        work under Windows. (Why has it taken so long for anyone to notice?)
331    
332    42. In a character class such as [\W\p{Any}] where both a negative-type escape
333        ("not a word character") and a property escape were present, the property
334        escape was being ignored.
335    
336    43. Fix crash caused by very long (*MARK) or (*THEN) names.
337    
338    44. A sequence such as [[:punct:]b] that is, a POSIX character class followed
339        by a single ASCII character in a class item, was incorrectly compiled in
340        UCP mode. The POSIX class got lost, but only if the single character
341        followed it.
342    
343    45. [:punct:] in UCP mode was matching some characters in the range 128-255
344        that should not have been matched.
345    
346    46. If [:^ascii:] or [:^xdigit:] or [:^cntrl:] are present in a non-negated
347        class, all characters with code points greater than 255 are in the class.
348        When a Unicode property was also in the class (if PCRE_UCP is set, escapes
349        such as \w are turned into Unicode properties), wide characters were not
350        correctly handled, and could fail to match.
351    
352    
353    Version 8.37 28-April-2015
354    --------------------------
355    
356    1.  When an (*ACCEPT) is triggered inside capturing parentheses, it arranges
357        for those parentheses to be closed with whatever has been captured so far.
358        However, it was failing to mark any other groups between the highest
359        capture so far and the current group as "unset". Thus, the ovector for
360        those groups contained whatever was previously there. An example is the
361        pattern /(x)|((*ACCEPT))/ when matched against "abcd".
362    
363    2.  If an assertion condition was quantified with a minimum of zero (an odd
364        thing to do, but it happened), SIGSEGV or other misbehaviour could occur.
365    
366    3.  If a pattern in pcretest input had the P (POSIX) modifier followed by an
367        unrecognized modifier, a crash could occur.
368    
369    4.  An attempt to do global matching in pcretest with a zero-length ovector
370        caused a crash.
371    
372    5.  Fixed a memory leak during matching that could occur for a subpattern
373        subroutine call (recursive or otherwise) if the number of captured groups
374        that had to be saved was greater than ten.
375    
376    6.  Catch a bad opcode during auto-possessification after compiling a bad UTF
377        string with NO_UTF_CHECK. This is a tidyup, not a bug fix, as passing bad
378        UTF with NO_UTF_CHECK is documented as having an undefined outcome.
379    
380    7.  A UTF pattern containing a "not" match of a non-ASCII character and a
381        subroutine reference could loop at compile time. Example: /[^\xff]((?1))/.
382    
383    8. When a pattern is compiled, it remembers the highest back reference so that
384       when matching, if the ovector is too small, extra memory can be obtained to
385       use instead. A conditional subpattern whose condition is a check on a
386       capture having happened, such as, for example in the pattern
387       /^(?:(a)|b)(?(1)A|B)/, is another kind of back reference, but it was not
388       setting the highest backreference number. This mattered only if pcre_exec()
389       was called with an ovector that was too small to hold the capture, and there
390       was no other kind of back reference (a situation which is probably quite
391       rare). The effect of the bug was that the condition was always treated as
392       FALSE when the capture could not be consulted, leading to incorrect
393       behaviour by pcre_exec(). This bug has been fixed.
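To make the role of the condition in item 8 concrete, the sketch below runs the same pattern through Python's re module, which also supports (?(1)yes|no) conditionals. It is only an illustration of the intended semantics; it does not reproduce the small-ovector situation in pcre_exec(), and the subject strings are invented.

```python
import re

# The condition (?(1)A|B) asks whether group 1 took part in the match, so it
# depends on capture information just as a back reference does.
pattern = re.compile(r"^(?:(a)|b)(?(1)A|B)")

# Map each hypothetical subject to whether the pattern matches it.
results = {s: bool(pattern.match(s)) for s in ("aA", "bB", "aB", "bA")}
```

If the condition were always treated as FALSE, as in the bug described above, "aA" would wrongly fail and "aB" would wrongly match.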
394    
395    9. A reference to a duplicated named group (either a back reference or a test
396       for being set in a conditional) that occurred in a part of the pattern where
397       PCRE_DUPNAMES was not set caused the amount of memory needed for the pattern
398       to be incorrectly calculated, leading to overwriting.
399    
400    10. A mutually recursive set of back references such as (\2)(\1) caused a
401        segfault at study time (while trying to find the minimum matching length).
402        The infinite loop is now broken (with the minimum length unset, that is,
403        zero).
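One common way to break such a loop is to remember which groups are currently being measured and to treat a repeated visit as length zero. The sketch below is a toy model under that assumption, not the code in pcre_study(): the groups dictionary, its item encoding, and the function name min_length are all hypothetical.

```python
def min_length(group, groups, active=None):
    """Return a lower bound on the length matched by a group.

    groups maps a group number to a list of items; a str item is literal
    text and an int item is a back reference to that group number.
    """
    if active is None:
        active = set()
    if group in active:
        # Mutual recursion detected: give up and report zero ("unset").
        return 0
    active.add(group)
    total = 0
    for item in groups[group]:
        if isinstance(item, int):
            total += min_length(item, groups, active)
        else:
            total += len(item)
    active.discard(group)
    return total


# Toy encoding of (\2)(\1): each group consists only of a reference to the other.
MUTUAL = {1: [2], 2: [1]}
```

Without the active set, the call for group 1 would recurse into group 2, then back into group 1, and so on until the stack is exhausted.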
404    
405    11. If an assertion that was used as a condition was quantified with a minimum
406        of zero, matching went wrong. In particular, if the whole group had
407        unlimited repetition and could match an empty string, a segfault was
408        likely. The pattern (?(?=0)?)+ is an example that caused this. Perl allows
409        assertions to be quantified, but not if they are being used as conditions,
410        so the above pattern is faulted by Perl. PCRE has now been changed so that
411        it also rejects such patterns.
412    
413    12. A possessive capturing group such as (a)*+ with a minimum repeat of zero
414        failed to allow the zero-repeat case if pcre_exec() was called with an
415        ovector too small to capture the group.
416    
417    13. Fixed two bugs in pcretest that were discovered by fuzzing and reported by
418        Red Hat Product Security:
419    
420        (a) A crash if /K and /F were both set with the option to save the compiled
421        pattern.
422    
423        (b) Another crash if the option to print captured substrings in a callout
424        was combined with setting a null ovector, for example \O\C+ as a subject
425        string.
426    
427    14. A pattern such as "((?2){0,1999}())?", which has a group containing a
428        forward reference repeated a large (but limited) number of times within a
429        repeated outer group that has a zero minimum quantifier, caused incorrect
430        code to be compiled, leading to the error "internal error:
431        previously-checked referenced subpattern not found" when an incorrect
432        memory address was read. This bug was reported as "heap overflow",
433        discovered by Kai Lu of Fortinet's FortiGuard Labs and given the CVE number
434        CVE-2015-2325.
435    
436    23. A pattern such as "((?+1)(\1))/" containing a forward reference subroutine
437        call within a group that also contained a recursive back reference caused
438        incorrect code to be compiled. This bug was reported as "heap overflow",
439        discovered by Kai Lu of Fortinet's FortiGuard Labs, and given the CVE
440        number CVE-2015-2326.
441    
442    24. Computing the size of the JIT read-only data in advance has been a source
443        of various issues, and new ones still appear, unfortunately. To fix
444        existing and future issues, size computation is eliminated from the code,
445        and replaced by on-demand memory allocation.
446    
447    25. A pattern such as /(?i)[A-`]/, where characters in the other case are
448        adjacent to the end of the range, and the range contained characters with
449        more than one other case, caused incorrect behaviour when compiled in UTF
450        mode. In that example, the range a-j was left out of the class.
451    
452    26. Fix JIT compilation of conditional blocks whose assertion
453        is converted to (*FAIL), e.g. /(?(?!))/.
454    
455    27. The pattern /(?(?!)^)/ caused references to random memory. This bug was
456        discovered by the LLVM fuzzer.
457    
458    28. The assertion (?!) is optimized to (*FAIL). This was not handled correctly
459        when this assertion was used as a condition, for example (?(?!)a|b). In
460        pcre_exec() it worked by luck; in pcre_dfa_exec() it gave an incorrect
461        error about an unsupported item.
462    
463    29. For some types of pattern, for example /Z*(|d*){216}/, the auto-
464        possessification code could take exponential time to complete. A recursion
465        depth limit of 1000 has been imposed to limit the resources used by this
466        optimization.
467    
468    30. In a pattern such as /(*UTF)[\S\V\H]/, which contains a negated special class
469        such as \S in non-UCP mode, explicit wide characters (> 255) can be ignored
470        because \S ensures they are all in the class. The code for doing this was
471        interacting badly with the code for computing the amount of space needed to
472        compile the pattern, leading to a buffer overflow. This bug was discovered
473        by the LLVM fuzzer.
474    
475    31. A pattern such as /((?2)+)((?1))/ which has mutual recursion nested inside
476        other kinds of group caused stack overflow at compile time. This bug was
477        discovered by the LLVM fuzzer.
478    
479    32. A pattern such as /(?1)(?#?'){8}(a)/ which had a parenthesized comment
480        between a subroutine call and its quantifier was incorrectly compiled,
481        leading to buffer overflow or other errors. This bug was discovered by the
482        LLVM fuzzer.
483    
484    33. The illegal pattern /(?(?<E>.*!.*)?)/ was not being diagnosed as missing an
485        assertion after (?(. The code was failing to check the character after
486        (?(?< for the ! or = that would indicate a lookbehind assertion. This bug
487        was discovered by the LLVM fuzzer.
488    
489    34. A pattern such as /X((?2)()*+){2}+/ which has a possessive quantifier with
490        a fixed maximum following a group that contains a subroutine reference was
491        incorrectly compiled and could trigger buffer overflow. This bug was
492        discovered by the LLVM fuzzer.
493    
494    35. A mutual recursion within a lookbehind assertion such as (?<=((?2))((?1)))
495        caused a stack overflow instead of the diagnosis of a non-fixed length
496        lookbehind assertion. This bug was discovered by the LLVM fuzzer.
497    
498    36. The use of \K in a positive lookbehind assertion in a non-anchored pattern
499        (e.g. /(?<=\Ka)/) could make pcregrep loop.
500    
501    37. There was a similar problem to 36 in pcretest for global matches.
502    
503    38. If a greedy quantified \X was preceded by \C in UTF mode (e.g. \C\X*),
504        and a subsequent item in the pattern caused a non-match, backtracking over
505        the repeated \X did not stop, but carried on past the start of the subject,
506        causing reference to random memory and/or a segfault. There were also some
507        other cases where backtracking after \C could crash. This set of bugs was
508        discovered by the LLVM fuzzer.
509    
510    39. The function for finding the minimum length of a matching string could take
511        a very long time if mutual recursion was present many times in a pattern,
512        for example, /((?2){73}(?2))((?1))/. A better mutual recursion detection
513        method has been implemented. This infelicity was discovered by the LLVM
514        fuzzer.
515    
516    40. Static linking against the PCRE library using the pkg-config module was
517        failing on missing pthread symbols.
518    
519    
520    Version 8.36 26-September-2014
521    ------------------------------
522    
523    1.  Got rid of some compiler warnings in the C++ modules that were shown up by
524        -Wmissing-field-initializers and -Wunused-parameter.
525    
526    2.  The tests for quantifiers being too big (greater than 65535) were being
527        applied after reading the number, and stupidly assuming that integer
528        overflow would give a negative number. The tests are now applied as the
529        numbers are read.
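The idea of testing the limit while the digits are read, rather than afterwards, can be sketched like this. Python integers do not overflow, so this only models the control flow; in C the point is that the value never grows large enough to wrap. The names read_quantifier and MAX_QUANTIFIER are hypothetical, and only the 65535 limit comes from the entry above.

```python
MAX_QUANTIFIER = 65535  # limit quoted in the entry above


def read_quantifier(pattern, pos):
    """Read decimal digits starting at pos; return (value, next_pos).

    The limit is checked after every digit, so the running value can never
    exceed roughly ten times the limit before the error is raised.
    """
    value = 0
    while pos < len(pattern) and pattern[pos] in "0123456789":
        value = value * 10 + int(pattern[pos])
        if value > MAX_QUANTIFIER:
            raise ValueError("number too big in {} quantifier")
        pos += 1
    return value, pos
```

For example, read_quantifier("123}", 0) stops at the closing brace, while a long run of digits is rejected as soon as the running value passes the limit.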
530    
531    3.  Tidy code in pcre_exec.c where two branches that used to be different are
532        now the same.
533    
534    4.  The JIT compiler did not generate match limit checks for certain
535        bracketed expressions with quantifiers. This may lead to exponential
536        backtracking, instead of returning with PCRE_ERROR_MATCHLIMIT. This
537        issue should be resolved now.
538    
539    5.  Fixed an issue that occurs when nested alternatives are optimized
540        with table jumps.
541    
542    6.  Inserted two casts and changed some ints to size_t in the light of some
543        reported 64-bit compiler warnings (Bugzilla 1477).
544    
545    7.  Fixed a bug concerned with zero-minimum possessive groups that could match
546        an empty string, which sometimes were behaving incorrectly in the
547        interpreter (though correctly in the JIT matcher). This pcretest input is
548        an example:
549    
550          '\A(?:[^"]++|"(?:[^"]*+|"")*+")++'
551          NON QUOTED "QUOT""ED" AFTER "NOT MATCHED
552    
553        the interpreter was reporting a match of 'NON QUOTED ' only, whereas the
554        JIT matcher and Perl both matched 'NON QUOTED "QUOT""ED" AFTER '. The test
555        for an empty string was breaking the inner loop and carrying on at a lower
556        level, when possessive repeated groups should always return to a higher
557        level as they have no backtrack points in them. The empty string test now
558        occurs at the outer level.
559    
560    8.  Fixed a bug that was incorrectly auto-possessifying \w+ in the pattern
561        ^\w+(?>\s*)(?<=\w) which caused it not to match "test test".
562    
563    9.  Give a compile-time error for \o{} (as Perl does) and for \x{} (which Perl
564        doesn't).
565    
566    10. Change 8.34/15 introduced a bug that caused the amount of memory needed
567        to hold a pattern to be incorrectly computed (too small) when there were
568        named back references to duplicated names. This could cause "internal
569        error: code overflow" or "double free or corruption" or other memory
570        handling errors.
571    
572    11. When named subpatterns had the same prefixes, back references could be
573        confused. For example, in this pattern:
574    
575          /(?P<Name>a)?(?P<Name2>b)?(?(<Name>)c|d)*l/
576    
577        the reference to 'Name' was incorrectly treated as a reference to a
578        duplicate name.
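One plausible way for names with a shared prefix to be confused is a comparison that looks only at the first few characters. The sketch below illustrates that general pitfall; it is an assumption about the kind of mistake involved, not a reconstruction of the PCRE source, and both function names are hypothetical.

```python
def count_matches_prefix(names, wanted):
    # Flawed comparison: only the first len(wanted) characters are compared,
    # so "Name2" is counted as another occurrence of "Name".
    return sum(1 for name in names if name[:len(wanted)] == wanted)


def count_matches_exact(names, wanted):
    # Correct comparison: the whole name must be equal.
    return sum(1 for name in names if name == wanted)


# Group names taken from the example pattern in this entry.
GROUP_NAMES = ["Name", "Name2"]
```

With the prefix comparison, 'Name' appears to occur twice and so looks like a duplicated name; the exact comparison finds it once.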
579    
580    12. A pattern such as /^s?c/mi8 where the optional character has more than
581        one "other case" was incorrectly compiled such that it would only try to
582        match starting at "c".
583    
584    13. When a pattern starting with \s was studied, VT was not included in the
585        list of possible starting characters; this should have been part of the
586        8.34/18 patch.
587    
588    14. If a character class started [\Qx]... where x is any character, the class
589        was incorrectly terminated at the ].
590    
591    15. If a pattern that started with a caseless match for a character with more
592        than one "other case" was studied, PCRE did not set up the starting code
593        unit bit map for the list of possible characters. Now it does. This is an
594        optimization improvement, not a bug fix.
595    
596    16. The Unicode data tables have been updated to Unicode 7.0.0.
597    
598    17. Fixed a number of memory leaks in pcregrep.
599    
600    18. Avoid a compiler warning (from some compilers) for a function call with
601        a cast that removes "const" from an lvalue by using an intermediate
602        variable (to which the compiler does not object).
603    
604    19. Incorrect code was compiled if a group that contained an internal recursive
605        back reference was optional (had quantifier with a minimum of zero). This
606        example compiled incorrect code: /(((a\2)|(a*)\g<-1>))*/ and other examples
607        caused segmentation faults because of stack overflows at compile time.
608    
609    20. A pattern such as /((?(R)a|(?1)))+/, which contains a recursion within a
610        group that is quantified with an indefinite repeat, caused a compile-time
611        loop which used up all the system stack and provoked a segmentation fault.
612        This was not the same bug as 19 above.
613    
614    21. Add PCRECPP_EXP_DECL declaration to operator<< in pcre_stringpiece.h.
615        Patch by Mike Frysinger.
616    
617    
618    Version 8.35 04-April-2014
619    --------------------------
620    
621  1.  A new flag is set, when property checks are present in an XCLASS.
622      When this flag is not set, PCRE can perform certain optimizations
623      such as studying these XCLASS-es.
624    
625    2.  The auto-possessification of character sets was improved: a normal
626        and an extended character set can be compared now. Furthermore
627        the JIT compiler optimizes more character set checks.
628    
629    3.  Got rid of some compiler warnings for potentially uninitialized variables
630        that show up only when compiled with -O2.
631    
632    4.  A pattern such as (?=ab\K) that uses \K in an assertion can set the start
633        of a match later than the end of the match. The pcretest program was not
634        handling the case sensibly - it was outputting from the start to the next
635        binary zero. It now reports this situation in a message, and outputs the
636        text from the end to the start.
637    
638    5.  Fast forward search is improved in JIT. Instead of the first three
639        characters, any three characters with fixed position can be searched.
640        Search order: first, last, middle.
641    
642    6.  Improve character range checks in JIT. Characters are read by an imprecise
643        function now, which returns with an unknown value if the character code is
644        above a certain threshold (e.g: 256). The only limitation is that the value
645        must be bigger than the threshold as well. This function is useful when
646        the characters above the threshold are handled in the same way.
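
As a rough illustration of the idea in item 6 (not the JIT's actual code), the Python sketch below reads a UTF-8 character exactly only when its code point is below 256. For anything higher it skips the sequence and returns a placeholder that is itself not below the threshold. The names `read_imprecise` and `THRESHOLD` are made up for this example, and the input is assumed to be valid UTF-8.

```python
THRESHOLD = 256  # hypothetical cut-off, matching the "e.g: 256" in the entry


def read_imprecise(data: bytes, pos: int):
    """Return (value, next_pos) for the UTF-8 character starting at pos.

    Values below THRESHOLD are exact. Any character at or above it is
    reported as THRESHOLD, so callers may only rely on exact values
    below the threshold. Assumes data is valid UTF-8.
    """
    lead = data[pos]
    if lead < 0x80:                      # one-byte (ASCII) character
        return lead, pos + 1
    if lead < 0xC4:                      # two bytes, code point 0x80..0xFF
        return ((lead & 0x1F) << 6) | (data[pos + 1] & 0x3F), pos + 2
    # Code point is >= 0x100: skip it without decoding the trailing bytes.
    length = 2 if lead < 0xE0 else 3 if lead < 0xF0 else 4
    return THRESHOLD, pos + length
```

This is only useful when every character at or above the threshold is treated alike, for instance when testing membership in a class that contains no characters above 255.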
647    
648    7.  The macros whose names start with RAWUCHAR are placeholders for a future
649        mode in which only the bottom 21 bits of 32-bit data items are used. To
650        make this more memorable for those maintaining the code, the names have
651        been changed to start with UCHAR21, and an extensive comment has been added
652        to their definition.
653    
654    8.  Add missing (new) files sljitNativeTILEGX.c and sljitNativeTILEGX-encoder.c
655        to the export list in Makefile.am (they were accidentally omitted from the
656        8.34 tarball).
657    
658    9.  The informational output from pcretest used the phrase "starting byte set"
659        which is inappropriate for the 16-bit and 32-bit libraries. As the output
660        for "first char" and "need char" really means "non-UTF-char", I've changed
661        "byte" to "char", and slightly reworded the output. The documentation about
662        these values has also been (I hope) clarified.
663    
664    10. Another JIT related optimization: use table jumps for selecting the correct
665        backtracking path, when more than four alternatives are present inside a
666        bracket.
667    
668    11. Empty match is not possible, when the minimum length is greater than zero,
669        and there is no \K in the pattern. JIT should avoid empty match checks in
670        such cases.
671    
672    12. In a caseless character class with UCP support, when a character with more
673        than one alternative case was not the first character of a range, not all
674        the alternative cases were added to the class. For example, s and \x{17f}
675        are both alternative cases for S: the class [RST] was handled correctly,
676        but [R-T] was not.
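
The sketch below illustrates why item 12 matters. It uses Python's own Unicode case mapping rather than PCRE's tables, and the helper names `other_cases` and `caseless_class` are hypothetical. The point is that every member of a range, not just its first character, has to be expanded to all of its other cases.

```python
def other_cases(ch: str, limit: int = 0x250) -> set:
    """Characters below `limit` that differ from ch but share its uppercase form.

    This is a simplification for illustration; it is not how PCRE stores
    its case-equivalence data.
    """
    target = ch.upper()
    return {chr(c) for c in range(limit)
            if chr(c) != ch and chr(c).upper() == target}


def caseless_class(first: str, last: str) -> set:
    """Expand the range first-last, adding the other cases of EVERY member."""
    members = set()
    for code in range(ord(first), ord(last) + 1):
        members.add(chr(code))
        members |= other_cases(chr(code))
    return members


# 'S' has two other cases: 's' and U+017F (LATIN SMALL LETTER LONG S).
print(sorted(other_cases("S")))
# The range R-T must therefore contain U+017F as well.
print("\u017f" in caseless_class("R", "T"))
```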
677    
678    13. The configure.ac file always checked for pthread support when JIT was
679        enabled. This is not used in Windows, so I have put this test inside a
680        check for the presence of windows.h (which was already tested for).
681    
682    14. Improve pattern prefix search by a simplified Boyer-Moore algorithm in JIT.
683        The algorithm provides a way to skip certain starting offsets, and is usually
684        faster than linear prefix searches.
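
The entry does not say which simplification the JIT uses. As one well-known example of the general idea, here is a sketch of the Boyer-Moore-Horspool variant in Python: it looks at the last byte of the current window and uses a precomputed table to skip starting offsets that cannot match. The function name `horspool_find` is invented for this example.

```python
def horspool_find(haystack: bytes, needle: bytes) -> int:
    """Return the first offset of needle in haystack, or -1 if absent."""
    m = len(needle)
    if m == 0:
        return 0
    # For each byte of the needle except the last, record how far the
    # window may shift when that byte is seen at the window's last position.
    skip = {byte: m - 1 - i for i, byte in enumerate(needle[:-1])}
    pos = 0
    while pos + m <= len(haystack):
        if haystack[pos:pos + m] == needle:
            return pos
        # Bytes that do not occur in the needle allow a full-length shift.
        pos += skip.get(haystack[pos + m - 1], m)
    return -1
```

A linear search advances one offset at a time, whereas this loop can advance by up to the full prefix length per step, which is where the speed-up typically comes from.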
685    
686    15. Change 13 for 8.20 updated RunTest to check for the 'fr' locale as well
687        as for 'fr_FR' and 'french'. For some reason, however, it then used the
688        Windows-specific input and output files, which have 'french' screwed in.
689        So this could never have worked. One of the problems with locales is that
690        they aren't always the same. I have now updated RunTest so that it checks
691        the output of the locale test (test 3) against three different output
692        files, and it allows the test to pass if any one of them matches. With luck
693        this should make the test pass on some versions of Solaris where it was
694        failing. Because of the uncertainty, the script previously did not stop if
695        test 3 failed; it now does. If further versions of a French locale ever
696        come to light, they can now easily be added.
697    
698    16. If --with-pcregrep-bufsize was given a non-integer value such as "50K",
699        there was a message during ./configure, but it did not stop. This now
700        provokes an error. The invalid example in README has been corrected.
701        If a value less than the minimum is given, the minimum value has always
702        been used, but now a warning is given.
703    
704    17. If --enable-bsr-anycrlf was set, the special 16/32-bit test failed. This
705        was a bug in the test system, which is now fixed. Also, the list of various
706        configurations that are tested for each release did not have one with both
707        16/32 bits and --enable-bsr-anycrlf. It now does.
708    
709    18. pcretest was missing "-C bsr" for displaying the \R default setting.
710    
711    19. Little endian PowerPC systems are supported now by the JIT compiler.
712    
713    20. The fast forward newline mechanism could enter an infinite loop on
714        certain invalid UTF-8 input. Although we don't support these cases
715        this issue can be fixed by a performance optimization.
716    
717    21. Change 33 of 8.34 is not sufficient to ensure stack safety because it does
718        not take account of existing stack usage. There is now a new global
719        variable called pcre_stack_guard that can be set to point to an external
720        function to check stack availability. It is called at the start of
721        processing every parenthesized group.
722    
723    22. A typo in the code meant that in ungreedy mode the max/min quantifier
724        behaved like a min-possessive quantifier, and, for example, /a{1,3}b/U did
725        not match "ab".
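
For reference, the expected behaviour can be sketched in Python. Python's `re` module has no equivalent of the /U option, so this example assumes that the explicit lazy form a{1,3}? is an acceptable stand-in for a{1,3} in ungreedy mode. The helper name `lazy_match` is hypothetical.

```python
import re

# Lazy (minimizing) form of the quantifier, standing in for /a{1,3}b/U.
LAZY = re.compile(r"a{1,3}?b")


def lazy_match(subject: str):
    """Return the matched text, or None if the pattern does not match."""
    m = LAZY.match(subject)
    return m.group(0) if m else None


# A lazy quantifier starts with the minimum count but can still take
# more repetitions when the rest of the pattern requires it.
print(lazy_match("ab"))
print(lazy_match("aaab"))
```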
726    
727    23. When UTF was disabled, the JIT program reported some incorrect compile
728        errors. These messages are silenced now.
729    
730    24. Experimental support for ARM-64 and MIPS-64 has been added to the JIT
731        compiler.
732    
733    25. Change all the temporary files used in RunGrepTest to be different to those
734        used by RunTest so that the tests can be run simultaneously, for example by
735        "make -j check".
736    
737    
738  Version 8.34 15-December-2013
739  -----------------------------
