1.  Replaced UCP searching code with optimized version as implemented for Ad
    Muncher (http://www.admuncher.com/) by Peter Kankowski. This uses a two-
    stage table and inline lookup instead of a function, giving speedups of 2
    to 5 times on some simple patterns that I tested. Permission was given to
    distribute the MultiStage2.py script that generates the tables (it's not in
    the tarball, but is in the Subversion repository).

2.  Updated the Unicode data tables to Unicode 5.1.0. This adds yet more
    scripts.

3.  Change 12 for 7.7 introduced a bug in pcre_study() when a pattern contained
    a group with a zero quantifier. The result of the study could be incorrect,
    or the function might crash, depending on the pattern.

4.  Caseless matching was not working for non-ASCII characters in back
    references. For example, /(\x{de})\1/8i was not matching \x{de}\x{fe}.
    It now works when Unicode Property Support is available.

5.  In pcretest, an escape such as \x{de} in the data was always generating
    a UTF-8 string, even in non-UTF-8 mode. Now it generates a single byte in
    non-UTF-8 mode. If the value is greater than 255, it gives a warning about
    truncation.

6.  Minor bugfix in pcrecpp.cc (change "" == ... to NULL == ...).

7.  Added two (int) casts to pcregrep when printing the difference of two
    pointers, in case they are 64-bit values.

8.  Added comments about Mac OS X stack usage to the pcrestack man page and to
    test 2 if it fails.

9.  Added PCRE_CALL_CONVENTION just before the names of all exported functions,
    and a #define of that name to empty if it is not externally set. This is to
    allow users of MSVC to set it if necessary.

10. The PCRE_EXP_DEFN macro which precedes exported functions was missing from
    the convenience functions in the pcre_get.c source file.

11. An option change at the start of a pattern that had top-level alternatives
    could cause overwriting and/or a crash. This command provoked a crash in
    some environments:

      printf "/(?i)[\xc3\xa9\xc3\xbd]|[\xc3\xa9\xc3\xbdA]/8\n" | pcretest

    This potential security problem was recorded as CVE-2008-2371.

12. For a pattern where the match had to start at the beginning or immediately
    after a newline (e.g. /.*anything/ without the DOTALL flag), pcre_exec() and
    pcre_dfa_exec() could read past the end of the passed subject if there was
    no match. To help with detecting such bugs (e.g. with valgrind), I modified
    pcretest so that it places the subject at the end of its malloc-ed buffer.

13. The change to pcretest in 12 above threw up a couple more cases when
    pcre_exec() might read past the end of the data buffer in UTF-8 mode.

14. A similar bug to 7.3/2 existed when the PCRE_FIRSTLINE option was set and
    the data contained the byte 0x85 as part of a UTF-8 character within its
    first line.

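The two-stage table mentioned in change 1 can be illustrated with a minimal
Python sketch. This is not the MultiStage2.py script and not PCRE's actual
table layout: the function names, the block size of 128, and the toy property
values are all assumptions chosen for illustration. The idea shown is that
identical blocks of the flat per-character table are stored only once, and a
lookup becomes two array reads instead of a search.

```python
# Illustrative sketch only: names and block size are assumptions, not PCRE's.
BLOCK_SIZE = 128  # assumed block size; the table length must be a multiple


def build_two_stage(values, block_size=BLOCK_SIZE):
    """Compress a flat per-code-point table into (stage1, stage2).

    stage2 holds each distinct block once; stage1 maps a block number to
    the index of that block within stage2.
    """
    if len(values) % block_size != 0:
        raise ValueError("table length must be a multiple of block_size")
    stage1, stage2, seen = [], [], {}
    for start in range(0, len(values), block_size):
        block = tuple(values[start:start + block_size])
        if block not in seen:
            seen[block] = len(stage2) // block_size
            stage2.extend(block)
        stage1.append(seen[block])
    return stage1, stage2


def lookup(stage1, stage2, code_point, block_size=BLOCK_SIZE):
    """Two array reads, no search: find the block, then the offset in it."""
    block = stage1[code_point // block_size]
    return stage2[block * block_size + code_point % block_size]


# Toy data (hypothetical): 1024 code points, property 1 for 'A'-'Z',
# property 2 for 0x300-0x36F, and 0 everywhere else.
flat = [0] * 1024
for cp in range(ord("A"), ord("Z") + 1):
    flat[cp] = 1
for cp in range(0x300, 0x370):
    flat[cp] = 2
stage1, stage2 = build_two_stage(flat)
```

In this toy table only three distinct blocks exist, so the second stage is
much smaller than the flat table while every lookup still returns the same
value. A larger block size shrinks the first stage but reduces the chance that
blocks are identical.
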
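The behaviour described in change 5 can be sketched in Python as follows. This
is not pcretest's code: the function name is hypothetical, and keeping the low
eight bits on truncation is an assumption, since the entry only says that a
warning about truncation is given.

```python
import warnings


def encode_escape(value, utf8_mode):
    """Encode the value from a \\x{...} data escape as bytes (sketch)."""
    if utf8_mode:
        # UTF-8 mode: emit the multi-byte UTF-8 encoding of the code point.
        return chr(value).encode("utf-8")
    if value > 255:
        # Non-UTF-8 mode: the value cannot fit in one byte, so warn.
        warnings.warn(f"\\x{{{value:x}}} truncated to one byte")
        value &= 0xFF  # assumption: keep only the low eight bits
    return bytes([value])
```

With this sketch, the value 0xde yields the two bytes 0xc3 0x9e in UTF-8 mode
and the single byte 0xde otherwise.
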
Version 7.7 07-May-08