    When this flag is not set, PCRE can perform certain optimizations
    such as studying these XCLASS-es.

2.  The auto-possessification of character sets was improved: a normal
    and an extended character set can be compared now. Furthermore,
    the JIT compiler optimizes more character set checks.

3.  Got rid of some compiler warnings for potentially uninitialized variables
    that show up only when compiled with -O2.

4.  A pattern such as (?=ab\K) that uses \K in an assertion can set the start
    of a match later than the end of the match. The pcretest program was not
    handling the case sensibly - it was outputting from the start to the next
    binary zero. It now reports this situation in a message, and outputs the
    text from the end to the start.

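As an illustration of change 4, here is a minimal sketch of how a test
program might pick the text to display when the start offset lies beyond the
end offset. The helper name extract_match is hypothetical and is not taken
from pcretest; start and end stand in for the first two entries of the
output vector.

```c
#include <stdio.h>

/* Hypothetical helper, not pcretest code. Copies the text lying between the
   two offsets into buf. Returns 1 if the start offset was beyond the end
   offset (so the text runs from end to start), otherwise 0. */
int extract_match(const char *subject, int start, int end,
                  char *buf, size_t size)
{
    int reversed = start > end;
    int from = reversed ? end : start;
    int len = reversed ? start - end : end - start;

    /* %.*s copies at most len characters; snprintf always terminates buf. */
    snprintf(buf, size, "%.*s", len, subject + from);
    return reversed;
}
```

For (?=ab\K) applied to "xaby", the assertion succeeds at offset 1 and \K
moves the start to offset 3 while the match still ends at offset 1, so the
helper would return 1 and copy "ab". The caller is expected to print a
message about the unusual situation when the return value is 1.
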
5.  Fast forward search is improved in JIT. Instead of the first three
    characters, any three characters with fixed position can be searched.
    Search order: first, last, middle.

6.  Improve character range checks in JIT. Characters are read by an imprecise
    function now, which returns with an unknown value if the character code is
    above a certain threshold (e.g. 256). The only limitation is that the value
    must be bigger than the threshold as well. This function is useful when
    the characters above the threshold are handled in the same way.

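The idea in change 6 can be sketched in plain C for UTF-8 input. This is only
an illustration under stated assumptions, not the code the JIT compiler
emits: the name read_char_imprecise and the THRESHOLD value of 255 are
hypothetical, and the input is assumed to be valid UTF-8.

```c
#include <stddef.h>

#define THRESHOLD 255u   /* assumed: code points up to U+00FF decode exactly */

/* Hypothetical sketch. Reads one character from valid UTF-8 and advances *pp.
   Code points up to THRESHOLD are returned exactly. For anything larger the
   continuation bytes are skipped without decoding, and some value greater
   than THRESHOLD is returned. */
unsigned read_char_imprecise(const unsigned char **pp)
{
    const unsigned char *p = *pp;
    unsigned c = *p++;

    if (c >= 0xC0u) {
        if (c < 0xC4u) {
            /* Lead bytes 0xC2 and 0xC3 encode U+0080..U+00FF: decode fully. */
            c = ((c & 0x1Fu) << 6) | (*p++ & 0x3Fu);
        } else {
            /* U+0100 and above: the exact value does not matter. */
            while ((*p & 0xC0u) == 0x80u)
                p++;
            c = THRESHOLD + 1u;
        }
    }
    *pp = p;
    return c;
}
```

A caller that treats every character above the threshold alike (for example,
"not in this class") can test the result against THRESHOLD and never pays for
a full decode of longer sequences.
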
7.  The macros whose names start with RAWUCHAR are placeholders for a future
    mode in which only the bottom 21 bits of 32-bit data items are used. To
    make this more memorable for those maintaining the code, the names have
    been changed to start with UCHAR21, and an extensive comment has been added
    to their definition.

8.  Add missing (new) files sljitNativeTILEGX.c and sljitNativeTILEGX-encoder.c
    to the export list in Makefile.am (they were accidentally omitted from the
    8.34 tarball).

9.  The informational output from pcretest used the phrase "starting byte set"
    which is inappropriate for the 16-bit and 32-bit libraries. As the output
    for "first char" and "need char" really means "non-UTF-char", I've changed
    "byte" to "char", and slightly reworded the output. The documentation about
    these values has also been (I hope) clarified.

10. Another JIT-related optimization: use table jumps for selecting the correct
    backtracking path when more than four alternatives are present inside a
    bracket.

11. An empty match is not possible when the minimum length is greater than
    zero and there is no \K in the pattern. JIT should avoid empty match
    checks in such cases.

12. In a caseless character class with UCP support, when a character with more
    than one alternative case was not the first character of a range, not all
    the alternative cases were added to the class. For example, s and \x{17f}
    are both alternative cases for S: the class [RST] was handled correctly,
    but [R-T] was not.

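The behaviour that change 12 restores can be sketched as follows: every
character of a range, not just the first, has all of its other cases added.
This is an illustrative sketch only. The tiny case_sets table, the helper
names, and the ASCII-only fallback are assumptions; the library itself works
from its Unicode property tables.

```c
#include <stddef.h>

/* Assumed, deliberately tiny table of characters with more than one other
   case. A real implementation would consult the Unicode property data. */
static const unsigned case_sets[][3] = {
    { 'K', 'k', 0x212A },   /* K, k, KELVIN SIGN */
    { 'S', 's', 0x017F }    /* S, s, LATIN SMALL LETTER LONG S */
};

/* Appends c to set unless it is already present; returns the new count. */
static size_t add_unique(unsigned *set, size_t n, size_t max, unsigned c)
{
    size_t i;
    for (i = 0; i < n; i++)
        if (set[i] == c) return n;
    if (n < max) set[n++] = c;
    return n;
}

/* Adds every character of lo..hi together with all of its other cases. */
size_t add_caseless_range(unsigned *set, size_t n, size_t max,
                          unsigned lo, unsigned hi)
{
    unsigned c;
    size_t s, k, j;

    for (c = lo; c <= hi; c++) {
        int multi = 0;
        n = add_unique(set, n, max, c);

        /* Characters in a multi-case set bring in the whole set. */
        for (s = 0; s < sizeof case_sets / sizeof case_sets[0]; s++)
            for (k = 0; k < 3; k++)
                if (case_sets[s][k] == c) {
                    for (j = 0; j < 3; j++)
                        n = add_unique(set, n, max, case_sets[s][j]);
                    multi = 1;
                }

        /* Otherwise fall back to the simple ASCII case flip. */
        if (!multi) {
            if (c >= 'A' && c <= 'Z') n = add_unique(set, n, max, c + 32);
            else if (c >= 'a' && c <= 'z') n = add_unique(set, n, max, c - 32);
        }
    }
    return n;
}
```

With this sketch, the range R-T yields R, r, S, s, \x{17f}, T and t, which is
the same set that listing the characters individually as [RST] would give.
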
13. The configure.ac file always checked for pthread support when JIT was
    enabled. This is not used in Windows, so I have put this test inside a
    check for the presence of windows.h (which was already tested for).

14. Improve pattern prefix search by a simplified Boyer-Moore algorithm in JIT.
    The algorithm provides a way to skip certain starting offsets, and is
    usually faster than linear prefix searches.

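To show how a simplified Boyer-Moore search can skip starting offsets, here
is a sketch of the well-known Horspool variant in C. It is a stand-in chosen
for illustration: the changelog does not say which simplification the JIT
compiler uses, and the function name horspool_search is hypothetical.

```c
#include <stddef.h>
#include <string.h>

/* Boyer-Moore-Horspool search. Returns the offset of the first occurrence of
   pat (length m) in text (length n), or -1 if there is none. */
long horspool_search(const unsigned char *text, size_t n,
                     const unsigned char *pat, size_t m)
{
    size_t skip[256];
    size_t i, pos;

    if (m == 0) return 0;
    if (m > n) return -1;

    /* A byte that does not occur in the prefix allows a skip of m offsets. */
    for (i = 0; i < 256; i++)
        skip[i] = m;
    /* Otherwise skip so that its last occurrence (excluding the final
       position) lines up with the byte just examined. */
    for (i = 0; i + 1 < m; i++)
        skip[pat[i]] = m - 1 - i;

    pos = 0;
    while (pos <= n - m) {
        if (memcmp(text + pos, pat, m) == 0)
            return (long)pos;
        /* The byte under the last prefix position decides the skip. */
        pos += skip[text[pos + m - 1]];
    }
    return -1;
}
```

A linear search would try every starting offset in turn; here a single byte
comparison can rule out up to m offsets at once.
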
15. Change 13 for 8.20 updated RunTest to check for the 'fr' locale as well
    as for 'fr_FR' and 'french'. For some reason, however, it then used the
    Windows-specific input and output files, which have 'french' hard-wired in.
    So this could never have worked. One of the problems with locales is that
    they aren't always the same. I have now updated RunTest so that it checks
    the output of the locale test (test 3) against three different output
    files, and it allows the test to pass if any one of them matches. With luck
    this should make the test pass on some versions of Solaris where it was
    failing. Because of the uncertainty, the script did not previously stop if
    test 3 failed; it now does. If further versions of a French locale ever
    come to light, they can now easily be added.

16. If --with-pcregrep-bufsize was given a non-integer value such as "50K",
    there was a message during ./configure, but it did not stop. This now
    provokes an error. The invalid example in README has been corrected.
    If a value less than the minimum is given, the minimum value has always
    been used, but now a warning is given.

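The rule described in change 16 can be written down as a small C function.
The real check lives in the configure script, so this is only a sketch of the
rule itself: the names check_bufsize and bufsize_status are hypothetical, and
the minimum is passed in as a parameter because its value is not stated here.

```c
#include <errno.h>
#include <stdlib.h>

enum bufsize_status {
    BUFSIZE_OK,             /* value accepted as given */
    BUFSIZE_RAISED_TO_MIN,  /* value too small: minimum used, warn the user */
    BUFSIZE_INVALID         /* not an integer, e.g. "50K": hard error */
};

/* Hypothetical sketch of the validation rule, not the configure code. */
enum bufsize_status check_bufsize(const char *text, long minimum, long *result)
{
    char *end;
    long value;

    errno = 0;
    value = strtol(text, &end, 10);

    /* Reject empty input, trailing non-digits such as "K", and overflow. */
    if (end == text || *end != '\0' || errno == ERANGE)
        return BUFSIZE_INVALID;

    if (value < minimum) {
        *result = minimum;
        return BUFSIZE_RAISED_TO_MIN;
    }
    *result = value;
    return BUFSIZE_OK;
}
```

Keeping the "too small" case separate from the "not an integer" case mirrors
the change: the first still configures successfully with a warning, while the
second now stops the build.
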
17. If --enable-bsr-anycrlf was set, the special 16/32-bit test failed. This
    was a bug in the test system, which is now fixed. Also, the list of various
    configurations that are tested for each release did not have one with both
    16/32 bits and --enable-bsr-anycrlf. It now does.

18. pcretest was missing "-C bsr" for displaying the \R default setting.

19. Little-endian PowerPC systems are supported now by the JIT compiler.

20. The fast forward newline mechanism could enter an infinite loop on
    certain invalid UTF-8 input. Although we don't support these cases,
    this issue can be fixed by a performance optimization.

21. Change 33 of 8.34 is not sufficient to ensure stack safety because it does
    not take account of existing stack usage. There is now a new global
    variable called pcre_stack_guard that can be set to point to an external
    function to check stack availability. It is called at the start of
    processing every parenthesized group.

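A guard function of the kind change 21 has in mind might look like the sketch
below. Everything in it is an assumption made for illustration: the names
stack_guard_init and stack_guard, the idea of measuring usage from the
address of a local variable, and the convention that a non-zero return means
"not enough stack". The block does not include pcre.h, so the assignment to
pcre_stack_guard appears only in a comment.

```c
#include <stddef.h>
#include <stdint.h>

static uintptr_t stack_base;    /* address recorded near the top of the stack */
static size_t stack_budget;     /* assumed limit on stack use, in bytes */

/* Call once, early in the program, to record where the stack starts. */
void stack_guard_init(size_t budget)
{
    char marker;
    stack_base = (uintptr_t)&marker;
    stack_budget = budget;
}

/* Returns 0 while stack use is within the budget, non-zero otherwise.
   The distance is taken in whichever direction the stack grows. */
int stack_guard(void)
{
    char marker;
    uintptr_t here = (uintptr_t)&marker;
    size_t used = (here > stack_base) ? (size_t)(here - stack_base)
                                      : (size_t)(stack_base - here);
    return used > stack_budget;
}

/* Assumed usage with the library, after including pcre.h:
 *
 *     stack_guard_init(512 * 1024);
 *     pcre_stack_guard = stack_guard;
 */
```

Comparing addresses of local variables is a common but not strictly portable
way to estimate stack use, so a production guard would more likely rely on
whatever the platform offers for querying the stack limit.
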

Version 8.34 15-December-2013
-----------------------------