13. In the POSIX wrapper regcomp() function, setting re_nsub field in the preg
    structure could go wrong in environments where size_t is not the same size
    as int.
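
    A sketch of how a size mismatch of this kind can corrupt a field, assuming
    (the entry does not say so) that an int-sized value was stored through a
    pointer to the wider size_t field. The byte layout is simulated with
    Python's struct module; the 8-byte little-endian size_t and the helper
    names are illustrative assumptions, not PCRE code.

```python
import struct

SIZE_T = "<Q"   # assumed: 8-byte little-endian size_t (typical LP64 system)
INT = "<i"      # assumed: 4-byte little-endian int


def store_int_through_size_t_slot(count):
    """Write only sizeof(int) bytes into an uninitialised size_t slot."""
    slot = bytearray(b"\xff" * struct.calcsize(SIZE_T))  # stale memory
    struct.pack_into(INT, slot, 0, count)                # 4 of 8 bytes written
    return struct.unpack_from(SIZE_T, slot, 0)[0]


def store_via_local_int(count):
    """Read the count into an int first, then assign it to the size_t field."""
    local = struct.unpack(INT, struct.pack(INT, count))[0]
    slot = bytearray(b"\xff" * struct.calcsize(SIZE_T))
    struct.pack_into(SIZE_T, slot, 0, local)             # all 8 bytes written
    return struct.unpack_from(SIZE_T, slot, 0)[0]


if __name__ == "__main__":
    print(store_int_through_size_t_slot(3) == 3)  # False: upper bytes are stale
    print(store_via_local_int(3) == 3)            # True
```

    Going through a local int and then assigning the whole field is one way to
    avoid the problem; whether PCRE's fix took exactly this form is not stated
    in the entry.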

14. Applied user-supplied patch to pcrecpp.cc to allow PCRE_NO_UTF8_CHECK to be
    set.

15. The EBCDIC support had decayed; later updates to the code had included
    explicit references to (e.g.) \x0a instead of CHAR_LF. There has been a
    general tidy up of EBCDIC-related issues, and the documentation was also
    not quite right. There is now a test that can be run on ASCII systems to
    check some of the EBCDIC-related things (but it is not a full test).
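
    Why a literal such as \x0a is not portable can be seen by encoding a line
    feed in an EBCDIC code page. The sketch below uses Python's cp037 codec as
    a stand-in for an EBCDIC environment; this is an assumption, since PCRE's
    EBCDIC builds need not use this exact code page, and is_line_feed is a
    made-up helper for illustration.

```python
ASCII_LF = "\n".encode("ascii")   # the value a hard-coded \x0a assumes
EBCDIC_LF = "\n".encode("cp037")  # the same character in EBCDIC code page 037


def is_line_feed(byte, encoding):
    """Compare against the symbolic character, not a hard-coded byte value."""
    return byte == "\n".encode(encoding)


if __name__ == "__main__":
    print(ASCII_LF.hex(), EBCDIC_LF.hex())
    print(is_line_feed(b"\x0a", "cp037"))  # False: 0x0a is not LF in cp037
```

    A symbolic constant such as CHAR_LF plays the role of "\n".encode(...)
    here: its value follows the character set in use.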

16. The new PCRE_STUDY_EXTRA_NEEDED option is now used by pcregrep, resulting
    in a small tidy to the code.

17. Fix JIT tests when UTF is disabled and both 8 and 16 bit mode are enabled.

18. If the --only-matching (-o) option in pcregrep is specified multiple
    times, each one causes appropriate output. For example, -o1 -o2 outputs the
    substrings matched by the 1st and 2nd capturing parentheses. A separating
    string can be specified by --om-separator (default empty).
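
    The described output can be approximated as follows. This is a sketch using
    Python's re module, not pcregrep's implementation; the function name
    only_matching and its parameters are made up for illustration, and a group
    that did not take part in the match is assumed to contribute an empty
    string.

```python
import re


def only_matching(pattern, lines, groups, separator=""):
    """Emulate -o1 -o2 ...: one output line per match, listing the requested
    capture groups in order with `separator` between them."""
    regex = re.compile(pattern)
    output = []
    for line in lines:
        for match in regex.finditer(line):
            parts = [match.group(g) or "" for g in groups]
            output.append(separator.join(parts))
    return output


if __name__ == "__main__":
    text = ["host=alpha port=80", "no pairs here"]
    # Roughly: pcregrep -o1 -o2 --om-separator=: '(\w+)=(\w+)'
    for row in only_matching(r"(\w+)=(\w+)", text, [1, 2], separator=":"):
        print(row)
```

    With the default empty separator the two substrings are simply run
    together, which is why a separator is usually worth setting.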

19. Improved the first n character searches.

20. Turn case lists for horizontal and vertical white space into macros so that
    they are defined only once.

21. This set of changes together gives more compatible Unicode case-folding
    behaviour for characters that have more than one other case when UCP
    support is available.

    (a) The Unicode property table now has offsets into a new table of sets of
        three or more characters that are case-equivalent. The MultiStage2.py
        script that generates these tables (the pcre_ucd.c file) now scans
        CaseFolding.txt instead of UnicodeData.txt for character case
        information.

    (b) The code for adding characters or ranges of characters to a character
        class has been abstracted into a generalized function that also handles
        case-independence. In UTF-mode with UCP support, this uses the new data
        to handle characters with more than one other case.

    (c) A bug that is fixed as a result of (b) is that codepoints less than 256
        whose other case is greater than 256 are now correctly matched
        caselessly. Previously, the high codepoint matched the low one, but not
        vice versa.

    (d) The processing of \h, \H, \v, and \V in character classes now makes use
        of the new class addition function, using character lists defined as
        macros alongside the case definitions of 20 above.

    (e) Caseless back references now work with characters that have more than
        one other case.

    (f) General caseless matching of characters with more than one other case
        is supported.
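
    To make the idea of sets of three or more case-equivalent characters
    concrete, the sketch below groups code points by their case-folded form. It
    relies on Python's str.casefold() rather than on CaseFolding.txt or
    MultiStage2.py, skips characters whose folding is longer than one
    character, and scans only the Basic Multilingual Plane. These are
    simplifying assumptions, and the resulting sets need not coincide exactly
    with PCRE's table; the function names are illustrative.

```python
from collections import defaultdict


def multi_case_sets(limit=0x10000):
    """Return {folded_char: set_of_chars} for sets with three or more members."""
    groups = defaultdict(set)
    for cp in range(limit):
        ch = chr(cp)
        folded = ch.casefold()
        if len(folded) == 1:          # ignore multi-character foldings
            groups[folded].add(ch)
    return {k: v for k, v in groups.items() if len(v) >= 3}


def caseless_equal(a, b):
    """Two single characters match caselessly if they fold to the same form."""
    return a.casefold() == b.casefold()


if __name__ == "__main__":
    sets = multi_case_sets()
    print(sorted(hex(ord(c)) for c in sets["k"]))
    # Low code point 'k' and high code point KELVIN SIGN match both ways.
    print(caseless_equal("k", "\u212a"), caseless_equal("\u212a", "k"))
```

    The 'k' set (k, K, and U+212A KELVIN SIGN) is an instance of the case in
    (c): a code point below 256 with another case above 256, where matching
    has to hold in both directions.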

22. Unicode character properties were updated from Unicode 6.2.0.

23. Improved CMake support under Windows. Patch by Daniel Richard G.


Version 8.31 06-July-2012