generating tables according to the current locale when PCRE is compiled. It
turns out that in some environments, 0x85 and 0xa0, which are Unicode space
characters, are recognized by isspace() and therefore were getting set in
these tables, and indeed these tables seem to approximate to ISO 8859. This
caused a problem in UTF-8 mode when pcre_study() was used to create a list
of bytes that can start a match. For \s, it was including 0x85 and 0xa0,
which of course cannot start UTF-8 characters. I have changed the code so
that only real ASCII characters (less than 128) and the correct starting
bytes for UTF-8 encodings are set for characters greater than 127 when in
UTF-8 mode. (When PCRE_UCP is set - see 9 above - the code is different
altogether.)

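As a rough illustration of the rule described in 19, the sketch below builds
a start-byte map from a 256-bit character-class bitmap. It is not the actual
PCRE code: the helper names (set_bit, test_bit, utf8_lead_byte,
build_start_bits) are hypothetical, and it assumes the class bitmap covers
only characters 0 to 255. In UTF-8 mode a class member below 128 is recorded
as itself, while a member of 128 or more is recorded as the first byte of its
two-byte UTF-8 encoding.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helpers for illustration; these are not PCRE's own names. */

/* Set bit c in a 256-bit (32-byte) map. */
static void set_bit(unsigned char *map, unsigned int c)
{
    map[c / 8] |= (unsigned char)(1u << (c % 8));
}

/* Return non-zero if bit c is set in a 256-bit map. */
static int test_bit(const unsigned char *map, unsigned int c)
{
    return (map[c / 8] >> (c % 8)) & 1;
}

/* First byte of the UTF-8 encoding of a code point in 0x80..0xFF.
   These code points encode as two bytes: 110xxxxx 10xxxxxx. */
static unsigned char utf8_lead_byte(unsigned int c)
{
    assert(c >= 0x80 && c <= 0xFF);
    return (unsigned char)(0xC0 | (c >> 6));   /* 0xC2 or 0xC3 */
}

/* Build a start-byte map from a class bitmap (one bit per character
   0..255, for example the "space" bits from the character tables). */
static void build_start_bits(const unsigned char *class_bits,
                             unsigned char *start_bits, int utf8)
{
    unsigned int c;
    memset(start_bits, 0, 32);
    for (c = 0; c < 256; c++) {
        if (!test_bit(class_bits, c)) continue;
        if (!utf8 || c < 128)
            set_bit(start_bits, c);                 /* byte stands for itself */
        else
            set_bit(start_bits, utf8_lead_byte(c)); /* lead byte only */
    }
}
```

With 0x85 and 0xa0 present in the class bitmap, this sketch marks 0xc2 as a
possible starting byte in UTF-8 mode and leaves 0x85 and 0xa0 themselves
unmarked, since they can only appear as continuation bytes.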
20. Added the /T option to pcretest so as to be able to run tests with non-
standard character tables, thus making it possible to include the tests
used for 19 above in the standard set of tests.


Version 8.02 19-Mar-2010