18. If the last data line in a file for pcretest did not have a newline at
the end, a newline was missing in the output.

19. The default pcre_chartables.c file recognizes only ASCII characters (values
less than 128) in its various bitmaps. However, there is a facility for
generating tables according to the current locale when PCRE is compiled. It
turns out that in some environments, 0x85 and 0xa0, which are Unicode space
characters, are recognized by isspace() and therefore were getting set in
these tables. This caused a problem in UTF-8 mode when pcre_study() was
used to create a list of bytes that can start a match. For \s, it was
including 0x85 and 0xa0, which of course cannot start UTF-8 characters. I
have changed the code so that only real ASCII characters (less than 128)
are set in this case because the \s etc. escapes are documented as
recognizing only ASCII characters. (When PCRE_UCP is set - see 9 above -
the code is different altogether.)
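
As a rough illustration of the idea in item 19, here is a minimal sketch. It is not the code in PCRE itself: the names build_space_bitmap, bitmap_has and BITMAP_BYTES are hypothetical, and the 32-byte, one-bit-per-value layout is an assumption made for the example. The ascii_only flag keeps values of 128 or more out of the bitmap even when the locale's isspace() accepts them.

```c
#include <assert.h>
#include <ctype.h>
#include <string.h>

/* Hypothetical 256-bit map, one bit per byte value (assumed layout,
   not taken from PCRE). */
#define BITMAP_BYTES 32

/* Set a bit for every byte value that isspace() accepts in the current
   locale. When ascii_only is nonzero, only values below 128 are examined,
   so bytes such as 0x85 and 0xa0 can never be set. */
static void build_space_bitmap(unsigned char *bitmap, int ascii_only)
{
  int limit = ascii_only ? 128 : 256;
  int c;

  assert(bitmap != NULL);
  memset(bitmap, 0, BITMAP_BYTES);
  for (c = 0; c < limit; c++)
    {
    if (isspace(c)) bitmap[c / 8] |= (unsigned char)(1u << (c % 8));
    }
}

/* Return 1 if byte value c (0..255) is set in the bitmap, else 0. */
static int bitmap_has(const unsigned char *bitmap, int c)
{
  return (bitmap[c / 8] >> (c % 8)) & 1;
}
```

With ascii_only set, 0x85 and 0xa0 are never marked as possible starting bytes for \s, whatever locale is in force when the tables are built.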


Version 8.02 19-Mar-2010