synonym of -m (show memory usage). I have changed it to mean "force study
for every regex", that is, assume /S for every regex. This is similar to -i
and -d etc. It's slightly incompatible, but I'm hoping nobody is still
using it. It makes it easier to run collections of tests with and without
study enabled, and thereby test pcre_study() more easily. All the standard
tests are now run with and without -s (but some patterns can be marked as
"never study" - see 20 below).

15. When (*ACCEPT) was used in a subpattern that was called recursively, the
restoration of the capturing data to the outer values was not happening
correctly.

16. If a recursively called subpattern ended with (*ACCEPT) and matched an
empty string, and PCRE_NOTEMPTY was set, pcre_exec() thought the whole
pattern had matched an empty string, and so incorrectly returned no match.

17. There was optimizing code for the last branch of non-capturing parentheses,
and also for the obeyed branch of a conditional subexpression, which used
tail recursion to cut down on stack usage. Unfortunately, now that there is
the possibility of (*THEN) occurring in these branches, tail recursion is
no longer possible because the return has to be checked for (*THEN). These
two optimizations have therefore been removed.

18. If a pattern containing \R was studied, it was assumed that \R always
matched two bytes, thus causing the minimum subject length to be
incorrectly computed because \R can also match just one byte.

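The one-or-two-byte point in item 18 can be illustrated with a small sketch.
It assumes non-UTF-8 mode with the default meaning of \R, in which \R matches
either CR LF or any single one of LF, VT, FF, CR and NEL (0x85). All names
below are hypothetical and are not taken from the PCRE source.

```python
# Byte sequences that \R can match in non-UTF-8 mode (assumption stated in
# the lead-in): the two-byte CR LF pair, or one of five single bytes.
R_ALTERNATIVES = [b"\r\n", b"\n", b"\x0b", b"\x0c", b"\r", b"\x85"]

# Marker used in the toy pattern lists below to stand for \R.
NEWLINE_SEQ = object()


def min_newline_length(alternatives=R_ALTERNATIVES):
    """Return the fewest bytes that any \\R alternative consumes."""
    return min(len(alt) for alt in alternatives)


def min_subject_length(items):
    """Sum the minimum lengths of a toy pattern.

    `items` is a list whose entries are literal byte strings or the
    NEWLINE_SEQ marker. This is not how pcre_study() represents patterns.
    """
    total = 0
    for item in items:
        if item is NEWLINE_SEQ:
            total += min_newline_length()
        else:
            total += len(item)
    return total
```

For the toy pattern ab\Rc, written as [b"ab", NEWLINE_SEQ, b"c"], this gives a
minimum of 4 bytes, where the old two-byte assumption would have given 5.
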
19. If a pattern containing (*ACCEPT) was studied, the minimum subject length
was incorrectly computed.

20. If /S is present twice on a test pattern in pcretest input, it *disables*
studying, thereby overriding the use of -s on the command line. This is
necessary for one or two tests to keep the output identical in both cases.

21. When (*ACCEPT) was used in an assertion that matched an empty string and
PCRE_NOTEMPTY was set, PCRE applied the non-empty test to the assertion.

22. When an atomic group that contained a capturing parenthesis was
successfully matched, but the branch in which it appeared failed, the
capturing was not being forgotten if a higher numbered group was later
captured. For example, /(?>(a))b|(a)c/ when matching "ac" set capturing
group 1 to "a", when in fact it should be unset. This applied to
multi-branched capturing and non-capturing groups, repeated or not, and also
to positive assertions (capturing in negative assertions is not well defined
in PCRE) and also to nested atomic groups.

23. Add the ++ qualifier feature to pcretest, to show the remainder of the
subject after a captured substring (to make it easier to tell which of a
number of identical substrings has been captured).

24. The way atomic groups are processed by pcre_exec() has been changed so that
if they are repeated, backtracking one repetition now resets captured
values correctly. For example, if ((?>(a+)b)+aabab) is matched against
"aaaabaaabaabab" the value of captured group 2 is now correctly recorded as
"aaa". Previously, it would have been "a". As part of this code
refactoring, the way recursive calls are handled has also been changed.

25. If an assertion condition captured any substrings, they were not passed
back unless some other capturing happened later. For example, if
(?(?=(a))a) was matched against "a", no capturing was returned.

26. When studying a pattern that contained subroutine calls or assertions,
the code for finding the minimum length of a possible match was handling
direct recursions such as (xxx(?1)|yyy) but not mutual recursions (where
group 1 called group 2 while simultaneously a separate group 2 called group
1). A stack overflow occurred in this case. I have fixed this by limiting
the recursion depth to 10.

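The depth limit described in the last item above can be sketched on a toy
pattern tree. This is only an illustration of the idea, not the code in
pcre_study(): the tuple-based node format, the names, and the choice to
return 0 (a safe lower bound) once the limit is reached are all assumptions.

```python
MAX_DEPTH = 10  # the recursion limit named in the entry above


def min_length(node, groups, depth=0):
    """Return a lower bound on the length matched by a toy pattern node.

    Nodes are tuples: ("lit", text), ("seq", [nodes]), ("alt", [nodes]),
    or ("call", group_number). `groups` maps group numbers to nodes.
    """
    kind = node[0]
    if kind == "lit":
        return len(node[1])
    if kind == "seq":
        return sum(min_length(child, groups, depth) for child in node[1])
    if kind == "alt":
        return min(min_length(child, groups, depth) for child in node[1])
    if kind == "call":
        if depth >= MAX_DEPTH:
            # Assumption: stop descending and fall back to 0, which can
            # never overstate the true minimum.
            return 0
        return min_length(groups[node[1]], groups, depth + 1)
    raise ValueError("unknown node kind: %r" % (kind,))


# Direct recursion, modelled on (xxx(?1)|yyy).
DIRECT = {
    1: ("alt", [("seq", [("lit", "xxx"), ("call", 1)]), ("lit", "yyy")]),
}

# Mutual recursion in the spirit of (a(?2)|b)(c(?1)|d), simplified so that
# each group is a literal followed by "call the other group or a literal".
MUTUAL = {
    1: ("seq", [("lit", "a"), ("alt", [("call", 2), ("lit", "b")])]),
    2: ("seq", [("lit", "c"), ("alt", [("call", 1), ("lit", "d")])]),
}
```

Without the depth check, both toy patterns would recurse without end. With
it, min_length(DIRECT[1], DIRECT) is 3 and min_length(MUTUAL[1], MUTUAL) is 2.
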

Version 8.12 15-Jan-2011