77 |
Automatic callouts can be used for tracking the progress of pattern matching. |
Automatic callouts can be used for tracking the progress of pattern matching. |
78 |
The |
The |
79 |
<a href="pcretest.html"><b>pcretest</b></a> |
<a href="pcretest.html"><b>pcretest</b></a> |
80 |
command has an option that sets automatic callouts; when it is used, the output |
program has a pattern qualifier (/C) that sets automatic callouts; when it is |
81 |
indicates how the pattern is matched. This is useful information when you are |
used, the output indicates how the pattern is being matched. This is useful |
82 |
trying to optimize the performance of a particular pattern. |
information when you are trying to optimize the performance of a particular |
83 |
|
pattern. |
84 |
</P> |
</P> |
85 |
<br><a name="SEC3" href="#TOC1">MISSING CALLOUTS</a><br> |
<br><a name="SEC3" href="#TOC1">MISSING CALLOUTS</a><br> |
86 |
<P> |
<P> |
87 |
You should be aware that, because of optimizations in the way PCRE matches |
You should be aware that, because of optimizations in the way PCRE compiles and |
88 |
patterns by default, callouts sometimes do not happen. For example, if the |
matches patterns, callouts sometimes do not happen exactly as you might expect. |
89 |
pattern is |
</P> |
90 |
|
<P> |
91 |
|
At compile time, PCRE "auto-possessifies" repeated items when it knows that |
92 |
|
what follows cannot be part of the repeat. For example, a+[bc] is compiled as |
93 |
|
if it were a++[bc]. The <b>pcretest</b> output when this pattern is anchored and |
94 |
|
then applied with automatic callouts to the string "aaaa" is: |
95 |
|
<pre> |
96 |
|
--->aaaa |
97 |
|
+0 ^ ^ |
98 |
|
+1 ^ a+ |
99 |
|
+3 ^ ^ [bc] |
100 |
|
No match |
101 |
|
</pre> |
102 |
|
This indicates that when matching [bc] fails, there is no backtracking into a+ |
103 |
|
and therefore the callouts that would be taken for the backtracks do not occur. |
104 |
|
You can disable the auto-possessify feature by passing PCRE_NO_AUTO_POSSESS |
105 |
|
to <b>pcre_compile()</b>, or starting the pattern with (*NO_AUTO_POSSESS). If |
106 |
|
this is done in <b>pcretest</b> (using the /O qualifier), the output changes to |
107 |
|
this: |
108 |
|
<pre> |
109 |
|
--->aaaa |
110 |
|
+0 ^ ^ |
111 |
|
+1 ^ a+ |
112 |
|
+3 ^ ^ [bc] |
113 |
|
+3 ^ ^ [bc] |
114 |
|
+3 ^ ^ [bc] |
115 |
|
+3 ^^ [bc] |
116 |
|
No match |
117 |
|
</pre> |
118 |
|
This time, when matching [bc] fails, the matcher backtracks into a+ and tries |
119 |
|
again, repeatedly, until a+ itself fails. |
120 |
|
</P> |
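The difference between the two traces above can be reproduced with a small toy model. The sketch below is not PCRE code and does not use the PCRE API: the function name match_a_plus_bc and its possessive flag are hypothetical, introduced only for illustration. It hard-codes the anchored pattern a+[bc] and counts how often the [bc] step is reached, which is the analogue of the "+3" callout lines in the pcretest output.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Toy model only (an assumption for illustration, not PCRE's matcher):
   applies the anchored pattern a+[bc] to a subject string. */

static int is_b_or_c(char ch)
{
    return ch == 'b' || ch == 'c';
}

/* Returns 1 on match and 0 on no match. *attempts receives the number of
   times the [bc] step was reached. If possessive is non-zero, a+ behaves
   like a++ and never gives back characters. */
static int match_a_plus_bc(const char *subject, int possessive, int *attempts)
{
    size_t len, n = 0;

    assert(subject != NULL && attempts != NULL);
    len = strlen(subject);
    *attempts = 0;

    /* Greedy a+: consume as many 'a' characters as possible. */
    while (n < len && subject[n] == 'a')
        n++;
    if (n == 0)
        return 0;               /* a+ needs at least one 'a' */

    for (;;) {
        (*attempts)++;          /* the [bc] step is reached at offset n */
        if (n < len && is_b_or_c(subject[n]))
            return 1;
        if (possessive || n == 1)
            return 0;           /* nothing (more) to give back */
        n--;                    /* backtrack: give back one 'a', retry */
    }
}
```

Under this model, calling match_a_plus_bc("aaaa", 1, &attempts) reaches the [bc] step once, while match_a_plus_bc("aaaa", 0, &attempts) reaches it four times, in line with the one and four "+3" lines in the two traces.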
121 |
|
<P> |
122 |
|
Other optimizations that provide fast "no match" results also affect callouts. |
123 |
|
For example, if the pattern is |
124 |
<pre> |
<pre> |
125 |
ab(?C4)cd |
ab(?C4)cd |
126 |
</pre> |
</pre> |
144 |
<br><a name="SEC4" href="#TOC1">THE CALLOUT INTERFACE</a><br> |
<br><a name="SEC4" href="#TOC1">THE CALLOUT INTERFACE</a><br> |
145 |
<P> |
<P> |
146 |
During matching, when PCRE reaches a callout point, the external function |
During matching, when PCRE reaches a callout point, the external function |
147 |
defined by <i>pcre_callout</i> or <i>pcre[16|32]_callout</i> is called |
defined by <i>pcre_callout</i> or <i>pcre[16|32]_callout</i> is called (if it is |
148 |
(if it is set). This applies to both normal and DFA matching. The only |
set). This applies to both normal and DFA matching. The only argument to the |
149 |
argument to the callout function is a pointer to a <b>pcre_callout</b> |
callout function is a pointer to a <b>pcre_callout</b> or |
150 |
or <b>pcre[16|32]_callout</b> block. |
<b>pcre[16|32]_callout</b> block. These structures contain the following |
151 |
These structures contain the following fields: |
fields: |
152 |
<pre> |
<pre> |
153 |
int <i>version</i>; |
int <i>version</i>; |
154 |
int <i>callout_number</i>; |
int <i>callout_number</i>; |
277 |
</P> |
</P> |
278 |
<br><a name="SEC7" href="#TOC1">REVISION</a><br> |
<br><a name="SEC7" href="#TOC1">REVISION</a><br> |
279 |
<P> |
<P> |
280 |
Last updated: 03 March 2013 |
Last updated: 12 November 2013 |
281 |
<br> |
<br> |
282 |
Copyright © 1997-2013 University of Cambridge. |
Copyright © 1997-2013 University of Cambridge. |
283 |
<br> |
<br> |