.SH "THE STANDARD MATCHING ALGORITHM" 
.rs 
.sp 
In the terminology of Jeffrey Friedl's book "Mastering Regular
Expressions", the standard algorithm is an "NFA algorithm". It conducts a
depth-first search of the pattern tree. That is, it proceeds along a single
path through the tree, checking that the subject matches what is required. When 
there is a mismatch, the algorithm tries any alternatives at the current point, 
4. For the same reason, conditional expressions that use a backreference as the 
condition or test for a specific group recursion are not supported. 
.P 
5. Because many paths through the tree may be active, the \eK escape sequence,
which resets the start of the match when encountered (but may be on some paths
and not on others), is not supported. It causes an error if encountered.
.P
6. Callouts are supported, but the value of the \fIcapture_top\fP field is
always 1, and the value of the \fIcapture_last\fP field is always -1.
.P 
7. The \eC escape sequence, which (in the standard algorithm) matches a single
byte, even in UTF-8 mode, is not supported because the alternative algorithm
moves through the subject string one character at a time, for all active paths 
through the tree. 
.P 
8. Except for (*FAIL), the backtracking control verbs such as (*PRUNE) are not 
supported. (*FAIL) is supported, and behaves like a failing negative assertion. 
. 
.SH "ADVANTAGES OF THE ALTERNATIVE ALGORITHM" 
.rs 
match using the standard algorithm, you have to do kludgy things with 
callouts. 
.P 
2. Because the alternative algorithm scans the subject string just once, and
never needs to backtrack, it is possible to pass very long subject strings to 
the matching function in several pieces, checking for partial matching each 
time. 
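.P
The piecewise scanning described above can be illustrated with a toy sketch.
It is not PCRE code and does not call the PCRE API. The state table for the
anchored pattern "ab*c" and the names \fIfeed\fP and \fImatch_in_pieces\fP are
hypothetical, chosen only for illustration. The sketch keeps a set of active
states, examines each subject character once, and carries the state set from
one piece of the subject to the next, so nothing is ever rescanned.
.sp
.nf
```python
# Toy illustration only: a hand-built state machine for the pattern "ab*c",
# anchored at the start of the subject. This is an assumption-laden sketch,
# not the PCRE implementation.
TRANSITIONS = {
    (0, "a"): {1},
    (1, "b"): {1},
    (1, "c"): {2},
}
ACCEPTING = {2}


def feed(active, piece):
    """Advance every active state over one piece of the subject.

    Returns (status, active), where status is "match", "partial" or
    "nomatch". Each character is examined exactly once.
    """
    for ch in piece:
        next_active = set()
        for state in active:
            next_active |= TRANSITIONS.get((state, ch), set())
        active = next_active
        if active & ACCEPTING:
            return "match", active
        if not active:
            return "nomatch", active
    return "partial", active


def match_in_pieces(pieces):
    """Match a subject supplied in several pieces, keeping the state set."""
    active = {0}
    status = "partial"
    for piece in pieces:
        status, active = feed(active, piece)
        if status != "partial":
            break
    return status
```
.fi
.P
In this sketch a "partial" result after the last available piece means that
the subject so far is consistent with the pattern but more data is needed to
decide.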
.rs 
.sp 
.nf 
Last updated: 25 August 2009 
Copyright (c) 1997-2009 University of Cambridge.
.fi 