74 
traditional finite state machine (it keeps multiple states active 
traditional finite state machine (it keeps multiple states active 
75 
simultaneously). 
simultaneously). 
76 
.P 
.P 
77 

Although the general principle of this matching algorithm is that it scans the 
78 

subject string only once, without backtracking, there is one exception: when a 
79 

lookaround assertion is encountered, the characters following or preceding the 
80 

current point have to be independently inspected. 
81 

.P 
82 
The scan continues until either the end of the subject is reached, or there are 
The scan continues until either the end of the subject is reached, or there are 
83 
no more unterminated paths. At this point, terminated paths represent the 
no more unterminated paths. At this point, terminated paths represent the 
84 
different matching possibilities (if there are none, the match has failed). 
different matching possibilities (if there are none, the match has failed). 
85 
Thus, if there is more than one possible match, this algorithm finds all of 
Thus, if there is more than one possible match, this algorithm finds all of 
86 
them, and in particular, it finds the longest. In PCRE, there is an option to 
them, and in particular, it finds the longest. There is an option to stop the 
87 
stop the algorithm after the first match (which is necessarily the shortest) 
algorithm after the first match (which is necessarily the shortest) is found. 

has been found. 

88 
.P 
.P 
89 
Note that all the matches that are found start at the same point in the 
Note that all the matches that are found start at the same point in the 
90 
subject. If the pattern 
subject. If the pattern 
96 
character of the subject. The algorithm does not automatically move on to find 
character of the subject. The algorithm does not automatically move on to find 
97 
matches that start at later positions. 
matches that start at later positions. 
98 
.P 
.P 

Although the general principle of this matching algorithm is that it scans the 


subject string only once, without backtracking, there is one exception: when a 


lookbehind assertion is encountered, the preceding characters have to be 


reinspected. 


.P 

99 
There are a number of features of PCRE regular expressions that are not 
There are a number of features of PCRE regular expressions that are not 
100 
supported by the alternative matching algorithm. They are as follows: 
supported by the alternative matching algorithm. They are as follows: 
101 
.P 
.P 
151 
2. Because the alternative algorithm scans the subject string just once, and 
2. Because the alternative algorithm scans the subject string just once, and 
152 
never needs to backtrack, it is possible to pass very long subject strings to 
never needs to backtrack, it is possible to pass very long subject strings to 
153 
the matching function in several pieces, checking for partial matching each 
the matching function in several pieces, checking for partial matching each 
154 
time. 
time. The 
155 

.\" HREF 
156 

\fBpcrepartial\fP 
157 

.\" 
158 

documentation gives details of partial matching. 
159 

. 
160 
. 
. 
161 
.SH "DISADVANTAGES OF THE ALTERNATIVE ALGORITHM" 
.SH "DISADVANTAGES OF THE ALTERNATIVE ALGORITHM" 
162 
.rs 
.rs 
187 
.rs 
.rs 
188 
.sp 
.sp 
189 
.nf 
.nf 
190 
Last updated: 05 September 2009 
Last updated: 29 September 2009 
191 
Copyright (c) 19972009 University of Cambridge. 
Copyright (c) 19972009 University of Cambridge. 
192 
.fi 
.fi 