41 |
.SH "THE STANDARD MATCHING ALGORITHM" |
.SH "THE STANDARD MATCHING ALGORITHM" |
42 |
.rs |
.rs |
43 |
.sp |
.sp |
44 |
In the terminology of Jeffrey Friedl's book \fIMastering Regular |
In the terminology of Jeffrey Friedl's book "Mastering Regular |
45 |
Expressions\fP, the standard algorithm is an "NFA algorithm". It conducts a |
Expressions", the standard algorithm is an "NFA algorithm". It conducts a |
46 |
depth-first search of the pattern tree. That is, it proceeds along a single |
depth-first search of the pattern tree. That is, it proceeds along a single |
47 |
path through the tree, checking that the subject matches what is required. When |
path through the tree, checking that the subject matches what is required. When |
48 |
there is a mismatch, the algorithm tries any alternatives at the current point, |
there is a mismatch, the algorithm tries any alternatives at the current point, |
119 |
4. For the same reason, conditional expressions that use a backreference as the |
4. For the same reason, conditional expressions that use a backreference as the |
120 |
condition or test for a specific group recursion are not supported. |
condition or test for a specific group recursion are not supported. |
121 |
.P |
.P |
122 |
5. Callouts are supported, but the value of the \fIcapture_top\fP field is |
5. Because many paths through the tree may be active, the \eK escape sequence, |
123 |
|
which resets the start of the match when encountered (but may be on some paths |
124 |
|
and not on others), is not supported. It causes an error if encountered. |
125 |
|
.P |
126 |
|
6. Callouts are supported, but the value of the \fIcapture_top\fP field is |
127 |
always 1, and the value of the \fIcapture_last\fP field is always -1. |
always 1, and the value of the \fIcapture_last\fP field is always -1. |
128 |
.P |
.P |
129 |
6. |
7. The \eC escape sequence, which (in the standard algorithm) matches a single |
|
The \eC escape sequence, which (in the standard algorithm) matches a single |
|
130 |
byte, even in UTF-8 mode, is not supported because the alternative algorithm |
byte, even in UTF-8 mode, is not supported because the alternative algorithm |
131 |
moves through the subject string one character at a time, for all active paths |
moves through the subject string one character at a time, for all active paths |
132 |
through the tree. |
through the tree. |
133 |
|
.P |
134 |
|
8. None of the backtracking control verbs such as (*PRUNE) are supported. |
135 |
. |
. |
136 |
.SH "ADVANTAGES OF THE ALTERNATIVE ALGORITHM" |
.SH "ADVANTAGES OF THE ALTERNATIVE ALGORITHM" |
137 |
.rs |
.rs |
166 |
.P |
.P |
167 |
3. Although atomic groups are supported, their use does not provide the |
3. Although atomic groups are supported, their use does not provide the |
168 |
performance advantage that it does for the standard algorithm. |
performance advantage that it does for the standard algorithm. |
169 |
.P |
. |
170 |
.in 0 |
. |
171 |
Last updated: 24 November 2006 |
.SH AUTHOR |
172 |
.br |
.rs |
173 |
Copyright (c) 1997-2006 University of Cambridge. |
.sp |
174 |
|
.nf |
175 |
|
Philip Hazel |
176 |
|
University Computing Service |
177 |
|
Cambridge CB2 3QH, England. |
178 |
|
.fi |
179 |
|
. |
180 |
|
. |
181 |
|
.SH REVISION |
182 |
|
.rs |
183 |
|
.sp |
184 |
|
.nf |
185 |
|
Last updated: 08 August 2007 |
186 |
|
Copyright (c) 1997-2007 University of Cambridge. |
187 |
|
.fi |