41 |
.SH "THE STANDARD MATCHING ALGORITHM" |
.SH "THE STANDARD MATCHING ALGORITHM" |
42 |
.rs |
.rs |
43 |
.sp |
.sp |
44 |
In the terminology of Jeffrey Friedl's book \fIMastering Regular |
In the terminology of Jeffrey Friedl's book "Mastering Regular |
45 |
Expressions\fP, the standard algorithm is an "NFA algorithm". It conducts a |
Expressions", the standard algorithm is an "NFA algorithm". It conducts a |
46 |
depth-first search of the pattern tree. That is, it proceeds along a single |
depth-first search of the pattern tree. That is, it proceeds along a single |
47 |
path through the tree, checking that the subject matches what is required. When |
path through the tree, checking that the subject matches what is required. When |
48 |
there is a mismatch, the algorithm tries any alternatives at the current point, |
there is a mismatch, the algorithm tries any alternatives at the current point, |
119 |
4. For the same reason, conditional expressions that use a backreference as the |
4. For the same reason, conditional expressions that use a backreference as the |
120 |
condition or test for a specific group recursion are not supported. |
condition or test for a specific group recursion are not supported. |
121 |
.P |
.P |
122 |
5. Callouts are supported, but the value of the \fIcapture_top\fP field is |
5. Because many paths through the tree may be active, the \eK escape sequence, |
123 |
|
which resets the start of the match when encountered (but may be on some paths |
124 |
|
and not on others), is not supported. It causes an error if encountered. |
125 |
|
.P |
126 |
|
6. Callouts are supported, but the value of the \fIcapture_top\fP field is |
127 |
always 1, and the value of the \fIcapture_last\fP field is always -1. |
always 1, and the value of the \fIcapture_last\fP field is always -1. |
128 |
.P |
.P |
129 |
6. |
7. The \eC escape sequence, which (in the standard algorithm) matches a single |
|
The \eC escape sequence, which (in the standard algorithm) matches a single |
|
130 |
byte, even in UTF-8 mode, is not supported because the alternative algorithm |
byte, even in UTF-8 mode, is not supported because the alternative algorithm |
131 |
moves through the subject string one character at a time, for all active paths |
moves through the subject string one character at a time, for all active paths |
132 |
through the tree. |
through the tree. |
133 |
|
.P |
134 |
|
8. Except for (*FAIL), the backtracking control verbs such as (*PRUNE) are not |
135 |
|
supported. (*FAIL) is supported, and behaves like a failing negative assertion. |
136 |
. |
. |
137 |
.SH "ADVANTAGES OF THE ALTERNATIVE ALGORITHM" |
.SH "ADVANTAGES OF THE ALTERNATIVE ALGORITHM" |
138 |
.rs |
.rs |
183 |
.rs |
.rs |
184 |
.sp |
.sp |
185 |
.nf |
.nf |
186 |
Last updated: 06 March 2007 |
Last updated: 19 April 2008 |
187 |
Copyright (c) 1997-2007 University of Cambridge. |
Copyright (c) 1997-2008 University of Cambridge. |
188 |
.fi |
.fi |