139 |
.SH "PARTIAL MATCHING AND WORD BOUNDARIES" |
.SH "PARTIAL MATCHING AND WORD BOUNDARIES" |
140 |
.rs |
.rs |
141 |
.sp |
.sp |
142 |
If a pattern ends with one of sequences \ew or \eW, which test for word |
If a pattern ends with one of the sequences \eb or \eB, which test for word |
143 |
boundaries, partial matching with PCRE_PARTIAL_SOFT can give counter-intuitive |
boundaries, partial matching with PCRE_PARTIAL_SOFT can give counter-intuitive |
144 |
results. Consider this pattern: |
results. Consider this pattern: |
145 |
.sp |
.sp |
247 |
data> The date is 23ja\eP |
data> The date is 23ja\eP |
248 |
Partial match: 23ja |
Partial match: 23ja |
249 |
.sp |
.sp |
250 |
The this stage, an application could discard the text preceding "23ja", add on |
At this stage, an application could discard the text preceding "23ja", add on |
251 |
text from the next segment, and call \fBpcre_exec()\fP again. Unlike |
text from the next segment, and call \fBpcre_exec()\fP again. Unlike |
252 |
\fBpcre_dfa_exec()\fP, the entire matching string must always be available, and |
\fBpcre_dfa_exec()\fP, the entire matching string must always be available, and |
253 |
the complete matching process occurs for each call, so more memory and more |
the complete matching process occurs for each call, so more memory and more |
337 |
1234|ABCD |
1234|ABCD |
338 |
.sp |
.sp |
339 |
where no string can be a partial match for both alternatives. This is not a |
where no string can be a partial match for both alternatives. This is not a |
340 |
problem if \fPpcre_exec()\fP is used, because the entire match has to be rerun |
problem if \fBpcre_exec()\fP is used, because the entire match has to be rerun |
341 |
each time: |
each time: |
342 |
.sp |
.sp |
343 |
re> /1234|3789/ |
re> /1234|3789/ |
347 |
0: 3789 |
0: 3789 |
348 |
.sp |
.sp |
349 |
Of course, instead of using PCRE_DFA_RESTART, the same technique of re-running |
Of course, instead of using PCRE_DFA_RESTART, the same technique of re-running |
350 |
the entire match can also be used with \fBpcre_dfa_exec()\fP. |
the entire match can also be used with \fBpcre_dfa_exec()\fP. Another |
351 |
|
possibility is to work with two buffers. If a partial match at offset \fIn\fP |
352 |
|
in the first buffer is followed by "no match" when PCRE_DFA_RESTART is used on |
353 |
|
the second buffer, you can then try a new match starting at offset \fIn+1\fP in |
354 |
|
the first buffer. |
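The technique of re-running the entire match on retained text can be illustrated with a small simulation. This is only a sketch under stated assumptions: it does not call PCRE, it handles literal alternatives only, and the helper names earliest_partial and find_in_segments are hypothetical. It imitates the behaviour described for PCRE_PARTIAL_SOFT by preferring a complete match and otherwise keeping the text from the earliest partial match onwards.

```python
def earliest_partial(text, alternatives):
    """Return the earliest offset i such that text[i:] is a non-empty proper
    prefix of some alternative, or None if no such offset exists.
    (Hypothetical helper; stands in for a partial-match result.)"""
    for i in range(len(text)):
        tail = text[i:]
        if any(alt.startswith(tail) and len(tail) < len(alt)
               for alt in alternatives):
            return i
    return None


def find_in_segments(segments, alternatives):
    """Search for any literal alternative across segment boundaries by
    re-running the whole match on the retained text plus each new segment."""
    buffer = ""
    for segment in segments:
        buffer += segment
        # Look for a complete match first: leftmost start position,
        # alternatives tried in the order given.
        for i in range(len(buffer)):
            for alt in alternatives:
                if buffer.startswith(alt, i):
                    return alt
        # No complete match: discard everything before the earliest
        # partial match, or everything if there is none.
        start = earliest_partial(buffer, alternatives)
        buffer = "" if start is None else buffer[start:]
    return None


# Assumed segmentation of the subject for the pattern 1234|3789.
print(find_in_segments(["ABC123", "789"], ["1234", "3789"]))
```

With the assumed segments "ABC123" and "789", the first pass retains "123" as a partial match for the first alternative, and the rerun on "123789" finds 3789. The complete match is found because the second pass starts again from the retained text rather than continuing from a saved matching state.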
355 |
. |
. |
356 |
. |
. |
357 |
.SH AUTHOR |
.SH AUTHOR |
368 |
.rs |
.rs |
369 |
.sp |
.sp |
370 |
.nf |
.nf |
371 |
Last updated: 18 October 2009 |
Last updated: 19 October 2009 |
372 |
Copyright (c) 1997-2009 University of Cambridge. |
Copyright (c) 1997-2009 University of Cambridge. |
373 |
.fi |
.fi |