25 |
.rs |
.rs |
26 |
.TP 10 |
.TP 10 |
27 |
\fB-b\fP |
\fB-b\fP |
28 |
Behave as if each regex has the \fB/B\fP (show bytecode) modifier; the internal |
Behave as if each regex has the \fB/B\fP (show byte code) modifier; the |
29 |
form is output after compilation. |
internal form is output after compilation. |
30 |
.TP 10 |
.TP 10 |
31 |
\fB-C\fP |
\fB-C\fP |
32 |
Output the version number of the PCRE library, and all available information |
Output the version number of the PCRE library, and all available information |
49 |
Behave as if each regex has the \fB/I\fP modifier; information about the |
Behave as if each regex has the \fB/I\fP modifier; information about the |
50 |
compiled pattern is given after compilation. |
compiled pattern is given after compilation. |
51 |
.TP 10 |
.TP 10 |
52 |
|
\fB-M\fP |
53 |
|
Behave as if each data line contains the \eM escape sequence; this causes |
54 |
|
PCRE to discover the minimum MATCH_LIMIT and MATCH_LIMIT_RECURSION settings by |
55 |
|
calling \fBpcre_exec()\fP repeatedly with different limits. |
56 |
|
.TP 10 |
57 |
\fB-m\fP |
\fB-m\fP |
58 |
Output the size of each compiled pattern after it has been compiled. This is |
Output the size of each compiled pattern after it has been compiled. This is |
59 |
equivalent to adding \fB/M\fP to each regular expression. For compatibility |
equivalent to adding \fB/M\fP to each regular expression. |
|
with earlier versions of pcretest, \fB-s\fP is a synonym for \fB-m\fP. |
|
60 |
.TP 10 |
.TP 10 |
61 |
\fB-o\fP \fIosize\fP |
\fB-o\fP \fIosize\fP |
62 |
Set the number of elements in the output vector that is used when calling |
Set the number of elements in the output vector that is used when calling |
75 |
Do not output the version number of \fBpcretest\fP at the start of execution. |
Do not output the version number of \fBpcretest\fP at the start of execution. |
76 |
.TP 10 |
.TP 10 |
77 |
\fB-S\fP \fIsize\fP |
\fB-S\fP \fIsize\fP |
78 |
On Unix-like systems, set the size of the runtime stack to \fIsize\fP |
On Unix-like systems, set the size of the run-time stack to \fIsize\fP |
79 |
megabytes. |
megabytes. |
80 |
.TP 10 |
.TP 10 |
81 |
|
\fB-s\fP |
82 |
|
Behave as if each regex has the \fB/S\fP modifier; in other words, force each |
83 |
|
regex to be studied. |
84 |
|
.TP 10 |
85 |
\fB-t\fP |
\fB-t\fP |
86 |
Run each compile, study, and match many times with a timer, and output |
Run each compile, study, and match many times with a timer, and output |
87 |
resulting time per compile or match (in milliseconds). Do not set \fB-m\fP with |
resulting time per compile or match (in milliseconds). Do not set \fB-m\fP with |
157 |
A pattern may be followed by any number of modifiers, which are mostly single |
A pattern may be followed by any number of modifiers, which are mostly single |
158 |
characters. Following Perl usage, these are referred to below as, for example, |
characters. Following Perl usage, these are referred to below as, for example, |
159 |
"the \fB/i\fP modifier", even though the delimiter of the pattern need not |
"the \fB/i\fP modifier", even though the delimiter of the pattern need not |
160 |
always be a slash, and no slash is used when writing modifiers. Whitespace may |
always be a slash, and no slash is used when writing modifiers. White space may |
161 |
appear between the final pattern delimiter and the first modifier, and between |
appear between the final pattern delimiter and the first modifier, and between |
162 |
the modifiers themselves. |
the modifiers themselves. |
163 |
.P |
.P |
168 |
.sp |
.sp |
169 |
/caseless/i |
/caseless/i |
170 |
.sp |
.sp |
171 |
The following table shows additional modifiers for setting PCRE options that do |
The following table shows additional modifiers for setting PCRE compile-time |
172 |
not correspond to anything in Perl: |
options that do not correspond to anything in Perl: |
173 |
.sp |
.sp |
174 |
|
\fB/8\fP PCRE_UTF8 |
175 |
|
\fB/?\fP PCRE_NO_UTF8_CHECK |
176 |
\fB/A\fP PCRE_ANCHORED |
\fB/A\fP PCRE_ANCHORED |
177 |
\fB/C\fP PCRE_AUTO_CALLOUT |
\fB/C\fP PCRE_AUTO_CALLOUT |
178 |
\fB/E\fP PCRE_DOLLAR_ENDONLY |
\fB/E\fP PCRE_DOLLAR_ENDONLY |
180 |
\fB/J\fP PCRE_DUPNAMES |
\fB/J\fP PCRE_DUPNAMES |
181 |
\fB/N\fP PCRE_NO_AUTO_CAPTURE |
\fB/N\fP PCRE_NO_AUTO_CAPTURE |
182 |
\fB/U\fP PCRE_UNGREEDY |
\fB/U\fP PCRE_UNGREEDY |
183 |
|
\fB/W\fP PCRE_UCP |
184 |
\fB/X\fP PCRE_EXTRA |
\fB/X\fP PCRE_EXTRA |
185 |
|
\fB/Y\fP PCRE_NO_START_OPTIMIZE |
186 |
|
\fB/<JS>\fP PCRE_JAVASCRIPT_COMPAT |
187 |
\fB/<cr>\fP PCRE_NEWLINE_CR |
\fB/<cr>\fP PCRE_NEWLINE_CR |
188 |
\fB/<lf>\fP PCRE_NEWLINE_LF |
\fB/<lf>\fP PCRE_NEWLINE_LF |
189 |
\fB/<crlf>\fP PCRE_NEWLINE_CRLF |
\fB/<crlf>\fP PCRE_NEWLINE_CRLF |
192 |
\fB/<bsr_anycrlf>\fP PCRE_BSR_ANYCRLF |
\fB/<bsr_anycrlf>\fP PCRE_BSR_ANYCRLF |
193 |
\fB/<bsr_unicode>\fP PCRE_BSR_UNICODE |
\fB/<bsr_unicode>\fP PCRE_BSR_UNICODE |
194 |
.sp |
.sp |
195 |
Those specifying line ending sequences are literal strings as shown, but the |
The modifiers that are enclosed in angle brackets are literal strings as shown, |
196 |
letters can be in either case. This example sets multiline matching with CRLF |
including the angle brackets, but the letters can be in either case. This |
197 |
as the line ending sequence: |
example sets multiline matching with CRLF as the line ending sequence: |
198 |
.sp |
.sp |
199 |
/^abc/m<crlf> |
/^abc/m<crlf> |
200 |
.sp |
.sp |
201 |
Details of the meanings of these PCRE options are given in the |
As well as turning on the PCRE_UTF8 option, the \fB/8\fP modifier also causes |
202 |
|
any non-printing characters in output strings to be printed using the |
203 |
|
\ex{hh...} notation if they are valid UTF-8 sequences. Full details of the PCRE |
204 |
|
options are given in the |
205 |
.\" HREF |
.\" HREF |
206 |
\fBpcreapi\fP |
\fBpcreapi\fP |
207 |
.\" |
.\" |
221 |
begins with a lookbehind assertion (including \eb or \eB). |
begins with a lookbehind assertion (including \eb or \eB). |
222 |
.P |
.P |
223 |
If any call to \fBpcre_exec()\fP in a \fB/g\fP or \fB/G\fP sequence matches an |
If any call to \fBpcre_exec()\fP in a \fB/g\fP or \fB/G\fP sequence matches an |
224 |
empty string, the next call is done with the PCRE_NOTEMPTY and PCRE_ANCHORED |
empty string, the next call is done with the PCRE_NOTEMPTY_ATSTART and |
225 |
flags set in order to search for another, non-empty, match at the same point. |
PCRE_ANCHORED flags set in order to search for another, non-empty, match at the |
226 |
If this second match fails, the start offset is advanced by one, and the normal |
same point. If this second match fails, the start offset is advanced, and the |
227 |
match is retried. This imitates the way Perl handles such cases when using the |
normal match is retried. This imitates the way Perl handles such cases when |
228 |
\fB/g\fP modifier or the \fBsplit()\fP function. |
using the \fB/g\fP modifier or the \fBsplit()\fP function. Normally, the start |
229 |
|
offset is advanced by one character, but if the newline convention recognizes |
230 |
|
CRLF as a newline, and the current character is CR followed by LF, an advance |
231 |
|
of two is used. |
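The advance rule described above can be sketched in Python. This is only an illustration of the rule as stated in the text, not the code that \fBpcretest\fP uses. The function name and its parameters are hypothetical, and the sketch works on bytes, so it treats one byte as one character and ignores multi-byte UTF-8 characters.

```python
def advance_after_empty_match(subject: bytes, offset: int,
                              crlf_is_newline: bool) -> int:
    """Return the next start offset once the anchored, non-empty retry fails.

    Hypothetical helper: the name and signature are assumptions made for
    this sketch and do not come from the pcretest source.
    """
    # Step over CR LF as a unit only when the newline convention
    # recognizes CRLF as a newline.
    if crlf_is_newline and subject[offset:offset + 2] == b"\r\n":
        return offset + 2
    # Otherwise advance by one character (one byte in this simplified sketch).
    return offset + 1
```

For example, with the subject b"a\\r\\nb" and an empty match at offset 1, the next attempt starts at offset 3 when CRLF is a recognized newline, and at offset 2 otherwise.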
232 |
. |
. |
233 |
. |
. |
234 |
.SS "Other modifiers" |
.SS "Other modifiers" |
249 |
use in the automatic test scripts; it ensures that the same output is generated |
use in the automatic test scripts; it ensures that the same output is generated |
250 |
for different internal link sizes. |
for different internal link sizes. |
251 |
.P |
.P |
|
The \fB/L\fP modifier must be followed directly by the name of a locale, for |
|
|
example, |
|
|
.sp |
|
|
/pattern/Lfr_FR |
|
|
.sp |
|
|
For this reason, it must be the last modifier. The given locale is set, |
|
|
\fBpcre_maketables()\fP is called to build a set of character tables for the |
|
|
locale, and this is then passed to \fBpcre_compile()\fP when compiling the |
|
|
regular expression. Without an \fB/L\fP modifier, NULL is passed as the tables |
|
|
pointer; that is, \fB/L\fP applies only to the expression on which it appears. |
|
|
.P |
|
|
The \fB/I\fP modifier requests that \fBpcretest\fP output information about the |
|
|
compiled pattern (whether it is anchored, has a fixed first character, and |
|
|
so on). It does this by calling \fBpcre_fullinfo()\fP after compiling a |
|
|
pattern. If the pattern is studied, the results of that are also output. |
|
|
.P |
|
252 |
The \fB/D\fP modifier is a PCRE debugging feature, and is equivalent to |
The \fB/D\fP modifier is a PCRE debugging feature, and is equivalent to |
253 |
\fB/BI\fP, that is, both the \fB/B\fP and the \fB/I\fP modifiers. |
\fB/BI\fP, that is, both the \fB/B\fP and the \fB/I\fP modifiers. |
254 |
.P |
.P |
260 |
\fB/P\fP pattern modifier is specified. See also the section about saving and |
\fB/P\fP pattern modifier is specified. See also the section about saving and |
261 |
reloading compiled patterns below. |
reloading compiled patterns below. |
262 |
.P |
.P |
263 |
The \fB/S\fP modifier causes \fBpcre_study()\fP to be called after the |
The \fB/I\fP modifier requests that \fBpcretest\fP output information about the |
264 |
expression has been compiled, and the results used when the expression is |
compiled pattern (whether it is anchored, has a fixed first character, and |
265 |
matched. |
so on). It does this by calling \fBpcre_fullinfo()\fP after compiling a |
266 |
|
pattern. If the pattern is studied, the results of that are also output. |
267 |
|
.P |
268 |
|
The \fB/K\fP modifier requests \fBpcretest\fP to show names from backtracking |
269 |
|
control verbs that are returned from calls to \fBpcre_exec()\fP. It causes |
270 |
|
\fBpcretest\fP to create a \fBpcre_extra\fP block if one has not already been |
271 |
|
created by a call to \fBpcre_study()\fP, and to set the PCRE_EXTRA_MARK flag |
272 |
|
and the \fBmark\fP field within it, every time that \fBpcre_exec()\fP is |
273 |
|
called. If the variable that the \fBmark\fP field points to is non-NULL for a |
274 |
|
match, non-match, or partial match, \fBpcretest\fP prints the string to which |
275 |
|
it points. For a match, this is shown on a line by itself, tagged with "MK:". |
276 |
|
For a non-match it is added to the message. |
277 |
|
.P |
278 |
|
The \fB/L\fP modifier must be followed directly by the name of a locale, for |
279 |
|
example, |
280 |
|
.sp |
281 |
|
/pattern/Lfr_FR |
282 |
|
.sp |
283 |
|
For this reason, it must be the last modifier. The given locale is set, |
284 |
|
\fBpcre_maketables()\fP is called to build a set of character tables for the |
285 |
|
locale, and this is then passed to \fBpcre_compile()\fP when compiling the |
286 |
|
regular expression. Without an \fB/L\fP (or \fB/T\fP) modifier, NULL is passed |
287 |
|
as the tables pointer; that is, \fB/L\fP applies only to the expression on |
288 |
|
which it appears. |
289 |
.P |
.P |
290 |
The \fB/M\fP modifier causes the size of the memory block used to hold the compiled |
The \fB/M\fP modifier causes the size of the memory block used to hold the compiled |
291 |
pattern to be output. |
pattern to be output. |
292 |
.P |
.P |
293 |
|
The \fB/S\fP modifier causes \fBpcre_study()\fP to be called after the |
294 |
|
expression has been compiled, and the results used when the expression is |
295 |
|
matched. |
296 |
|
.P |
297 |
|
The \fB/T\fP modifier must be followed by a single digit. It causes a specific |
298 |
|
set of built-in character tables to be passed to \fBpcre_compile()\fP. It is |
299 |
|
used in the standard PCRE tests to check behaviour with different character |
300 |
|
tables. The digit specifies the tables as follows: |
301 |
|
.sp |
302 |
|
0 the default ASCII tables, as distributed in |
303 |
|
pcre_chartables.c.dist |
304 |
|
1 a set of tables defining ISO 8859 characters |
305 |
|
.sp |
306 |
|
In table 1, some characters whose codes are greater than 128 are identified as |
307 |
|
letters, digits, spaces, etc. |
308 |
|
. |
309 |
|
. |
310 |
|
.SS "Using the POSIX wrapper API" |
311 |
|
.rs |
312 |
|
.sp |
313 |
The \fB/P\fP modifier causes \fBpcretest\fP to call PCRE via the POSIX wrapper |
The \fB/P\fP modifier causes \fBpcretest\fP to call PCRE via the POSIX wrapper |
314 |
API rather than its native API. When this is done, all other modifiers except |
API rather than its native API. When \fB/P\fP is set, the following modifiers |
315 |
\fB/i\fP, \fB/m\fP, and \fB/+\fP are ignored. REG_ICASE is set if \fB/i\fP is |
set options for the \fBregcomp()\fP function: |
316 |
present, and REG_NEWLINE is set if \fB/m\fP is present. The wrapper functions |
.sp |
317 |
force PCRE_DOLLAR_ENDONLY always, and PCRE_DOTALL unless REG_NEWLINE is set. |
/i REG_ICASE |
318 |
.P |
/m REG_NEWLINE |
319 |
The \fB/8\fP modifier causes \fBpcretest\fP to call PCRE with the PCRE_UTF8 |
/N REG_NOSUB |
320 |
option set. This turns on support for UTF-8 character handling in PCRE, |
/s REG_DOTALL ) |
321 |
provided that it was compiled with this support enabled. This modifier also |
/U REG_UNGREEDY ) These options are not part of |
322 |
causes any non-printing characters in output strings to be printed using the |
/W REG_UCP ) the POSIX standard |
323 |
\ex{hh...} notation if they are valid UTF-8 sequences. |
/8 REG_UTF8 ) |
324 |
.P |
.sp |
325 |
If the \fB/?\fP modifier is used with \fB/8\fP, it causes \fBpcretest\fP to |
The \fB/+\fP modifier works as described above. All other modifiers are |
326 |
call \fBpcre_compile()\fP with the PCRE_NO_UTF8_CHECK option, to suppress the |
ignored. |
|
checking of the string for UTF-8 validity. |
|
327 |
. |
. |
328 |
. |
. |
329 |
.SH "DATA LINES" |
.SH "DATA LINES" |
330 |
.rs |
.rs |
331 |
.sp |
.sp |
332 |
Before each data line is passed to \fBpcre_exec()\fP, leading and trailing |
Before each data line is passed to \fBpcre_exec()\fP, leading and trailing |
333 |
whitespace is removed, and it is then scanned for \e escapes. Some of these are |
white space is removed, and it is then scanned for \e escapes. Some of these |
334 |
pretty esoteric features, intended for checking out some of the more |
are pretty esoteric features, intended for checking out some of the more |
335 |
complicated features of PCRE. If you are just testing "ordinary" regular |
complicated features of PCRE. If you are just testing "ordinary" regular |
336 |
expressions, you probably don't need any of these. The following escapes are |
expressions, you probably don't need any of these. The following escapes are |
337 |
recognized: |
recognized: |
339 |
\ea alarm (BEL, \ex07) |
\ea alarm (BEL, \ex07) |
340 |
\eb backspace (\ex08) |
\eb backspace (\ex08) |
341 |
\ee escape (\ex1b) |
\ee escape (\ex1b) |
342 |
\ef formfeed (\ex0c) |
\ef form feed (\ex0c) |
343 |
\en newline (\ex0a) |
\en newline (\ex0a) |
344 |
.\" JOIN |
.\" JOIN |
345 |
\eqdd set the PCRE_MATCH_LIMIT limit to dd |
\eqdd set the PCRE_MATCH_LIMIT limit to dd |
348 |
\et tab (\ex09) |
\et tab (\ex09) |
349 |
\ev vertical tab (\ex0b) |
\ev vertical tab (\ex0b) |
350 |
\ennn octal character (up to 3 octal digits) |
\ennn octal character (up to 3 octal digits) |
351 |
\exhh hexadecimal character (up to 2 hex digits) |
always a byte unless > 255 in UTF-8 mode |
352 |
|
\exhh hexadecimal byte (up to 2 hex digits) |
353 |
.\" JOIN |
.\" JOIN |
354 |
\ex{hh...} hexadecimal character, any number of digits |
\ex{hh...} hexadecimal character, any number of digits |
355 |
in UTF-8 mode |
in UTF-8 mode |
396 |
MATCH_LIMIT_RECURSION settings |
MATCH_LIMIT_RECURSION settings |
397 |
.\" JOIN |
.\" JOIN |
398 |
\eN pass the PCRE_NOTEMPTY option to \fBpcre_exec()\fP |
\eN pass the PCRE_NOTEMPTY option to \fBpcre_exec()\fP |
399 |
or \fBpcre_dfa_exec()\fP |
or \fBpcre_dfa_exec()\fP; if used twice, pass the |
400 |
|
PCRE_NOTEMPTY_ATSTART option |
401 |
.\" JOIN |
.\" JOIN |
402 |
\eOdd set the size of the output vector passed to |
\eOdd set the size of the output vector passed to |
403 |
\fBpcre_exec()\fP to dd (any number of digits) |
\fBpcre_exec()\fP to dd (any number of digits) |
404 |
.\" JOIN |
.\" JOIN |
405 |
\eP pass the PCRE_PARTIAL option to \fBpcre_exec()\fP |
\eP pass the PCRE_PARTIAL_SOFT option to \fBpcre_exec()\fP |
406 |
or \fBpcre_dfa_exec()\fP |
or \fBpcre_dfa_exec()\fP; if used twice, pass the |
407 |
|
PCRE_PARTIAL_HARD option |
408 |
.\" JOIN |
.\" JOIN |
409 |
\eQdd set the PCRE_MATCH_LIMIT_RECURSION limit to dd |
\eQdd set the PCRE_MATCH_LIMIT_RECURSION limit to dd |
410 |
(any number of digits) |
(any number of digits) |
411 |
\eR pass the PCRE_DFA_RESTART option to \fBpcre_dfa_exec()\fP |
\eR pass the PCRE_DFA_RESTART option to \fBpcre_dfa_exec()\fP |
412 |
\eS output details of memory get/free calls during matching |
\eS output details of memory get/free calls during matching |
413 |
.\" JOIN |
.\" JOIN |
414 |
|
\eY pass the PCRE_NO_START_OPTIMIZE option to \fBpcre_exec()\fP |
415 |
|
or \fBpcre_dfa_exec()\fP |
416 |
|
.\" JOIN |
417 |
\eZ pass the PCRE_NOTEOL option to \fBpcre_exec()\fP |
\eZ pass the PCRE_NOTEOL option to \fBpcre_exec()\fP |
418 |
or \fBpcre_dfa_exec()\fP |
or \fBpcre_dfa_exec()\fP |
419 |
.\" JOIN |
.\" JOIN |
420 |
\e? pass the PCRE_NO_UTF8_CHECK option to |
\e? pass the PCRE_NO_UTF8_CHECK option to |
421 |
\fBpcre_exec()\fP or \fBpcre_dfa_exec()\fP |
\fBpcre_exec()\fP or \fBpcre_dfa_exec()\fP |
|
\e>dd start the match at offset dd (any number of digits); |
|
422 |
.\" JOIN |
.\" JOIN |
423 |
this sets the \fIstartoffset\fP argument for \fBpcre_exec()\fP |
\e>dd start the match at offset dd (optional "-"; then |
424 |
or \fBpcre_dfa_exec()\fP |
any number of digits); this sets the \fIstartoffset\fP |
425 |
|
argument for \fBpcre_exec()\fP or \fBpcre_dfa_exec()\fP |
426 |
.\" JOIN |
.\" JOIN |
427 |
\e<cr> pass the PCRE_NEWLINE_CR option to \fBpcre_exec()\fP |
\e<cr> pass the PCRE_NEWLINE_CR option to \fBpcre_exec()\fP |
428 |
or \fBpcre_dfa_exec()\fP |
or \fBpcre_dfa_exec()\fP |
439 |
\e<any> pass the PCRE_NEWLINE_ANY option to \fBpcre_exec()\fP |
\e<any> pass the PCRE_NEWLINE_ANY option to \fBpcre_exec()\fP |
440 |
or \fBpcre_dfa_exec()\fP |
or \fBpcre_dfa_exec()\fP |
441 |
.sp |
.sp |
442 |
|
Note that \exhh always specifies one byte, even in UTF-8 mode; this makes it |
443 |
|
possible to construct invalid UTF-8 sequences for testing purposes. On the |
444 |
|
other hand, \ex{hh} is interpreted as a UTF-8 character in UTF-8 mode, |
445 |
|
generating more than one byte if the value is greater than 127. When not in |
446 |
|
UTF-8 mode, it generates one byte for values less than 256, and causes an error |
447 |
|
for greater values. |
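The difference between the two hexadecimal escapes can be sketched in Python as follows. This is an illustration of the behaviour described above, not the \fBpcretest\fP implementation. The function name and its parameters are hypothetical. Python's UTF-8 encoder accepts only valid Unicode scalar values, which may be stricter than what \fBpcretest\fP accepts.

```python
def encode_hex_escape(value: int, braced: bool, utf8_mode: bool) -> bytes:
    """Bytes generated for \\xhh (braced=False) or \\x{hh...} (braced=True).

    Hypothetical helper: the name and signature are assumptions made for
    this sketch and do not come from the pcretest source.
    """
    if not braced:
        # \xhh takes at most two hex digits and is always a single byte,
        # even in UTF-8 mode.
        if not 0 <= value <= 0xFF:
            raise ValueError("\\xhh takes at most two hex digits")
        return bytes([value])
    if utf8_mode:
        # \x{hh...} is a character: values above 127 need more than one byte.
        return chr(value).encode("utf-8")
    if value > 0xFF:
        # Outside UTF-8 mode, values above 255 are an error.
        raise ValueError("\\x{hh...} above 255 is an error outside UTF-8 mode")
    return bytes([value])
```

Under these assumptions, the value e9 yields the single byte e9 for \exe9 in any mode, but the two bytes c3 a9 for \ex{e9} in UTF-8 mode.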
448 |
|
.P |
449 |
The escapes that specify line ending sequences are literal strings, exactly as |
The escapes that specify line ending sequences are literal strings, exactly as |
450 |
shown. No more than one newline setting should be present in any data line. |
shown. No more than one newline setting should be present in any data line. |
451 |
.P |
.P |
471 |
the call of \fBpcre_exec()\fP for the line in which it appears. |
the call of \fBpcre_exec()\fP for the line in which it appears. |
472 |
.P |
.P |
473 |
If the \fB/P\fP modifier was present on the pattern, causing the POSIX wrapper |
If the \fB/P\fP modifier was present on the pattern, causing the POSIX wrapper |
474 |
API to be used, the only option-setting sequences that have any effect are \eB |
API to be used, the only option-setting sequences that have any effect are \eB, |
475 |
and \eZ, causing REG_NOTBOL and REG_NOTEOL, respectively, to be passed to |
\eN, and \eZ, causing REG_NOTBOL, REG_NOTEMPTY, and REG_NOTEOL, respectively, |
476 |
\fBregexec()\fP. |
to be passed to \fBregexec()\fP. |
477 |
.P |
.P |
478 |
The use of \ex{hh...} to represent UTF-8 characters is not dependent on the use |
The use of \ex{hh...} to represent UTF-8 characters is not dependent on the use |
479 |
of the \fB/8\fP modifier on the pattern. It is recognized always. There may be |
of the \fB/8\fP modifier on the pattern. It is recognized always. There may be |
510 |
This section describes the output when the normal matching function, |
This section describes the output when the normal matching function, |
511 |
\fBpcre_exec()\fP, is being used. |
\fBpcre_exec()\fP, is being used. |
512 |
.P |
.P |
513 |
When a match succeeds, pcretest outputs the list of captured substrings that |
When a match succeeds, \fBpcretest\fP outputs the list of captured substrings |
514 |
\fBpcre_exec()\fP returns, starting with number 0 for the string that matched |
that \fBpcre_exec()\fP returns, starting with number 0 for the string that |
515 |
the whole pattern. Otherwise, it outputs "No match" or "Partial match" |
matched the whole pattern. Otherwise, it outputs "No match" when the return is |
516 |
when \fBpcre_exec()\fP returns PCRE_ERROR_NOMATCH or PCRE_ERROR_PARTIAL, |
PCRE_ERROR_NOMATCH, and "Partial match:" followed by the partially matching |
517 |
respectively, and otherwise the PCRE negative error number. Here is an example |
substring when \fBpcre_exec()\fP returns PCRE_ERROR_PARTIAL. (Note that this is |
518 |
of an interactive \fBpcretest\fP run. |
the entire substring that was inspected during the partial match; it may |
519 |
|
include characters before the actual match start if a lookbehind assertion, |
520 |
|
\eK, \eb, or \eB was involved.) For any other return, \fBpcretest\fP outputs |
521 |
|
the PCRE negative error number and a short descriptive phrase. If the error is |
522 |
|
a failed UTF-8 string check, the byte offset of the start of the failing |
523 |
|
character and the reason code are also output, provided that the size of the |
524 |
|
output vector is at least two. Here is an example of an interactive |
525 |
|
\fBpcretest\fP run. |
526 |
.sp |
.sp |
527 |
$ pcretest |
$ pcretest |
528 |
PCRE version 7.0 30-Nov-2006 |
PCRE version 8.13 2011-04-30 |
529 |
.sp |
.sp |
530 |
re> /^abc(\ed+)/ |
re> /^abc(\ed+)/ |
531 |
data> abc123 |
data> abc123 |
534 |
data> xyz |
data> xyz |
535 |
No match |
No match |
536 |
.sp |
.sp |
537 |
Note that unset capturing substrings that are not followed by one that is set |
Unset capturing substrings that are not followed by one that is set are not |
538 |
are not returned by \fBpcre_exec()\fP, and are not shown by \fBpcretest\fP. In |
returned by \fBpcre_exec()\fP, and are not shown by \fBpcretest\fP. In the |
539 |
the following example, there are two capturing substrings, but when the first |
following example, there are two capturing substrings, but when the first data |
540 |
data line is matched, the second, unset substring is not shown. An "internal" |
line is matched, the second, unset substring is not shown. An "internal" unset |
541 |
unset substring is shown as "<unset>", as for the second data line. |
substring is shown as "<unset>", as for the second data line. |
542 |
.sp |
.sp |
543 |
re> /(a)|(b)/ |
re> /(a)|(b)/ |
544 |
data> a |
data> a |
572 |
0: ipp |
0: ipp |
573 |
1: pp |
1: pp |
574 |
.sp |
.sp |
575 |
"No match" is output only if the first match attempt fails. |
"No match" is output only if the first match attempt fails. Here is an example |
576 |
|
of a failure message (the offset 4 that is specified by \e>4 is past the end of |
577 |
|
the subject string): |
578 |
|
.sp |
579 |
|
re> /xyz/ |
580 |
|
data> xyz\e>4 |
581 |
|
Error -24 (bad offset value) |
582 |
.P |
.P |
583 |
If any of the sequences \fB\eC\fP, \fB\eG\fP, or \fB\eL\fP are present in a |
If any of the sequences \fB\eC\fP, \fB\eG\fP, or \fB\eL\fP are present in a |
584 |
data line that is successfully matched, the substrings extracted by the |
data line that is successfully matched, the substrings extracted by the |
609 |
2: tan |
2: tan |
610 |
.sp |
.sp |
611 |
(Using the normal matching function on this data finds only "tang".) The |
(Using the normal matching function on this data finds only "tang".) The |
612 |
longest matching string is always given first (and numbered zero). |
longest matching string is always given first (and numbered zero). After a |
613 |
|
PCRE_ERROR_PARTIAL return, the output is "Partial match:", followed by the |
614 |
|
partially matching substring. (Note that this is the entire substring that was |
615 |
|
inspected during the partial match; it may include characters before the actual |
616 |
|
match start if a lookbehind assertion, \eK, \eb, or \eB was involved.) |
617 |
.P |
.P |
618 |
If \fB/g\fP is present on the pattern, the search for further matches resumes |
If \fB/g\fP is present on the pattern, the search for further matches resumes |
619 |
at the end of the longest match. For example: |
at the end of the longest match. For example: |
715 |
.rs |
.rs |
716 |
.sp |
.sp |
717 |
The facilities described in this section are not available when the POSIX |
The facilities described in this section are not available when the POSIX |
718 |
inteface to PCRE is being used, that is, when the \fB/P\fP pattern modifier is |
interface to PCRE is being used, that is, when the \fB/P\fP pattern modifier is |
719 |
specified. |
specified. |
720 |
.P |
.P |
721 |
When the POSIX interface is not in use, you can cause \fBpcretest\fP to write a |
When the POSIX interface is not in use, you can cause \fBpcretest\fP to write a |
739 |
follows immediately after the compiled pattern. After writing the file, |
follows immediately after the compiled pattern. After writing the file, |
740 |
\fBpcretest\fP expects to read a new pattern. |
\fBpcretest\fP expects to read a new pattern. |
741 |
.P |
.P |
742 |
A saved pattern can be reloaded into \fBpcretest\fP by specifing < and a file |
A saved pattern can be reloaded into \fBpcretest\fP by specifying < and a file |
743 |
name instead of a pattern. The name of the file must not contain a < character, |
name instead of a pattern. The name of the file must not contain a < character, |
744 |
as otherwise \fBpcretest\fP will interpret the line as a pattern delimited by < |
as otherwise \fBpcretest\fP will interpret the line as a pattern delimited by < |
745 |
characters. |
characters. |
792 |
.rs |
.rs |
793 |
.sp |
.sp |
794 |
.nf |
.nf |
795 |
Last updated: 18 December 2007 |
Last updated: 06 June 2011 |
796 |
Copyright (c) 1997-2007 University of Cambridge. |
Copyright (c) 1997-2011 University of Cambridge. |
797 |
.fi |
.fi |