Diff of /code/trunk/doc/html/pcretest.html

revision 75 by nigel, Sat Feb 24 21:40:37 2007 UTC revision 87 by nigel, Sat Feb 24 21:41:21 2007 UTC
# Line 18  man page, in case the conversion went wr Line 18  man page, in case the conversion went wr
18  <li><a name="TOC3" href="#SEC3">DESCRIPTION</a>  <li><a name="TOC3" href="#SEC3">DESCRIPTION</a>
19  <li><a name="TOC4" href="#SEC4">PATTERN MODIFIERS</a>  <li><a name="TOC4" href="#SEC4">PATTERN MODIFIERS</a>
20  <li><a name="TOC5" href="#SEC5">DATA LINES</a>  <li><a name="TOC5" href="#SEC5">DATA LINES</a>
21  <li><a name="TOC6" href="#SEC6">OUTPUT FROM PCRETEST</a>  <li><a name="TOC6" href="#SEC6">THE ALTERNATIVE MATCHING FUNCTION</a>
22  <li><a name="TOC7" href="#SEC7">CALLOUTS</a>  <li><a name="TOC7" href="#SEC7">DEFAULT OUTPUT FROM PCRETEST</a>
23  <li><a name="TOC8" href="#SEC8">SAVING AND RELOADING COMPILED PATTERNS</a>  <li><a name="TOC8" href="#SEC8">OUTPUT FROM THE ALTERNATIVE MATCHING FUNCTION</a>
24  <li><a name="TOC9" href="#SEC9">AUTHOR</a>  <li><a name="TOC9" href="#SEC9">RESTARTING AFTER A PARTIAL MATCH</a>
25    <li><a name="TOC10" href="#SEC10">CALLOUTS</a>
26    <li><a name="TOC11" href="#SEC11">SAVING AND RELOADING COMPILED PATTERNS</a>
27    <li><a name="TOC12" href="#SEC12">AUTHOR</a>
28  </ul>  </ul>
29  <br><a name="SEC1" href="#TOC1">SYNOPSIS</a><br>  <br><a name="SEC1" href="#TOC1">SYNOPSIS</a><br>
30  <P>  <P>
31  <b>pcretest [-C] [-d] [-i] [-m] [-o osize] [-p] [-t] [source]</b>  <b>pcretest [-C] [-d] [-dfa] [-i] [-m] [-o osize] [-p] [-t] [source]</b>
32  <b>[destination]</b>  <b>[destination]</b>
33  </P>  </P>
34  <P>  <P>
# Line 47  about the optional features that are inc Line 50  about the optional features that are inc
50  </P>  </P>
51  <P>  <P>
52  <b>-d</b>  <b>-d</b>
53  Behave as if each regex had the <b>/D</b> (debug) modifier; the internal  Behave as if each regex has the <b>/D</b> (debug) modifier; the internal
54  form is output after compilation.  form is output after compilation.
55  </P>  </P>
56  <P>  <P>
57    <b>-dfa</b>
58    Behave as if each data line contains the \D escape sequence; this causes the
59    alternative matching function, <b>pcre_dfa_exec()</b>, to be used instead of the
60    standard <b>pcre_exec()</b> function (more detail is given below).
61    </P>
62    <P>
63  <b>-i</b>  <b>-i</b>
64  Behave as if each regex had the <b>/I</b> modifier; information about the  Behave as if each regex has the <b>/I</b> modifier; information about the
65  compiled pattern is given after compilation.  compiled pattern is given after compilation.
66  </P>  </P>
67  <P>  <P>
# Line 70  matching calls by including \O in the da Line 79  matching calls by including \O in the da
79  </P>  </P>
80  <P>  <P>
81  <b>-p</b>  <b>-p</b>
82  Behave as if each regex has <b>/P</b> modifier; the POSIX wrapper API is used  Behave as if each regex has the <b>/P</b> modifier; the POSIX wrapper API is
83  to call PCRE. None of the other options has any effect when <b>-p</b> is set.  used to call PCRE. None of the other options has any effect when <b>-p</b> is
84    set.
85    </P>
86    <P>
87    <b>-q</b>
88    Do not output the version number of <b>pcretest</b> at the start of execution.
89  </P>  </P>
90  <P>  <P>
91  <b>-t</b>  <b>-t</b>
# Line 152  not correspond to anything in Perl: Line 166  not correspond to anything in Perl:
166    <b>/A</b>    PCRE_ANCHORED    <b>/A</b>    PCRE_ANCHORED
167    <b>/C</b>    PCRE_AUTO_CALLOUT    <b>/C</b>    PCRE_AUTO_CALLOUT
168    <b>/E</b>    PCRE_DOLLAR_ENDONLY    <b>/E</b>    PCRE_DOLLAR_ENDONLY
169      <b>/f</b>    PCRE_FIRSTLINE
170    <b>/N</b>    PCRE_NO_AUTO_CAPTURE    <b>/N</b>    PCRE_NO_AUTO_CAPTURE
171    <b>/U</b>    PCRE_UNGREEDY    <b>/U</b>    PCRE_UNGREEDY
172    <b>/X</b>    PCRE_EXTRA    <b>/X</b>    PCRE_EXTRA
# Line 274  recognized: Line 289  recognized:
289    \C!n       return 1 instead of 0 when callout number n is reached    \C!n       return 1 instead of 0 when callout number n is reached
290    \C!n!m     return 1 instead of 0 when callout number n is reached for the mth time    \C!n!m     return 1 instead of 0 when callout number n is reached for the mth time
291    \C*n       pass the number n (may be negative) as callout data; this is used as the callout return value    \C*n       pass the number n (may be negative) as callout data; this is used as the callout return value
292      \D         use the <b>pcre_dfa_exec()</b> match function
293      \F         only shortest match for <b>pcre_dfa_exec()</b>
294    \Gdd       call pcre_get_substring() for substring dd after a successful match (number less than 32)    \Gdd       call pcre_get_substring() for substring dd after a successful match (number less than 32)
295    \Gname     call pcre_get_named_substring() for substring "name" after a successful match (name    \Gname     call pcre_get_named_substring() for substring "name" after a successful match (name
296                 terminated by next non-alphanumeric character)                 terminated by next non-alphanumeric character)
297    \L         call pcre_get_substringlist() after a successful match    \L         call pcre_get_substringlist() after a successful match
298    \M         discover the minimum MATCH_LIMIT setting    \M         discover the minimum MATCH_LIMIT and
299                   MATCH_LIMIT_RECURSION settings
300    \N         pass the PCRE_NOTEMPTY option to <b>pcre_exec()</b>    \N         pass the PCRE_NOTEMPTY option to <b>pcre_exec()</b>
301    \Odd       set the size of the output vector passed to <b>pcre_exec()</b> to dd (any number of digits)    \Odd       set the size of the output vector passed to <b>pcre_exec()</b> to dd (any number of digits)
302    \P         pass the PCRE_PARTIAL option to <b>pcre_exec()</b>    \P         pass the PCRE_PARTIAL option to <b>pcre_exec()</b> or <b>pcre_dfa_exec()</b>
303      \R         pass the PCRE_DFA_RESTART option to <b>pcre_dfa_exec()</b>
304    \S         output details of memory get/free calls during matching    \S         output details of memory get/free calls during matching
305    \Z         pass the PCRE_NOTEOL option to <b>pcre_exec()</b>    \Z         pass the PCRE_NOTEOL option to <b>pcre_exec()</b>
306    \?         pass the PCRE_NO_UTF8_CHECK option to <b>pcre_exec()</b>    \?         pass the PCRE_NO_UTF8_CHECK option to <b>pcre_exec()</b>
# Line 294  an empty line as data, since a real empt Line 313  an empty line as data, since a real empt
313  </P>  </P>
314  <P>  <P>
315  If \M is present, <b>pcretest</b> calls <b>pcre_exec()</b> several times, with  If \M is present, <b>pcretest</b> calls <b>pcre_exec()</b> several times, with
316  different values in the <i>match_limit</i> field of the <b>pcre_extra</b> data  different values in the <i>match_limit</i> and <i>match_limit_recursion</i>
317  structure, until it finds the minimum number that is needed for  fields of the <b>pcre_extra</b> data structure, until it finds the minimum
318  <b>pcre_exec()</b> to complete. This number is a measure of the amount of  numbers for each parameter that allow <b>pcre_exec()</b> to complete. The
319  recursion and backtracking that takes place, and checking it out can be  <i>match_limit</i> number is a measure of the amount of backtracking that takes
320  instructive. For most simple matches, the number is quite small, but for  place, and checking it out can be instructive. For most simple matches, the
321  patterns with very large numbers of matching possibilities, it can become large  number is quite small, but for patterns with very large numbers of matching
322  very quickly with increasing length of subject string.  possibilities, it can become large very quickly with increasing length of
323    subject string. The <i>match_limit_recursion</i> number is a measure of how much
324    stack (or, if PCRE is compiled with NO_RECURSE, how much heap) memory is needed
325    to complete the match attempt.
326  </P>  </P>
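The search that \M performs can be pictured as a binary search over candidate limits. The following is a minimal sketch, not pcretest's actual code: the `completes` callback is a hypothetical stand-in for "call pcre_exec() with this value in the match_limit field and check that it did not fail with PCRE_ERROR_MATCHLIMIT", and the `upper` bound is an assumption chosen for illustration.

```python
def find_min_limit(completes, upper=10_000_000):
    """Return the smallest limit in [1, upper] for which completes(limit) is true.

    Assumes completes is monotonic: once a limit is large enough, every
    larger limit also works. Returns None if even `upper` is too small.
    `completes` is a hypothetical callback, not part of any PCRE API.
    """
    if not completes(upper):
        return None
    low, high = 1, upper
    while low < high:
        mid = (low + high) // 2
        if completes(mid):
            high = mid      # mid is enough; the minimum is mid or smaller
        else:
            low = mid + 1   # mid is too small; the minimum is larger
    return low
```

The same routine could be run a second time with a callback that varies match_limit_recursion instead, which is how two separate minimum numbers would be obtained.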
327  <P>  <P>
328  When \O is used, the value specified may be higher or lower than the size set  When \O is used, the value specified may be higher or lower than the size set
# Line 309  the call of <b>pcre_exec()</b> for the l Line 331  the call of <b>pcre_exec()</b> for the l
331  </P>  </P>
332  <P>  <P>
333  If the <b>/P</b> modifier was present on the pattern, causing the POSIX wrapper  If the <b>/P</b> modifier was present on the pattern, causing the POSIX wrapper
334  API to be used, only \B and \Z have any effect, causing REG_NOTBOL and  API to be used, the only option-setting sequences that have any effect are \B
335  REG_NOTEOL to be passed to <b>regexec()</b> respectively.  and \Z, causing REG_NOTBOL and REG_NOTEOL, respectively, to be passed to
336    <b>regexec()</b>.
337  </P>  </P>
338  <P>  <P>
339  The use of \x{hh...} to represent UTF-8 characters is not dependent on the use  The use of \x{hh...} to represent UTF-8 characters is not dependent on the use
# Line 318  of the <b>/8</b> modifier on the pattern Line 341  of the <b>/8</b> modifier on the pattern
341  any number of hexadecimal digits inside the braces. The result is from one to  any number of hexadecimal digits inside the braces. The result is from one to
342  six bytes, encoded according to the UTF-8 rules.  six bytes, encoded according to the UTF-8 rules.
343  </P>  </P>
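The "one to six bytes" rule follows the original UTF-8 scheme, which covers values up to 0x7FFFFFFF (current UTF-8 stops at four bytes). The sketch below shows that encoding; the function name is made up for illustration and this is not the code that pcretest itself uses.

```python
def encode_utf8_extended(code_point):
    """Encode a value up to 0x7FFFFFFF using the original 1-6 byte UTF-8 scheme."""
    if not 0 <= code_point <= 0x7FFFFFFF:
        raise ValueError("value out of range for six-byte UTF-8")
    if code_point < 0x80:
        return bytes([code_point])
    # Each entry: (exclusive upper bound, total bytes, lead-byte marker bits)
    for limit, length, marker in (
        (0x800, 2, 0xC0),
        (0x10000, 3, 0xE0),
        (0x200000, 4, 0xF0),
        (0x4000000, 5, 0xF8),
        (0x80000000, 6, 0xFC),
    ):
        if code_point < limit:
            break
    out = []
    # Emit continuation bytes from the low-order end, six bits at a time.
    for _ in range(length - 1):
        out.append(0x80 | (code_point & 0x3F))
        code_point >>= 6
    # Whatever bits remain go into the lead byte.
    out.append(marker | code_point)
    return bytes(reversed(out))
```

For values that modern UTF-8 can also represent, the result agrees with Python's built-in encoder, for example `encode_utf8_extended(0x20AC) == "\u20ac".encode("utf-8")`.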
344  <br><a name="SEC6" href="#TOC1">OUTPUT FROM PCRETEST</a><br>  <br><a name="SEC6" href="#TOC1">THE ALTERNATIVE MATCHING FUNCTION</a><br>
345    <P>
346    By default, <b>pcretest</b> uses the standard PCRE matching function,
347    <b>pcre_exec()</b>, to match each data line. From release 6.0, PCRE supports an
348    alternative matching function, <b>pcre_dfa_exec()</b>, which operates in a
349    different way, and has some restrictions. The differences between the two
350    functions are described in the
351    <a href="pcrematching.html"><b>pcrematching</b></a>
352    documentation.
353    </P>
354    <P>
355    If a data line contains the \D escape sequence, or if the command line
356    contains the <b>-dfa</b> option, the alternative matching function is called.
357    This function finds all possible matches at a given point. If, however, the \F
358    escape sequence is present in the data line, it stops after the first match is
359    found. This is always the shortest possible match.
360    </P>
361    <br><a name="SEC7" href="#TOC1">DEFAULT OUTPUT FROM PCRETEST</a><br>
362    <P>
363    This section describes the output when the normal matching function,
364    <b>pcre_exec()</b>, is being used.
365    </P>
366  <P>  <P>
367  When a match succeeds, pcretest outputs the list of captured substrings that  When a match succeeds, pcretest outputs the list of captured substrings that
368  <b>pcre_exec()</b> returns, starting with number 0 for the string that matched  <b>pcre_exec()</b> returns, starting with number 0 for the string that matched
369  the whole pattern. Otherwise, it outputs "No match" or "Partial match"  the whole pattern. Otherwise, it outputs "No match" or "Partial match"
370  when <b>pcre_exec()</b> returns PCRE_ERROR_NOMATCH or PCRE_ERROR_PARTIAL,  when <b>pcre_exec()</b> returns PCRE_ERROR_NOMATCH or PCRE_ERROR_PARTIAL,
371  respectively, and otherwise the PCRE negative error number. Here is an example  respectively, and otherwise the PCRE negative error number. Here is an example
372  of an interactive pcretest run.  of an interactive <b>pcretest</b> run.
373  <pre>  <pre>
374    $ pcretest    $ pcretest
375    PCRE version 5.00 07-Sep-2004    PCRE version 5.00 07-Sep-2004
# Line 375  Note that while patterns can be continue Line 419  Note that while patterns can be continue
419  prompt is used for continuations), data lines may not. However, newlines can be  prompt is used for continuations), data lines may not. However, newlines can be
420  included in data by means of the \n escape.  included in data by means of the \n escape.
421  </P>  </P>
422  <br><a name="SEC7" href="#TOC1">CALLOUTS</a><br>  <br><a name="SEC8" href="#TOC1">OUTPUT FROM THE ALTERNATIVE MATCHING FUNCTION</a><br>
423    <P>
424    When the alternative matching function, <b>pcre_dfa_exec()</b>, is used (by
425    means of the \D escape sequence or the <b>-dfa</b> command line option), the
426    output consists of a list of all the matches that start at the first point in
427    the subject where there is at least one match. For example:
428    <pre>
429        re&#62; /(tang|tangerine|tan)/
430      data&#62; yellow tangerine\D
431       0: tangerine
432       1: tang
433       2: tan
434    </pre>
435    (Using the normal matching function on this data finds only "tang".) The
436    longest matching string is always given first (and numbered zero).
437    </P>
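The shape of this output can be reproduced with a brute-force sketch. It uses Python's re module, not PCRE, and it does not implement the DFA algorithm: it simply tests every possible end position at each start position, which is an assumption made to keep the example short.

```python
import re

def all_matches_at_first_point(pattern, subject):
    """List every match at the first position where any match exists, longest first."""
    regex = re.compile(pattern)
    for start in range(len(subject) + 1):
        # Try the longest candidate substring first so the result is ordered by length.
        found = [
            subject[start:end]
            for end in range(len(subject), start - 1, -1)
            if regex.fullmatch(subject, start, end)
        ]
        if found:
            return found
    return []

print(all_matches_at_first_point(r"(tang|tangerine|tan)", "yellow tangerine"))
```

By contrast, `re.search(r"(tang|tangerine|tan)", "yellow tangerine").group(0)` yields only "tang", because a backtracking matcher takes the first alternative that succeeds.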
438    <P>
439    If <b>/g</b> is present on the pattern, the search for further matches resumes
440    at the end of the longest match. For example:
441    <pre>
442        re&#62; /(tang|tangerine|tan)/g
443      data&#62; yellow tangerine and tangy sultana\D
444       0: tangerine
445       1: tang
446       2: tan
447       0: tang
448       1: tan
449       0: tan
450    </pre>
451    Since the alternative matching function does not support substring capture, the escape
452    sequences that are concerned with captured substrings are not relevant.
453    </P>
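The resume-after-the-longest-match rule can be sketched in the same brute-force style. This is an illustration in Python's re module under stated assumptions, not pcretest's implementation: the helper names are invented, and stepping one character past an empty match is a choice made here only to guarantee that the loop terminates.

```python
import re

def matches_from(regex, subject, start):
    """Return (position, matches) for the first position >= start that has a match."""
    for pos in range(start, len(subject) + 1):
        found = [
            subject[pos:end]
            for end in range(len(subject), pos - 1, -1)
            if regex.fullmatch(subject, pos, end)
        ]
        if found:
            return pos, found
    return None

def global_matches(pattern, subject):
    """Collect successive groups of matches, resuming after each longest match."""
    regex = re.compile(pattern)
    groups, start = [], 0
    while start <= len(subject):
        hit = matches_from(regex, subject, start)
        if hit is None:
            break
        pos, found = hit
        groups.append(found)
        longest = len(found[0])
        # Assumption of this sketch: advance by one after an empty match.
        start = pos + longest if longest else pos + 1
    return groups

for group in global_matches(r"(tang|tangerine|tan)",
                            "yellow tangerine and tangy sultana"):
    print(group)
```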
454    <br><a name="SEC9" href="#TOC1">RESTARTING AFTER A PARTIAL MATCH</a><br>
455    <P>
456    When the alternative matching function has given the PCRE_ERROR_PARTIAL return,
457    indicating that the subject partially matched the pattern, you can restart the
458    match with additional subject data by means of the \R escape sequence. For
459    example:
460    <pre>
461        re&#62; /^\d?\d(jan|feb|mar|apr|may|jun|jul|aug|sep|oct|nov|dec)\d\d$/
462      data&#62; 23ja\P\D
463      Partial match: 23ja
464      data&#62; n05\R\D
465       0: n05
466    </pre>
467    For further information about partial matching, see the
468    <a href="pcrepartial.html"><b>pcrepartial</b></a>
469    documentation.
470    </P>
471    <br><a name="SEC10" href="#TOC1">CALLOUTS</a><br>
472  <P>  <P>
473  If the pattern contains any callout requests, <b>pcretest</b>'s callout function  If the pattern contains any callout requests, <b>pcretest</b>'s callout function
474  is called during matching. By default, it displays the callout number, the  is called during matching. This works with both matching functions. By default,
475  start and current positions in the text at the callout time, and the next  the called function displays the callout number, the start and current
476  pattern item to be tested. For example, the output  positions in the text at the callout time, and the next pattern item to be
477    tested. For example, the output
478  <pre>  <pre>
479    ---&#62;pqrabcdef    ---&#62;pqrabcdef
480      0    ^  ^     \d      0    ^  ^     \d
# Line 406  example: Line 500  example:
500     0: E*     0: E*
501  </pre>  </pre>
502  The callout function in <b>pcretest</b> returns zero (carry on matching) by  The callout function in <b>pcretest</b> returns zero (carry on matching) by
503  default, but you can use an \C item in a data line (as described above) to  default, but you can use a \C item in a data line (as described above) to
504  change this.  change this.
505  </P>  </P>
506  <P>  <P>
# Line 416  the Line 510  the
510  <a href="pcrecallout.html"><b>pcrecallout</b></a>  <a href="pcrecallout.html"><b>pcrecallout</b></a>
511  documentation.  documentation.
512  </P>  </P>
513  <br><a name="SEC8" href="#TOC1">SAVING AND RELOADING COMPILED PATTERNS</a><br>  <br><a name="SEC11" href="#TOC1">SAVING AND RELOADING COMPILED PATTERNS</a><br>
514  <P>  <P>
515  The facilities described in this section are not available when the POSIX  The facilities described in this section are not available when the POSIX
516  interface to PCRE is being used, that is, when the <b>/P</b> pattern modifier is  interface to PCRE is being used, that is, when the <b>/P</b> pattern modifier is
# Line 478  string using a reloaded pattern is likel Line 572  string using a reloaded pattern is likel
572  Finally, if you attempt to load a file that is not in the correct format, the  Finally, if you attempt to load a file that is not in the correct format, the
573  result is undefined.  result is undefined.
574  </P>  </P>
575  <br><a name="SEC9" href="#TOC1">AUTHOR</a><br>  <br><a name="SEC12" href="#TOC1">AUTHOR</a><br>
576  <P>  <P>
577  Philip Hazel &#60;ph10@cam.ac.uk&#62;  Philip Hazel
578  <br>  <br>
579  University Computing Service,  University Computing Service,
580  <br>  <br>
581  Cambridge CB2 3QG, England.  Cambridge CB2 3QG, England.
582  </P>  </P>
583  <P>  <P>
584  Last updated: 10 September 2004  Last updated: 18 January 2006
585  <br>  <br>
586  Copyright &copy; 1997-2004 University of Cambridge.  Copyright &copy; 1997-2006 University of Cambridge.
587  <p>  <p>
588  Return to the <a href="index.html">PCRE index page</a>.  Return to the <a href="index.html">PCRE index page</a>.
589  </p>  </p>
