Diff of /code/trunk/doc/pcretest.txt

revision 96 by nigel, Fri Mar 2 13:10:43 2007 UTC | revision 286 by ph10, Mon Dec 17 14:46:11 2007 UTC
# Line 146  PATTERN MODIFIERS Line 146  PATTERN MODIFIERS
146         The following table shows additional modifiers for setting PCRE options         The following table shows additional modifiers for setting PCRE options
147         that do not correspond to anything in Perl:         that do not correspond to anything in Perl:
148    
149           /A       PCRE_ANCHORED           /A              PCRE_ANCHORED
150           /C       PCRE_AUTO_CALLOUT           /C              PCRE_AUTO_CALLOUT
151           /E       PCRE_DOLLAR_ENDONLY           /E              PCRE_DOLLAR_ENDONLY
152           /f       PCRE_FIRSTLINE           /f              PCRE_FIRSTLINE
153           /J       PCRE_DUPNAMES           /J              PCRE_DUPNAMES
154           /N       PCRE_NO_AUTO_CAPTURE           /N              PCRE_NO_AUTO_CAPTURE
155           /U       PCRE_UNGREEDY           /U              PCRE_UNGREEDY
156           /X       PCRE_EXTRA           /X              PCRE_EXTRA
157           /<cr>    PCRE_NEWLINE_CR           /<cr>           PCRE_NEWLINE_CR
158           /<lf>    PCRE_NEWLINE_LF           /<lf>           PCRE_NEWLINE_LF
159           /<crlf>  PCRE_NEWLINE_CRLF           /<crlf>         PCRE_NEWLINE_CRLF
160           /<any>   PCRE_NEWLINE_ANY           /<anycrlf>      PCRE_NEWLINE_ANYCRLF
161             /<any>          PCRE_NEWLINE_ANY
162         Those  specifying  line ending sequencess are literal strings as shown.           /<bsr_anycrlf>  PCRE_BSR_ANYCRLF
163         This example sets multiline matching  with  CRLF  as  the  line  ending           /<bsr_unicode>  PCRE_BSR_UNICODE
164         sequence:  
165           Those  specifying  line  ending sequences are literal strings as shown,
166           but the letters can be in either  case.  This  example  sets  multiline
167           matching with CRLF as the line ending sequence:
168    
169           /^abc/m<crlf>           /^abc/m<crlf>
170    
# Line 197  PATTERN MODIFIERS Line 200  PATTERN MODIFIERS
200         subject contains multiple copies of the same substring.         subject contains multiple copies of the same substring.
201    
202         The  /B modifier is a debugging feature. It requests that pcretest out-         The  /B modifier is a debugging feature. It requests that pcretest out-
203         put a representation of the compiled byte code after compilation.         put a representation of the compiled byte code after compilation.  Nor-
204           mally  this  information contains length and offset values; however, if
205           /Z is also present, this data is replaced by spaces. This is a  special
206           feature for use in the automatic test scripts; it ensures that the same
207           output is generated for different internal link sizes.
208    
209         The /L modifier must be followed directly by the name of a locale,  for         The /L modifier must be followed directly by the name of a locale,  for
210         example,         example,
# Line 326  DATA LINES Line 333  DATA LINES
333                        or pcre_dfa_exec()                        or pcre_dfa_exec()
334           \<crlf>    pass the PCRE_NEWLINE_CRLF option to pcre_exec()           \<crlf>    pass the PCRE_NEWLINE_CRLF option to pcre_exec()
335                        or pcre_dfa_exec()                        or pcre_dfa_exec()
336             \<anycrlf> pass the PCRE_NEWLINE_ANYCRLF option to pcre_exec()
337                          or pcre_dfa_exec()
338           \<any>     pass the PCRE_NEWLINE_ANY option to pcre_exec()           \<any>     pass the PCRE_NEWLINE_ANY option to pcre_exec()
339                        or pcre_dfa_exec()                        or pcre_dfa_exec()
340    
# Line 362  DATA LINES Line 371  DATA LINES
371         The use of \x{hh...} to represent UTF-8 characters is not dependent  on         The use of \x{hh...} to represent UTF-8 characters is not dependent  on
372         the  use  of  the  /8 modifier on the pattern. It is recognized always.         the  use  of  the  /8 modifier on the pattern. It is recognized always.
373         There may be any number of hexadecimal digits inside  the  braces.  The         There may be any number of hexadecimal digits inside  the  braces.  The
374         result  is from one to six bytes, encoded according to the UTF-8 rules.         result  is  from  one  to  six bytes, encoded according to the original
375           UTF-8 rules of RFC 2279. This allows for  values  in  the  range  0  to
376           0x7FFFFFFF.  Note  that not all of those are valid Unicode code points,
377           or indeed valid UTF-8 characters according to the later  rules  in  RFC
378           3629.
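
Editorial illustration (not part of either revision): the added text says
that \x{hh...} produces one to six bytes under the original UTF-8 rules of
RFC 2279, for values 0 to 0x7FFFFFFF. The Python sketch below shows what such
an encoding looks like. The function name encode_utf8_rfc2279 is hypothetical,
and the code is not taken from pcretest; it only follows the byte layout that
RFC 2279 defines.

```python
def encode_utf8_rfc2279(value: int) -> bytes:
    """Encode an integer in 0..0x7FFFFFFF with the original RFC 2279 UTF-8 layout.

    Hypothetical helper for illustration; pcretest's own code is not shown here.
    """
    if not 0 <= value <= 0x7FFFFFFF:
        raise ValueError("value out of range for RFC 2279 UTF-8")
    if value < 0x80:
        return bytes([value])  # one byte, plain ASCII

    # (exclusive upper limit, total byte count, lead-byte prefix)
    layouts = (
        (0x800, 2, 0xC0),
        (0x10000, 3, 0xE0),
        (0x200000, 4, 0xF0),
        (0x4000000, 5, 0xF8),
        (0x80000000, 6, 0xFC),
    )
    for limit, length, prefix in layouts:
        if value < limit:
            break

    out = []
    # Emit continuation bytes (10xxxxxx) from the low-order end.
    for _ in range(length - 1):
        out.append(0x80 | (value & 0x3F))
        value >>= 6
    # The remaining high-order bits go into the lead byte.
    out.append(prefix | value)
    return bytes(reversed(out))


if __name__ == "__main__":
    for v in (0x41, 0x100, 0x20AC, 0x110000, 0x7FFFFFFF):
        print(hex(v), encode_utf8_rfc2279(v).hex())
```

Values above 0x10FFFF, such as 0x110000 in the loop above, still encode under
this layout even though RFC 3629 no longer treats them as valid UTF-8, which
is the distinction the added paragraph draws.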
379    
380    
381  THE ALTERNATIVE MATCHING FUNCTION  THE ALTERNATIVE MATCHING FUNCTION
382    
383         By  default,  pcretest  uses  the  standard  PCRE  matching   function,         By   default,  pcretest  uses  the  standard  PCRE  matching  function,
384         pcre_exec() to match each data line. From release 6.0, PCRE supports an         pcre_exec() to match each data line. From release 6.0, PCRE supports an
385         alternative matching function, pcre_dfa_test(),  which  operates  in  a         alternative  matching  function,  pcre_dfa_test(),  which operates in a
386         different  way,  and has some restrictions. The differences between the         different way, and has some restrictions. The differences  between  the
387         two functions are described in the pcrematching documentation.         two functions are described in the pcrematching documentation.
388    
389         If a data line contains the \D escape sequence, or if the command  line         If  a data line contains the \D escape sequence, or if the command line
390         contains  the -dfa option, the alternative matching function is called.         contains the -dfa option, the alternative matching function is  called.
391         This function finds all possible matches at a given point. If, however,         This function finds all possible matches at a given point. If, however,
392         the  \F escape sequence is present in the data line, it stops after the         the \F escape sequence is present in the data line, it stops after  the
393         first match is found. This is always the shortest possible match.         first match is found. This is always the shortest possible match.
394    
395    
396  DEFAULT OUTPUT FROM PCRETEST  DEFAULT OUTPUT FROM PCRETEST
397    
398         This section describes the output when the  normal  matching  function,         This  section  describes  the output when the normal matching function,
399         pcre_exec(), is being used.         pcre_exec(), is being used.
400    
401         When a match succeeds, pcretest outputs the list of captured substrings         When a match succeeds, pcretest outputs the list of captured substrings
402         that pcre_exec() returns, starting with number 0 for  the  string  that         that  pcre_exec()  returns,  starting with number 0 for the string that
403         matched the whole pattern. Otherwise, it outputs "No match" or "Partial         matched the whole pattern. Otherwise, it outputs "No match" or "Partial
404         match" when pcre_exec() returns PCRE_ERROR_NOMATCH  or  PCRE_ERROR_PAR-         match"  when  pcre_exec() returns PCRE_ERROR_NOMATCH or PCRE_ERROR_PAR-
405         TIAL,  respectively, and otherwise the PCRE negative error number. Here         TIAL, respectively, and otherwise the PCRE negative error number.  Here
406         is an example of an interactive pcretest run.         is an example of an interactive pcretest run.
407    
408           $ pcretest           $ pcretest
# Line 402  DEFAULT OUTPUT FROM PCRETEST Line 415  DEFAULT OUTPUT FROM PCRETEST
415           data> xyz           data> xyz
416           No match           No match
417    
418           Note  that unset capturing substrings that are not followed by one that
419           is set are not returned by pcre_exec(), and are not shown by  pcretest.
420           In  the following example, there are two capturing substrings, but when
421           the first data line is matched, the  second,  unset  substring  is  not
422           shown.  An "internal" unset substring is shown as "<unset>", as for the
423           second data line.
424    
425               re> /(a)|(b)/
426             data> a
427              0: a
428              1: a
429             data> b
430              0: b
431              1: <unset>
432              2: b
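
Editorial illustration (not part of either revision): the display rule in the
added paragraph can be mimicked with a short Python sketch. It uses Python's
re module rather than PCRE, and the helper name pcretest_style_lines is
hypothetical. The assumption is only that trailing unset groups are dropped
and internal unset groups are printed as "<unset>", as the text describes.

```python
import re


def pcretest_style_lines(match):
    """Hypothetical helper: format captures the way the text above describes."""
    # Group 0 is the whole match; unset groups come back as None in Python.
    groups = [match.group(0), *match.groups()]
    # Trailing unset groups are not reported, so drop them.
    while groups and groups[-1] is None:
        groups.pop()
    # Internal unset groups are shown as "<unset>".
    return [
        f"{i:2d}: {'<unset>' if g is None else g}"
        for i, g in enumerate(groups)
    ]


if __name__ == "__main__":
    for subject in ("a", "b"):
        print("data>", subject)
        for line in pcretest_style_lines(re.match(r"(a)|(b)", subject)):
            print(line)
```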
433    
434         If the strings contain any non-printing characters, they are output  as         If the strings contain any non-printing characters, they are output  as
435         \0x  escapes,  or  as \x{...} escapes if the /8 modifier was present on         \0x  escapes,  or  as \x{...} escapes if the /8 modifier was present on
436         the pattern. See below for the definition of  non-printing  characters.         the pattern. See below for the definition of  non-printing  characters.
# Line 481  RESTARTING AFTER A PARTIAL MATCH Line 510  RESTARTING AFTER A PARTIAL MATCH
510         can restart the match with additional subject data by means of  the  \R         can restart the match with additional subject data by means of  the  \R
511         escape sequence. For example:         escape sequence. For example:
512    
513             re> /^?(jan|feb|mar|apr|may|jun|jul|aug|sep|oct|nov|dec)$/             re> /^\d?\d(jan|feb|mar|apr|may|jun|jul|aug|sep|oct|nov|dec)\d\d$/
514           data> 23ja\P\D           data> 23ja\P\D
515           Partial match: 23ja           Partial match: 23ja
516           data> n05\R\D           data> n05\R\D
# Line 608  SEE ALSO Line 637  SEE ALSO
637  AUTHOR  AUTHOR
638    
639         Philip Hazel         Philip Hazel
640         University Computing Service,         University Computing Service
641         Cambridge CB2 3QH, England.         Cambridge CB2 3QH, England.
642    
643  Last updated: 30 November 2006  
644  Copyright (c) 1997-2006 University of Cambridge.  REVISION
645    
646           Last updated: 19 November 2007
647           Copyright (c) 1997-2007 University of Cambridge.

Legend:
Removed from v.96  
changed lines
  Added in v.286
