Diff of /code/trunk/doc/pcrecpp.3

revision 79 by nigel, Sat Feb 24 21:40:52 2007 UTC revision 263 by ph10, Mon Nov 12 16:53:25 2007 UTC
# Line 5  PCRE - Perl-compatible regular expressio Line 5  PCRE - Perl-compatible regular expressio
5  .rs  .rs
6  .sp  .sp
7  .B #include <pcrecpp.h>  .B #include <pcrecpp.h>
8  .PP  .
 .SM  
 .br  
9  .SH DESCRIPTION  .SH DESCRIPTION
10  .rs  .rs
11  .sp  .sp
12  The C++ wrapper for PCRE was provided by Google Inc. This brief man page was  The C++ wrapper for PCRE was provided by Google Inc. Some additional
13  constructed from the notes in the \fIpcrecpp.h\fP file, which should be  functionality was added by Giuseppe Maxia. This brief man page was constructed
14  consulted for further details.  from the notes in the \fIpcrecpp.h\fP file, which should be consulted for
15    further details.
16  .  .
17  .  .
18  .SH "MATCHING INTERFACE"  .SH "MATCHING INTERFACE"
# Line 80  The function returns true iff all of the Line 79  The function returns true iff all of the
79  .sp  .sp
80    c. The "i"th argument has a suitable type for holding the    c. The "i"th argument has a suitable type for holding the
81       string captured as the "i"th sub-pattern. If you pass in       string captured as the "i"th sub-pattern. If you pass in
82       NULL for the "i"th argument, or pass fewer arguments than       void * NULL for the "i"th argument, or a non-void * NULL
83         of the correct type, or pass fewer arguments than the
84       number of sub-patterns, "i"th captured sub-pattern is       number of sub-patterns, "i"th captured sub-pattern is
85       ignored.       ignored.
86  .sp  .sp
87    CAVEAT: An optional sub-pattern that does not exist in the matched
88    string is assigned the empty string. Therefore, the following will
89    return false (because the empty string is not a valid number):
90    .sp
91       int number;
92       pcrecpp::RE("[a-z]+(\e\ed+)?").FullMatch("abc", &number);
93    .sp
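The caveat above comes down to how an empty capture is converted. The following is a minimal sketch, not the wrapper's actual conversion code: `parse_int_sketch` is a hypothetical helper that mimics a strict string-to-int conversion and shows why an empty capture cannot be stored in an `int`.

```cpp
#include <cassert>
#include <cerrno>
#include <climits>
#include <cstdlib>
#include <string>

// Hypothetical helper (not part of pcrecpp): strict conversion of a
// captured substring to int. The empty string is rejected, which is the
// situation the caveat describes for an unmatched optional sub-pattern.
bool parse_int_sketch(const std::string& captured, int* out) {
  if (captured.empty()) return false;  // nothing to parse
  errno = 0;
  char* end = nullptr;
  const long value = std::strtol(captured.c_str(), &end, 10);
  if (errno != 0 || *end != '\0') return false;           // overflow or trailing junk
  if (value < INT_MIN || value > INT_MAX) return false;   // does not fit in int
  *out = static_cast<int>(value);
  return true;
}

// Usage check: an empty capture fails and leaves the target untouched.
void parse_int_sketch_demo() {
  int number = 7;
  assert(!parse_int_sketch("", &number));
  assert(number == 7);
  assert(parse_int_sketch("42", &number));
  assert(number == 42);
}
```

One way to sidestep the caveat, under the same assumptions, is to capture into a string first and convert it only when it is non-empty.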
94  The matching interface supports at most 16 arguments per call.  The matching interface supports at most 16 arguments per call.
95  If you need more, consider using the more general interface  If you need more, consider using the more general interface
96  \fBpcrecpp::RE::DoMatch\fP. See \fBpcrecpp.h\fP for the signature for  \fBpcrecpp::RE::DoMatch\fP. See \fBpcrecpp.h\fP for the signature for
97  \fBDoMatch\fP.  \fBDoMatch\fP.
98  .  .
99    .SH "QUOTING METACHARACTERS"
100    .rs
101    .sp
102    You can use the "QuoteMeta" operation to insert backslashes before all
103    potentially meaningful characters in a string. The returned string, used as a
104    regular expression, will exactly match the original string.
105    .sp
106      Example:
107         string quoted = RE::QuoteMeta(unquoted);
108    .sp
109    Note that it's legal to escape a character even if it has no special meaning in
110    a regular expression -- so this function does that. (This also makes it
111    identical to the Perl function of the same name; see "perldoc -f quotemeta".)
112    For example, "1.5-2.0?" becomes "1\e.5\e-2\e.0\e?".
113    .
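The quoting rule described above can be illustrated without the library. This is a minimal sketch, not the code behind `RE::QuoteMeta`: `quote_meta_sketch` is a hypothetical stand-in that backslash-escapes every ASCII byte other than letters, digits, and underscore. Passing bytes with the high bit set through unchanged is an assumption made here to keep multi-byte text intact.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for RE::QuoteMeta (not the library's implementation).
// Escapes every ASCII byte that is not in [A-Za-z0-9_].
std::string quote_meta_sketch(const std::string& unquoted) {
  std::string quoted;
  quoted.reserve(unquoted.size() * 2);  // worst case: every byte escaped
  for (unsigned char c : unquoted) {
    const bool is_word = (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z') ||
                         (c >= '0' && c <= '9') || c == '_';
    const bool is_high_bit = c >= 0x80;  // assumption: leave non-ASCII bytes alone
    if (!is_word && !is_high_bit) quoted += '\\';
    quoted += static_cast<char>(c);
  }
  return quoted;
}

// Usage check mirroring the man page's own example string.
void quote_meta_sketch_demo() {
  assert(quote_meta_sketch("1.5-2.0?") == "1\\.5\\-2\\.0\\?");
}
```

Escaping characters that have no special meaning is harmless in this scheme, which is why the rule can be stated so simply.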
114  .SH "PARTIAL MATCHES"  .SH "PARTIAL MATCHES"
115  .rs  .rs
116  .sp  .sp
# Line 130  NOTE: The UTF8 flag is ignored if pcre w Line 152  NOTE: The UTF8 flag is ignored if pcre w
152        --enable-utf8 flag.        --enable-utf8 flag.
153  .  .
154  .  .
155    .SH "PASSING MODIFIERS TO THE REGULAR EXPRESSION ENGINE"
156    .rs
157    .sp
158    PCRE defines some modifiers to change the behavior of the regular expression
159    engine. The C++ wrapper defines an auxiliary class, RE_Options, as a vehicle to
160    pass such modifiers to a RE class. Currently, the following modifiers are
161    supported:
162    .sp
163       modifier              description               Perl equivalent
164    .sp
165       PCRE_CASELESS         case insensitive match      /i
166       PCRE_MULTILINE        multiple lines match        /m
167       PCRE_DOTALL           dot matches newlines        /s
168       PCRE_DOLLAR_ENDONLY   $ matches only at end       N/A
169       PCRE_EXTRA            strict escape parsing       N/A
170       PCRE_EXTENDED         ignore whitespace           /x
171       PCRE_UTF8             handles UTF8 chars          built-in
172       PCRE_UNGREEDY         reverses * and *?           N/A
173       PCRE_NO_AUTO_CAPTURE  disables capturing parens   N/A (*)
174    .sp
175    (*) Both Perl and PCRE allow non-capturing parentheses by means of the
176    "?:" modifier within the pattern itself. For example, (?:ab|cd) does not
177    capture, while (ab|cd) does.
178    .P
179    For a full account of how each modifier works, please check the
180    PCRE API reference page.
181    .P
182    For each modifier, there are two member functions whose names are formed
183    from the modifier name in lowercase, without the "PCRE_" prefix. For
184    instance, PCRE_CASELESS is handled by
185    .sp
186      bool caseless()
187    .sp
188    which returns true if the modifier is set, and
189    .sp
190      RE_Options & set_caseless(bool)
191    .sp
192    which sets or unsets the modifier. Moreover, PCRE_EXTRA_MATCH_LIMIT can be
193    accessed through the \fBset_match_limit()\fR and \fBmatch_limit()\fR member
194    functions. Setting \fImatch_limit\fR to a non-zero value limits the
195    execution of pcre, which keeps it from overflowing the stack or taking
196    excessively long to return a result. A value of 5000 is good enough to stop
197    stack overflow in a 2MB thread stack. Setting \fImatch_limit\fR to zero disables
198    match limiting. Alternatively, you can call \fBmatch_limit_recursion()\fP
199    which uses PCRE_EXTRA_MATCH_LIMIT_RECURSION to limit how much PCRE
200    recurses. \fBmatch_limit()\fP limits the number of times PCRE calls its
201    internal matching function; \fBmatch_limit_recursion()\fP limits the depth of
202    internal recursion, and therefore the amount of stack that is used.
203    .P
204    Normally, to pass one or more modifiers to a RE class, you declare
205    a \fIRE_Options\fR object, set the appropriate options, and pass this
206    object to a RE constructor. Example:
207    .sp
208       RE_Options opt;
209       opt.set_caseless(true);
210       if (RE("HELLO", opt).PartialMatch("hello world")) ...
211    .sp
212    RE_Options has two constructors. The default constructor takes no arguments and
213    creates a set of flags that are all off. The other takes an
214    \fIoption_flags\fR parameter to ease the porting of legacy code from C programs.
215    This lets you do
216    .sp
217       RE(pattern,
218         RE_Options(PCRE_CASELESS|PCRE_MULTILINE)).PartialMatch(str);
219    .sp
220    However, new code is better off doing
221    .sp
222       RE(pattern,
223         RE_Options().set_caseless(true).set_multiline(true))
224           .PartialMatch(str);
225    .sp
226    If you are going to pass one of the most used modifiers, there are some
227    convenience functions that return an RE_Options object with the
228    appropriate modifier already set: \fBCASELESS()\fR, \fBUTF8()\fR,
229    \fBMULTILINE()\fR, \fBDOTALL()\fR, and \fBEXTENDED()\fR.
230    .P
231    If you need to set several options at once, and you don't want to go to
232    the trouble of declaring an RE_Options object and setting each option, there
233    is an alternative that lets you do so on the fly. You can chain
234    several \fBset_xxxxx()\fR member functions, since each of them returns a
235    reference to its class object. For example, to pass PCRE_CASELESS,
236    PCRE_EXTENDED, and PCRE_MULTILINE to an RE with one statement, you may write:
237    .sp
238       RE(" ^ xyz \e\es+ .* blah$",
239         RE_Options()
240           .set_caseless(true)
241           .set_extended(true)
242           .set_multiline(true)).PartialMatch(sometext);
243    .sp
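The chaining described above works because each setter returns a reference to the object it was called on. The following is a minimal sketch of that pattern, assuming nothing about the real class beyond what this page states: `OptionsSketch`, its bit values, and `all_flags()` are hypothetical and exist only for illustration.

```cpp
#include <cassert>

// Hypothetical miniature of an options class. The setter/getter names follow
// the convention described in the man page; the bit values are illustrative
// and are NOT the real PCRE_* constants.
class OptionsSketch {
 public:
  enum Flag { kCaseless = 1, kMultiline = 2, kExtended = 4 };

  OptionsSketch() : flags_(0) {}                        // all flags off
  explicit OptionsSketch(unsigned option_flags) : flags_(option_flags) {}

  bool caseless() const { return (flags_ & kCaseless) != 0; }
  bool multiline() const { return (flags_ & kMultiline) != 0; }
  bool extended() const { return (flags_ & kExtended) != 0; }

  // Each setter returns *this, which is what makes chaining possible.
  OptionsSketch& set_caseless(bool on) { return set(kCaseless, on); }
  OptionsSketch& set_multiline(bool on) { return set(kMultiline, on); }
  OptionsSketch& set_extended(bool on) { return set(kExtended, on); }

  unsigned all_flags() const { return flags_; }  // hypothetical accessor

 private:
  OptionsSketch& set(unsigned bit, bool on) {
    if (on) flags_ |= bit; else flags_ &= ~bit;
    return *this;
  }
  unsigned flags_;
};

// Usage check: chained setters and the flag-mask constructor agree.
void options_sketch_demo() {
  const unsigned chained =
      OptionsSketch().set_caseless(true).set_multiline(true).all_flags();
  const unsigned masked =
      OptionsSketch(OptionsSketch::kCaseless | OptionsSketch::kMultiline).all_flags();
  assert(chained == masked);
}
```

The chained form names each option explicitly, which is one reason to prefer it over a bit mask in new code.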
244    .
245    .
246  .SH "SCANNING TEXT INCREMENTALLY"  .SH "SCANNING TEXT INCREMENTALLY"
247  .rs  .rs
248  .sp  .sp
# Line 215  string is left unaffected. Line 328  string is left unaffected.
328  .SH AUTHOR  .SH AUTHOR
329  .rs  .rs
330  .sp  .sp
331    .nf
332  The C++ wrapper was contributed by Google Inc.  The C++ wrapper was contributed by Google Inc.
333  .br  Copyright (c) 2007 Google Inc.
334  Copyright (c) 2005 Google Inc.  .fi
335    .
336    .
337    .SH REVISION
338    .rs
339    .sp
340    .nf
341    Last updated: 12 November 2007
342    .fi
