Diff of /code/trunk/doc/pcrecpp.3

revision 79 by nigel, Sat Feb 24 21:40:52 2007 UTC revision 96 by nigel, Fri Mar 2 13:10:43 2007 UTC
# Line 11  PCRE - Perl-compatible regular expressio Line 11  PCRE - Perl-compatible regular expressio
11  .SH DESCRIPTION  .SH DESCRIPTION
12  .rs  .rs
13  .sp  .sp
14  The C++ wrapper for PCRE was provided by Google Inc. This brief man page was  The C++ wrapper for PCRE was provided by Google Inc. Some additional
15  constructed from the notes in the \fIpcrecpp.h\fP file, which should be  functionality was added by Giuseppe Maxia. This brief man page was constructed
16  consulted for further details.  from the notes in the \fIpcrecpp.h\fP file, which should be consulted for
17    further details.
18  .  .
19  .  .
20  .SH "MATCHING INTERFACE"  .SH "MATCHING INTERFACE"
# Line 84  The function returns true iff all of the Line 85  The function returns true iff all of the
85       number of sub-patterns, "i"th captured sub-pattern is       number of sub-patterns, "i"th captured sub-pattern is
86       ignored.       ignored.
87  .sp  .sp
88    CAVEAT: An optional sub-pattern that does not exist in the matched
89    string is assigned the empty string. Therefore, the following will
90    return false (because the empty string is not a valid number):
91    .sp
92       int number;
93       pcrecpp::RE("[a-z]+(\e\ed+)?").FullMatch("abc", &number);
94    .sp
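One way around the caveat is to capture the optional group as a string and convert it only when it is non-empty. The sketch below illustrates the idea with std::regex from the standard library rather than pcrecpp, so that it compiles on its own. The helpers parse_whole_int and match_optional_number are hypothetical names, not part of the wrapper.

```cpp
#include <cassert>
#include <cstddef>
#include <exception>
#include <regex>
#include <string>

// Hypothetical helper: parse the whole string as a decimal int.
// Returns false for the empty string, trailing garbage, or overflow.
bool parse_whole_int(const std::string& text, int* out) {
  assert(out != nullptr);
  if (text.empty()) return false;  // the empty string is not a valid number
  try {
    std::size_t used = 0;
    const int value = std::stoi(text, &used);
    if (used != text.size()) return false;
    *out = value;
    return true;
  } catch (const std::exception&) {
    return false;
  }
}

// Hypothetical helper: full-match letters followed by an optional number.
// *has_number reports whether the optional group captured any digits;
// *number is written only in that case.
bool match_optional_number(const std::string& text, bool* has_number,
                           int* number) {
  assert(has_number != nullptr && number != nullptr);
  static const std::regex pattern("[a-z]+(\\d+)?");
  std::smatch match;
  if (!std::regex_match(text, match, pattern)) return false;
  const std::string digits = match[1].str();  // empty if the group is absent
  *has_number = !digits.empty();
  return !*has_number || parse_whole_int(digits, number);
}
```

With this shape, "abc" is reported as a match without a number, while "abc42" yields the number 42. The same capture-as-string approach should carry over to the wrapper by passing a string pointer instead of an int pointer.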
95  The matching interface supports at most 16 arguments per call.  The matching interface supports at most 16 arguments per call.
96  If you need more, consider using the more general interface  If you need more, consider using the more general interface
97  \fBpcrecpp::RE::DoMatch\fP. See \fBpcrecpp.h\fP for the signature for  \fBpcrecpp::RE::DoMatch\fP. See \fBpcrecpp.h\fP for the signature for
98  \fBDoMatch\fP.  \fBDoMatch\fP.
99  .  .
100    .SH "QUOTING METACHARACTERS"
101    .rs
102    .sp
103    You can use the "QuoteMeta" operation to insert backslashes before all
104    potentially meaningful characters in a string. The returned string, used as a
105    regular expression, will exactly match the original string.
106    .sp
107      Example:
108         string quoted = RE::QuoteMeta(unquoted);
109    .sp
110    Note that it's legal to escape a character even if it has no special meaning in
111    a regular expression -- so this function does that. (This also makes it
112    identical to the perl function of the same name; see "perldoc -f quotemeta".)
113    For example, "1.5-2.0?" becomes "1\e.5\e-2\e.0\e?".
114    .
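The quoting rule described above can be sketched in a few lines. This is an illustrative re-implementation, not the pcrecpp source: it assumes the Perl quotemeta rule that every byte other than an ASCII letter, digit, or underscore gets a backslash, and the name quote_meta_sketch is hypothetical.

```cpp
#include <cassert>
#include <string>

// Illustrative sketch only (assumed rule, hypothetical name): put a backslash
// in front of every byte that is not an ASCII letter, digit, or underscore.
std::string quote_meta_sketch(const std::string& unquoted) {
  std::string quoted;
  quoted.reserve(unquoted.size() * 2);  // worst case: every byte is escaped
  for (unsigned char c : unquoted) {
    const bool is_word = (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z') ||
                         (c >= '0' && c <= '9') || c == '_';
    if (!is_word) quoted += '\\';
    quoted += static_cast<char>(c);
  }
  return quoted;
}
```

Applied to the string 1.5-2.0? this produces a backslash before each dot, the hyphen, and the question mark, while letters, digits, and underscores pass through unchanged.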
115  .SH "PARTIAL MATCHES"  .SH "PARTIAL MATCHES"
116  .rs  .rs
117  .sp  .sp
# Line 130  NOTE: The UTF8 flag is ignored if pcre w Line 153  NOTE: The UTF8 flag is ignored if pcre w
153        --enable-utf8 flag.        --enable-utf8 flag.
154  .  .
155  .  .
156    .SH "PASSING MODIFIERS TO THE REGULAR EXPRESSION ENGINE"
157    .rs
158    .sp
159    PCRE defines some modifiers to change the behavior of the regular expression
160    engine. The C++ wrapper defines an auxiliary class, RE_Options, as a vehicle to
161    pass such modifiers to a RE class. Currently, the following modifiers are
162    supported:
163    .sp
164       modifier              description               Perl equivalent
165    .sp
166       PCRE_CASELESS         case insensitive match      /i
167       PCRE_MULTILINE        multiple lines match        /m
168       PCRE_DOTALL           dot matches newlines        /s
169       PCRE_DOLLAR_ENDONLY   $ matches only at end       N/A
170       PCRE_EXTRA            strict escape parsing       N/A
171       PCRE_EXTENDED         ignore whitespace           /x
172       PCRE_UTF8             handles UTF8 chars          built-in
173       PCRE_UNGREEDY         reverses * and *?           N/A
174       PCRE_NO_AUTO_CAPTURE  disables capturing parens   N/A (*)
175    .sp
176    (*) Both Perl and PCRE allow non-capturing parentheses by means of the
177    "?:" modifier within the pattern itself. For example, (?:ab|cd) does not
178    capture, while (ab|cd) does.
179    .P
180    For a full account of how each modifier works, please check the
181    PCRE API reference page.
182    .P
183    For each modifier, there are two member functions whose names are derived
184    from the modifier name in lowercase, without the "PCRE_" prefix. For
185    instance, PCRE_CASELESS is handled by
186    .sp
187      bool caseless()
188    .sp
189    which returns true if the modifier is set, and
190    .sp
191      RE_Options & set_caseless(bool)
192    .sp
193    which sets or unsets the modifier. Moreover, PCRE_EXTRA_MATCH_LIMIT can be
194    accessed through the \fBset_match_limit()\fR and \fBmatch_limit()\fR member
195    functions. Setting \fImatch_limit\fR to a non-zero value will limit the
196    execution of pcre to keep it from doing bad things like blowing the stack or
197    taking an eternity to return a result. A value of 5000 is good enough to stop
198    stack blowup in a 2MB thread stack. Setting \fImatch_limit\fR to zero disables
199    match limiting. Alternatively, you can call \fBmatch_limit_recursion()\fP
200    which uses PCRE_EXTRA_MATCH_LIMIT_RECURSION to limit how much PCRE
201    recurses. \fBmatch_limit()\fP limits the number of matches PCRE does;
202    \fBmatch_limit_recursion()\fP limits the depth of internal recursion, and
203    therefore the amount of stack that is used.
204    .P
205    Normally, to pass one or more modifiers to a RE class, you declare
206    a \fIRE_Options\fR object, set the appropriate options, and pass this
207    object to a RE constructor. Example:
208    .sp
209       RE_Options opt;
210       opt.set_caseless(true);
211       if (RE("HELLO", opt).PartialMatch("hello world")) ...
212    .sp
213    RE_Options has two constructors. The default constructor takes no arguments and
214    creates a set of flags that are all off. The second takes a single
215    \fIoption_flags\fR parameter, which eases porting legacy code from C programs.
216    This lets you do
217    .sp
218       RE(pattern,
219         RE_Options(PCRE_CASELESS|PCRE_MULTILINE)).PartialMatch(str);
220    .sp
221    However, new code is better off doing
222    .sp
223       RE(pattern,
224         RE_Options().set_caseless(true).set_multiline(true))
225           .PartialMatch(str);
226    .sp
227    If you are going to pass one of the most used modifiers, there are some
228    convenience functions that return an RE_Options object with the
229    appropriate modifier already set: \fBCASELESS()\fR, \fBUTF8()\fR,
230    \fBMULTILINE()\fR, \fBDOTALL()\fR, and \fBEXTENDED()\fR.
231    .P
232    If you need to set several options at once, and you don't want to go through
233    the trouble of declaring an RE_Options object and setting each option, there
234    is a parallel method that lets you do so on the fly. You can chain
235    several \fBset_xxxxx()\fR member functions, since each of them returns a
236    reference to its class object. For example, to pass PCRE_CASELESS,
237    PCRE_EXTENDED, and PCRE_MULTILINE to a RE with one statement, you may write:
238    .sp
239       RE(" ^ xyz \e\es+ .* blah$",
240         RE_Options()
241           .set_caseless(true)
242           .set_extended(true)
243           .set_multiline(true)).PartialMatch(sometext);
244    .sp
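Chaining works because each setter returns a reference to the object it was called on. The following minimal sketch shows that pattern in isolation. OptionsSketch is a hypothetical stand-in for RE_Options, and its flag values are illustrative placeholders rather than the constants from pcre.h.

```cpp
#include <cassert>

// Hypothetical stand-in for RE_Options; flag values are illustrative only.
class OptionsSketch {
 public:
  enum Flag { kCaseless = 1 << 0, kMultiline = 1 << 1, kExtended = 1 << 3 };

  OptionsSketch() : flags_(0) {}  // all flags off by default
  explicit OptionsSketch(int option_flags) : flags_(option_flags) {}

  // Getters report whether a modifier is set.
  bool caseless() const { return (flags_ & kCaseless) != 0; }
  bool multiline() const { return (flags_ & kMultiline) != 0; }
  bool extended() const { return (flags_ & kExtended) != 0; }

  // Setters return *this, which is what makes chaining possible.
  OptionsSketch& set_caseless(bool on) { return set(kCaseless, on); }
  OptionsSketch& set_multiline(bool on) { return set(kMultiline, on); }
  OptionsSketch& set_extended(bool on) { return set(kExtended, on); }

  int all_options() const { return flags_; }

 private:
  OptionsSketch& set(int flag, bool on) {
    if (on) {
      flags_ |= flag;
    } else {
      flags_ &= ~flag;
    }
    return *this;
  }

  int flags_;
};
```

A chained expression such as OptionsSketch().set_caseless(true).set_extended(true) builds a temporary, sets both bits, and can be passed straight to a constructor or copied into a named object. Passing a bit mask to the one-argument constructor gives the same result, which mirrors the two styles shown above.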
245    .
246    .
247  .SH "SCANNING TEXT INCREMENTALLY"  .SH "SCANNING TEXT INCREMENTALLY"
248  .rs  .rs
249  .sp  .sp
# Line 217  string is left unaffected. Line 331  string is left unaffected.
331  .sp  .sp
332  The C++ wrapper was contributed by Google Inc.  The C++ wrapper was contributed by Google Inc.
333  .br  .br
334  Copyright (c) 2005 Google Inc.  Copyright (c) 2006 Google Inc.