Diff of /code/trunk/doc/pcreapi.3

revision 225 by ph10, Mon Aug 20 14:38:34 2007 UTC
revision 226 by ph10, Tue Aug 21 11:46:08 2007 UTC
# Line 240  pair of characters that indicate a line Line 240  pair of characters that indicate a line
240  convention affects the handling of the dot, circumflex, and dollar  convention affects the handling of the dot, circumflex, and dollar
241  metacharacters, the handling of #-comments in /x mode, and, when CRLF is a  metacharacters, the handling of #-comments in /x mode, and, when CRLF is a
242  recognized line ending sequence, the match position advancement for a  recognized line ending sequence, the match position advancement for a
243  non-anchored pattern. The choice of newline convention does not affect the  non-anchored pattern. There is more detail about this in the
244  interpretation of the \en or \er escape sequences.  .\" HTML <a href="#execoptions">
245    .\" </a>
246    section on \fBpcre_exec()\fP options
247    .\"
248    below. The choice of newline convention does not affect the interpretation of
249    the \en or \er escape sequences.
250  .  .
251  .  .
252  .SH MULTITHREADING  .SH MULTITHREADING
# Line 882  table indicating a fixed set of bytes fo Line 887  table indicating a fixed set of bytes fo
887  string, a pointer to the table is returned. Otherwise NULL is returned. The  string, a pointer to the table is returned. Otherwise NULL is returned. The
888  fourth argument should point to an \fBunsigned char *\fP variable.  fourth argument should point to an \fBunsigned char *\fP variable.
889  .sp  .sp
890      PCRE_INFO_HASCRORLF
891    .sp
892    Return 1 if the pattern contains any explicit matches for CR or LF characters,
893    otherwise 0. The fourth argument should point to an \fBint\fP variable.
894    .sp
895    PCRE_INFO_JCHANGED    PCRE_INFO_JCHANGED
896  .sp  .sp
897  Return 1 if the (?J) option setting is used in the pattern, otherwise 0. The  Return 1 if the (?J) option setting is used in the pattern, otherwise 0. The
# Line 1169  called. See the Line 1179  called. See the
1179  .\"  .\"
1180  documentation for a discussion of saving compiled patterns for later use.  documentation for a discussion of saving compiled patterns for later use.
1181  .  .
1182    .\" HTML <a name="execoptions"></a>
1183  .SS "Option bits for \fBpcre_exec()\fP"  .SS "Option bits for \fBpcre_exec()\fP"
1184  .rs  .rs
1185  .sp  .sp
# Line 1194  the pattern was compiled. For details, s Line 1205  the pattern was compiled. For details, s
1205  \fBpcre_compile()\fP above. During matching, the newline choice affects the  \fBpcre_compile()\fP above. During matching, the newline choice affects the
1206  behaviour of the dot, circumflex, and dollar metacharacters. It may also alter  behaviour of the dot, circumflex, and dollar metacharacters. It may also alter
1207  the way the match position is advanced after a match failure for an unanchored  the way the match position is advanced after a match failure for an unanchored
1208  pattern. When PCRE_NEWLINE_CRLF, PCRE_NEWLINE_ANYCRLF, or PCRE_NEWLINE_ANY is  pattern.
1209  set, and a match attempt fails when the current position is at a CRLF sequence,  .P
1210  the match position is advanced by two characters instead of one, in other  When PCRE_NEWLINE_CRLF, PCRE_NEWLINE_ANYCRLF, or PCRE_NEWLINE_ANY is set, and a
1211  words, to after the CRLF.  match attempt for an unanchored pattern fails when the current position is at a
1212  .P  CRLF sequence, and the pattern contains no explicit matches for CR or NL
1213  Anomalous effects can occur when CRLF is a valid newline sequence and explicit  characters, the match position is advanced by two characters instead of one, in
1214  \er or \en escapes appear in the pattern. For example, the string "\er\enA"  other words, to after the CRLF.
1215  matches the unanchored pattern \enA but not [X\en]A. This happens because, in  .P
1216  the first case, PCRE knows that the match must start with \en, and so it skips  The above rule is a compromise that makes the most common cases work as
1217  there before trying to match. In the second case, it has no knowledge about the  expected. For example, if the pattern is .+A (and the PCRE_DOTALL option is not
1218  starting character, so it starts matching at the beginning of the string, and  set), it does not match the string "\er\enA" because, after failing at the
1219  on failing, skips over the CRLF as described above. However, if the pattern is  start, it skips both the CR and the LF before retrying. However, the pattern
1220  studied, the match succeeds, because then PCRE once again knows where to start.  [\er\en]A does match that string, because it contains an explicit CR or LF
1221    reference, and so advances only by one character after the first failure.
1222    Note that an explicit CR or LF reference occurs for negated character classes
1223    such as [^X] because they can match CR or LF characters.
1224    .P
1225    Notwithstanding the above, anomalous effects may still occur when CRLF is a
1226    valid newline sequence and explicit \er or \en escapes appear in the pattern.
1227  .sp  .sp
1228    PCRE_NOTBOL    PCRE_NOTBOL
1229  .sp  .sp
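
The advancement rule added in revision 226 can be sketched as a small standalone function. This is only an illustration of the rule as worded above, not PCRE's actual implementation. The name next_start_offset and its parameters are hypothetical: crlf_is_newline stands for "PCRE_NEWLINE_CRLF, PCRE_NEWLINE_ANYCRLF, or PCRE_NEWLINE_ANY is set", and pattern_has_cr_or_lf stands for the value that PCRE_INFO_HASCRORLF is documented to report. Offsets are assumed to be byte offsets (no UTF-8 handling).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical helper (not part of the PCRE API): given that a match
 * attempt for an unanchored pattern has just failed at offset `pos`,
 * return the offset at which the next attempt would start.
 *
 * crlf_is_newline      - CRLF is a recognized newline sequence
 * pattern_has_cr_or_lf - the pattern contains explicit matches for CR or LF
 */
size_t next_start_offset(const char *subject, size_t length, size_t pos,
                         bool crlf_is_newline, bool pattern_has_cr_or_lf)
{
    /* The caller must pass a position inside the subject. */
    assert(pos < length);

    /* Skip the whole CRLF pair only when all conditions of the rule hold. */
    if (crlf_is_newline && !pattern_has_cr_or_lf &&
        pos + 1 < length &&
        subject[pos] == '\r' && subject[pos + 1] == '\n')
        return pos + 2;

    /* Otherwise advance by a single character, as usual. */
    return pos + 1;
}
```

With the subject "\r\nA" and a failure at offset 0, this sketch returns 2 for a pattern such as .+A (no explicit CR or LF) and 1 for a pattern such as [\r\n]A, which corresponds to the two cases described in the new text.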
# Line 1895  Cambridge CB2 3QH, England. Line 1912  Cambridge CB2 3QH, England.
1912  .rs  .rs
1913  .sp  .sp
1914  .nf  .nf
1915  Last updated: 20 August 2007  Last updated: 21 August 2007
1916  Copyright (c) 1997-2007 University of Cambridge.  Copyright (c) 1997-2007 University of Cambridge.
1917  .fi  .fi
