Diff of /code/trunk/doc/pcrepattern.3

revision 1213 by ph10, Wed Nov 7 17:29:40 2012 UTC revision 1219 by ph10, Sun Nov 11 18:04:37 2012 UTC
# Line 1  Line 1 
1  .TH PCREPATTERN 3 "07 November 2012" "PCRE 8.32"  .TH PCREPATTERN 3 "11 November 2012" "PCRE 8.32"
2  .SH NAME  .SH NAME
3  PCRE - Perl-compatible regular expressions  PCRE - Perl-compatible regular expressions
4  .SH "PCRE REGULAR EXPRESSION DETAILS"  .SH "PCRE REGULAR EXPRESSION DETAILS"
# Line 22  description of PCRE's regular expression Line 22  description of PCRE's regular expression
22  .P  .P
23  The original operation of PCRE was on strings of one-byte characters. However,  The original operation of PCRE was on strings of one-byte characters. However,
24  there is now also support for UTF-8 strings in the original library, an  there is now also support for UTF-8 strings in the original library, an
25  extra library that supports 16-bit and UTF-16 character strings, and an  extra library that supports 16-bit and UTF-16 character strings, and a
26  extra library that supports 32-bit and UTF-32 character strings. To use these  third library that supports 32-bit and UTF-32 character strings. To use these
27  features, PCRE must be built to include appropriate support. When using UTF  features, PCRE must be built to include appropriate support. When using UTF
28  strings you must either call the compiling function with the PCRE_UTF8,  strings you must either call the compiling function with the PCRE_UTF8,
29  PCRE_UTF16 or PCRE_UTF32 option, or the pattern must start with one of  PCRE_UTF16, or PCRE_UTF32 option, or the pattern must start with one of
30  these special sequences:  these special sequences:
31  .sp  .sp
32    (*UTF8)    (*UTF8)
33    (*UTF16)    (*UTF16)
34    (*UTF32)    (*UTF32)
35      (*UTF)
36  .sp  .sp
37    (*UTF) is a generic sequence that can be used with any of the libraries.
38  Starting a pattern with such a sequence is equivalent to setting the relevant  Starting a pattern with such a sequence is equivalent to setting the relevant
39  option. This feature is not Perl-compatible. How setting a UTF mode affects  option. This feature is not Perl-compatible. How setting a UTF mode affects
40  pattern matching is mentioned in several places below. There is also a summary  pattern matching is mentioned in several places below. There is also a summary
# Line 43  of features in the Line 45  of features in the
45  page.  page.
46  .P  .P
47  Another special sequence that may appear at the start of a pattern or in  Another special sequence that may appear at the start of a pattern or in
48  combination with (*UTF8) or (*UTF16) or (*UTF32) is:  combination with (*UTF8), (*UTF16), (*UTF32) or (*UTF) is:
49  .sp  .sp
50    (*UCP)    (*UCP)
51  .sp  .sp
# Line 573  change of newline convention; for exampl Line 575  change of newline convention; for exampl
575  .sp  .sp
576    (*ANY)(*BSR_ANYCRLF)    (*ANY)(*BSR_ANYCRLF)
577  .sp  .sp
578  They can also be combined with the (*UTF8), (*UTF16), (*UTF32) or (*UCP) special  They can also be combined with the (*UTF8), (*UTF16), (*UTF32), (*UTF) or
579  sequences. Inside a character class, \eR is treated as an unrecognized escape  (*UCP) special sequences. Inside a character class, \eR is treated as an
580  sequence, and so matches the letter "R" by default, but causes an error if  unrecognized escape sequence, and so matches the letter "R" by default, but
581  PCRE_EXTRA is set.  causes an error if PCRE_EXTRA is set.
582  .  .
583  .  .
584  .\" HTML <a name="uniextseq"></a>  .\" HTML <a name="uniextseq"></a>
# Line 1349  the section entitled Line 1351  the section entitled
1351  .\" </a>  .\" </a>
1352  "Newline sequences"  "Newline sequences"
1353  .\"  .\"
1354  above. There are also the (*UTF8), (*UTF16),(*UTF32) and (*UCP) leading  above. There are also the (*UTF8), (*UTF16),(*UTF32), and (*UCP) leading
1355  sequences that can be used to set UTF and Unicode property modes; they are  sequences that can be used to set UTF and Unicode property modes; they are
1356  equivalent to setting the PCRE_UTF8, PCRE_UTF16, PCRE_UTF32 and the PCRE_UCP  equivalent to setting the PCRE_UTF8, PCRE_UTF16, PCRE_UTF32 and the PCRE_UCP
1357  options, respectively.  options, respectively. The (*UTF) sequence is a generic version that can be
1358    used with any of the libraries.
1359  .  .
1360  .  .
1361  .\" HTML <a name="subpattern"></a>  .\" HTML <a name="subpattern"></a>
# Line 2975  Cambridge CB2 3QH, England. Line 2978  Cambridge CB2 3QH, England.
2978  .rs  .rs
2979  .sp  .sp
2980  .nf  .nf
2981  Last updated: 07 November 2012  Last updated: 11 November 2012
2982  Copyright (c) 1997-2012 University of Cambridge.  Copyright (c) 1997-2012 University of Cambridge.
2983  .fi  .fi

