Diff of /code/trunk/doc/html/pcre.html

revision 517 by ph10, Wed Mar 10 16:08:01 2010 UTC | revision 518 by ph10, Tue May 18 15:47:01 2010 UTC
# Line 30  support for one or two .NET and Onigurum Line 30  support for one or two .NET and Onigurum
30  for requesting some minor changes that give better JavaScript compatibility.  for requesting some minor changes that give better JavaScript compatibility.
31  </P>  </P>
32  <P>  <P>
33  The current implementation of PCRE corresponds approximately with Perl 5.10,  The current implementation of PCRE corresponds approximately with Perl
34  including support for UTF-8 encoded strings and Unicode general category  5.10/5.11, including support for UTF-8 encoded strings and Unicode general
35  properties. However, UTF-8 and Unicode support has to be explicitly enabled; it  category properties. However, UTF-8 and Unicode support has to be explicitly
36  is not the default. The Unicode tables correspond to Unicode release 5.2.0.  enabled; it is not the default. The Unicode tables correspond to Unicode
37    release 5.2.0.
38  </P>  </P>
39  <P>  <P>
40  In addition to the Perl-compatible matching function, PCRE contains an  In addition to the Perl-compatible matching function, PCRE contains an
# Line 255  the alternative matching function, <b>pc Line 256  the alternative matching function, <b>pc
256  </P>  </P>
257  <P>  <P>
258  6. The character escapes \b, \B, \d, \D, \s, \S, \w, and \W correctly  6. The character escapes \b, \B, \d, \D, \s, \S, \w, and \W correctly
259  test characters of any code value, but the characters that PCRE recognizes as  test characters of any code value, but, by default, the characters that PCRE
260  digits, spaces, or word characters remain the same set as before, all with  recognizes as digits, spaces, or word characters remain the same set as before,
261  values less than 256. This remains true even when PCRE includes Unicode  all with values less than 256. This remains true even when PCRE is built to
262  property support, because to do otherwise would slow down PCRE in many common  include Unicode property support, because to do otherwise would slow down PCRE
263  cases. If you really want to test for a wider sense of, say, "digit", you  in many common cases. Note that this also applies to \b, because it is defined
264  must use Unicode property tests such as \p{Nd}. Note that this also applies to  in terms of \w and \W. If you really want to test for a wider sense of, say,
265  \b, because it is defined in terms of \w and \W.  "digit", you can use explicit Unicode property tests such as \p{Nd}.
266    Alternatively, if you set the PCRE_UCP option, the way that the character
267    escapes work is changed so that Unicode properties are used to determine which
268    characters match. There are more details in the section on
269    <a href="pcrepattern.html#genericchartypes">generic character types</a>
270    in the
271    <a href="pcrepattern.html"><b>pcrepattern</b></a>
272    documentation.
273  </P>  </P>
274  <P>  <P>
275  7. Similarly, characters that match the POSIX named character classes are all  7. Similarly, characters that match the POSIX named character classes are all
276  low-valued characters.  low-valued characters, unless the PCRE_UCP option is set.
277  </P>  </P>
278  <P>  <P>
279  8. However, the Perl 5.10 horizontal and vertical whitespace matching escapes  8. However, the Perl 5.10 horizontal and vertical whitespace matching escapes
280  (\h, \H, \v, and \V) do match all the appropriate Unicode characters.  (\h, \H, \v, and \V) do match all the appropriate Unicode characters,
281    whether or not PCRE_UCP is set.
282  </P>  </P>
283  <P>  <P>
284  9. Case-insensitive matching applies only to characters whose values are less  9. Case-insensitive matching applies only to characters whose values are less
# Line 298  two digits 10, at the domain cam.ac.uk. Line 307  two digits 10, at the domain cam.ac.uk.
307  </P>  </P>
308  <br><a name="SEC6" href="#TOC1">REVISION</a><br>  <br><a name="SEC6" href="#TOC1">REVISION</a><br>
309  <P>  <P>
310  Last updated: 01 March 2010  Last updated: 12 May 2010
311  <br>  <br>
312  Copyright &copy; 1997-2010 University of Cambridge.  Copyright &copy; 1997-2010 University of Cambridge.
313  <br>  <br>
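
Items 6 and 7 in the revised text distinguish character classes limited to low-valued characters from classes driven by Unicode properties. As a rough illustration of that distinction, and not of PCRE itself, the sketch below uses Python's standard `re` module, where the `re.ASCII` flag narrows \d, \w, and \b to ASCII and the default for `str` patterns uses Unicode properties. The defaults are the other way round in PCRE: according to the text above, the narrow behaviour is the default and PCRE_UCP opts in to the Unicode behaviour. The helper names and sample characters are assumptions chosen for the example.

```python
import re

# Sample characters (assumptions for this illustration only).
ARABIC_INDIC_THREE = "\u0663"  # a decimal digit outside ASCII (Unicode category Nd)
E_ACUTE = "\u00e9"             # a letter outside ASCII


def matches(pattern, text, ascii_only):
    """Return True if the whole of `text` matches `pattern`.

    ascii_only=True  -> narrow classes (loosely like PCRE without PCRE_UCP)
    ascii_only=False -> Unicode-property classes (loosely like PCRE_UCP)
    """
    flags = re.ASCII if ascii_only else 0
    return re.fullmatch(pattern, text, flags) is not None


def has_word_boundary(text, ascii_only):
    """Return True if \\b matches anywhere in `text`."""
    flags = re.ASCII if ascii_only else 0
    return re.search(r"\b", text, flags) is not None


if __name__ == "__main__":
    for label, ascii_only in (("narrow", True), ("unicode", False)):
        print(label,
              "digit:", matches(r"\d", ARABIC_INDIC_THREE, ascii_only),
              "word:", matches(r"\w", E_ACUTE, ascii_only),
              "boundary:", has_word_boundary(E_ACUTE, ascii_only))
```

The \b check follows the point made in item 6: because \b is defined in terms of \w and \W, widening or narrowing \w changes where word boundaries are found.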
