
Diff of /code/trunk/doc/html/pcreunicode.html


revision 1297 by ph10, Sun Nov 11 20:27:03 2012 UTC revision 1298 by ph10, Fri Mar 22 16:13:13 2013 UTC
# Line 85  place. From release 7.3 of PCRE, the che Line 85  place. From release 7.3 of PCRE, the che
85  which are themselves derived from the Unicode specification. Earlier releases  which are themselves derived from the Unicode specification. Earlier releases
86  of PCRE followed the rules of RFC 2279, which allows the full range of 31-bit  of PCRE followed the rules of RFC 2279, which allows the full range of 31-bit
87  values (0 to 0x7FFFFFFF). The current check allows only values in the range U+0  values (0 to 0x7FFFFFFF). The current check allows only values in the range U+0
88  to U+10FFFF, excluding the surrogate area and the non-characters.  to U+10FFFF, excluding the surrogate area. (From release 8.33 the so-called
89    "non-character" code points are no longer excluded because Unicode corrigendum
90    #9 makes it clear that they should not be.)
91  </P>  </P>
92  <P>  <P>
93  Characters in the "Surrogate Area" of Unicode are reserved for use by UTF-16,  Characters in the "Surrogate Area" of Unicode are reserved for use by UTF-16,
# Line 96  surrogate thing is a fudge for UTF-16 wh Line 98  surrogate thing is a fudge for UTF-16 wh
98  UTF-32.)  UTF-32.)
99  </P>  </P>
100  <P>  <P>
 Also excluded are the "Non-Character" code points, which are U+FDD0 to U+FDEF  
 and the last two code points in each plane, U+??FFFE and U+??FFFF.  
 </P>  
 <P>  
101  If an invalid UTF-8 string is passed to PCRE, an error return is given. At  If an invalid UTF-8 string is passed to PCRE, an error return is given. At
102  compile time, the only additional information is the offset to the first byte  compile time, the only additional information is the offset to the first byte
103  of the failing character. The run-time functions <b>pcre_exec()</b> and  of the failing character. The run-time functions <b>pcre_exec()</b> and
# Line 135  U+D800 to U+DFFF are independent code po Line 133  U+D800 to U+DFFF are independent code po
133  must be used in pairs in the correct manner.  must be used in pairs in the correct manner.
134  </P>  </P>
135  <P>  <P>
 Excluded are the "Non-Character" code points, which are U+FDD0 to U+FDEF  
 and the last two code points in each plane, U+??FFFE and U+??FFFF.  
 </P>  
 <P>  
136  If an invalid UTF-16 string is passed to PCRE, an error return is given. At  If an invalid UTF-16 string is passed to PCRE, an error return is given. At
137  compile time, the only additional information is the offset to the first data  compile time, the only additional information is the offset to the first data
138  unit of the failing character. The run-time functions <b>pcre16_exec()</b> and  unit of the failing character. The run-time functions <b>pcre16_exec()</b> and
# Line 160  Validity of UTF-32 strings Line 154  Validity of UTF-32 strings
154  When you set the PCRE_UTF32 flag, the strings of 32-bit data units that are  When you set the PCRE_UTF32 flag, the strings of 32-bit data units that are
155  passed as patterns and subjects are (by default) checked for validity on entry  passed as patterns and subjects are (by default) checked for validity on entry
156  to the relevant functions.  This check allows only values in the range U+0  to the relevant functions.  This check allows only values in the range U+0
157  to U+10FFFF, excluding the surrogate area U+D800 to U+DFFF, and the  to U+10FFFF, excluding the surrogate area U+D800 to U+DFFF.
 "Non-Character" code points, which are U+FDD0 to U+FDEF and the last two  
 characters in each plane, U+??FFFE and U+??FFFF.  
158  </P>  </P>
159  <P>  <P>
160  If an invalid UTF-32 string is passed to PCRE, an error return is given. At  If an invalid UTF-32 string is passed to PCRE, an error return is given. At
# Line 261  Cambridge CB2 3QH, England. Line 253  Cambridge CB2 3QH, England.
253  REVISION  REVISION
254  </b><br>  </b><br>
255  <P>  <P>
256  Last updated: 11 November 2012  Last updated: 27 February 2013
257  <br>  <br>
258  Copyright &copy; 1997-2012 University of Cambridge.  Copyright &copy; 1997-2013 University of Cambridge.
259  <br>  <br>
260  <p>  <p>
261  Return to the <a href="index.html">PCRE index page</a>.  Return to the <a href="index.html">PCRE index page</a>.
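
Reviewer note: the behavioural change documented in this diff can be summarised as two code point predicates. The following is a minimal sketch in C, assuming the check is applied to one already-decoded code point at a time. The function names are hypothetical and are not part of the PCRE API, and this is not PCRE's actual validation code. It only restates the ranges given in the text: U+0 to U+10FFFF, the surrogate area U+D800 to U+DFFF, and the "Non-Character" code points U+FDD0 to U+FDEF plus the last two code points of each plane.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helpers for illustration only; not PCRE functions. */

/* Surrogate area: U+D800 to U+DFFF. */
static bool is_surrogate(uint32_t c)
{
    return c >= 0xD800 && c <= 0xDFFF;
}

/* "Non-Character" code points: U+FDD0 to U+FDEF, and the last two
   code points in each plane (U+??FFFE and U+??FFFF). */
static bool is_noncharacter(uint32_t c)
{
    return (c >= 0xFDD0 && c <= 0xFDEF) || (c & 0xFFFE) == 0xFFFE;
}

/* Rule described on the right-hand side of the diff (release 8.33 on):
   U+0 to U+10FFFF, excluding only the surrogate area. */
static bool valid_from_8_33(uint32_t c)
{
    return c <= 0x10FFFF && !is_surrogate(c);
}

/* Rule described on the left-hand side of the diff (before 8.33):
   as above, but non-characters are also rejected. */
static bool valid_before_8_33(uint32_t c)
{
    return valid_from_8_33(c) && !is_noncharacter(c);
}
```

Under this sketch, a code point such as U+FFFE is rejected by `valid_before_8_33()` and accepted by `valid_from_8_33()`, while surrogates and values above U+10FFFF are rejected by both.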

