371 |
.\" |
.\" |
372 |
page. |
page. |
373 |
|
|
374 |
|
PCRE_NO_UTF8_CHECK |
375 |
|
|
376 |
|
When PCRE_UTF8 is set, the validity of the pattern as a UTF-8 string is |
377 |
|
automatically checked. If an invalid UTF-8 sequence of bytes is found, |
378 |
|
\fBpcre_compile()\fR returns an error. If you already know that your pattern is |
379 |
|
valid, and you want to skip this check for performance reasons, you can set the |
380 |
|
PCRE_NO_UTF8_CHECK option. When it is set, the effect of passing an invalid |
381 |
|
UTF-8 string as a pattern is undefined. It may cause your program to crash. |
382 |
|
Note that there is a similar option for suppressing the checking of subject |
383 |
|
strings passed to \fBpcre_exec()\fR. |
384 |
|
|
385 |
|
|
386 |
.SH STUDYING A PATTERN |
.SH STUDYING A PATTERN |
387 |
.rs |
.rs |
388 |
.sp |
.sp |
710 |
or turned out to be anchored by virtue of its contents, it cannot be made |
or turned out to be anchored by virtue of its contents, it cannot be made |
711 |
unanchored at matching time. |
unanchored at matching time. |
712 |
|
|
713 |
|
When PCRE_UTF8 was set at compile time, the validity of the subject as a UTF-8 |
714 |
|
string is automatically checked. If an invalid UTF-8 sequence of bytes is |
715 |
|
found, \fBpcre_exec()\fR returns the error PCRE_ERROR_BADUTF8. If you already |
716 |
|
know that your subject is valid, and you want to skip this check for |
717 |
|
performance reasons, you can set the PCRE_NO_UTF8_CHECK option when calling |
718 |
|
\fBpcre_exec()\fR. When this option is set, the effect of passing an invalid |
719 |
|
UTF-8 string as a subject is undefined. It may cause your program to crash. |
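Because the behaviour is undefined when PCRE_NO_UTF8_CHECK is set and the subject is invalid, a caller that matches the same subject many times may prefer to validate it once itself and only then suppress the built-in check. The following is a minimal sketch of such a validator. The function name \fBis_valid_utf8()\fR is hypothetical and is not part of PCRE. It follows the RFC 3629 rules (no overlong forms, no surrogates, nothing above U+10FFFF), which may be stricter than the check PCRE performs internally.

```c
#include <stddef.h>

/* Hypothetical helper, NOT part of the PCRE API.
 * Returns 1 if the first len bytes of s are well-formed UTF-8
 * according to RFC 3629, and 0 otherwise. */
static int is_valid_utf8(const unsigned char *s, size_t len)
{
    size_t i = 0;

    while (i < len) {
        unsigned char c = s[i];
        size_t extra;       /* number of continuation bytes expected */
        size_t k;
        unsigned long cp;   /* code point being decoded */

        if (c < 0x80) {     /* single-byte (ASCII) character */
            i++;
            continue;
        } else if ((c & 0xE0) == 0xC0) {
            extra = 1; cp = c & 0x1F;
        } else if ((c & 0xF0) == 0xE0) {
            extra = 2; cp = c & 0x0F;
        } else if ((c & 0xF8) == 0xF0) {
            extra = 3; cp = c & 0x07;
        } else {
            return 0;       /* stray continuation byte or invalid lead byte */
        }

        if (len - i <= extra)
            return 0;       /* sequence is truncated at end of input */

        for (k = 1; k <= extra; k++) {
            if ((s[i + k] & 0xC0) != 0x80)
                return 0;   /* not a continuation byte */
            cp = (cp << 6) | (unsigned long)(s[i + k] & 0x3F);
        }

        /* Reject overlong encodings and out-of-range values. */
        if (extra == 1 && cp < 0x80) return 0;
        if (extra == 2 && cp < 0x800) return 0;
        if (extra == 3 && (cp < 0x10000UL || cp > 0x10FFFFUL)) return 0;

        /* Reject UTF-16 surrogate code points. */
        if (cp >= 0xD800 && cp <= 0xDFFF) return 0;

        i += extra + 1;
    }
    return 1;
}
```

If such a check succeeds, PCRE_NO_UTF8_CHECK can be passed in the \fIoptions\fR argument of later \fBpcre_exec()\fR calls on the same, unmodified subject. If it fails, the safer choice is to leave the option unset so that \fBpcre_exec()\fR reports PCRE_ERROR_BADUTF8 itself.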
720 |
|
|
721 |
There are also three further options that can be set only at matching time: |
There are also three further options that can be set only at matching time: |
722 |
|
|
723 |
PCRE_NOTBOL |
PCRE_NOTBOL |
892 |
use by callout functions that want to yield a distinctive error code. See the |
use by callout functions that want to yield a distinctive error code. See the |
893 |
\fBpcrecallout\fR documentation for details. |
\fBpcrecallout\fR documentation for details. |
894 |
|
|
895 |
|
PCRE_ERROR_BADUTF8 (-10) |
896 |
|
|
897 |
|
A string that contains an invalid UTF-8 byte sequence was passed as a subject. |
898 |
|
|
899 |
.SH EXTRACTING CAPTURED SUBSTRINGS BY NUMBER |
.SH EXTRACTING CAPTURED SUBSTRINGS BY NUMBER |
900 |
.rs |
.rs |
901 |
.sp |
.sp |
1035 |
appropriate. |
appropriate. |
1036 |
|
|
1037 |
.in 0 |
.in 0 |
1038 |
Last updated: 03 February 2003 |
Last updated: 20 August 2003 |
1039 |
.br |
.br |
1040 |
Copyright (c) 1997-2003 University of Cambridge. |
Copyright (c) 1997-2003 University of Cambridge. |