729
.SH "LOCALE SUPPORT"
.SH "LOCALE SUPPORT"
730
.rs
.rs
731
.sp
.sp
732
PCRE handles caseless matching, and determines whether characters are letters
PCRE handles caseless matching, and determines whether characters are letters,
733
digits, or whatever, by reference to a set of tables, indexed by character
digits, or whatever, by reference to a set of tables, indexed by character
734
value. When running in UTF-8 mode, this applies only to characters with codes
value. When running in UTF-8 mode, this applies only to characters with codes
735
less than 128. Higher-valued codes never match escapes such as \ew or \ed, but
less than 128. Higher-valued codes never match escapes such as \ew or \ed, but
736
can be tested with \ep if PCRE is built with Unicode character property
can be tested with \ep if PCRE is built with Unicode character property
737
support. The use of locales with Unicode is discouraged.
support. The use of locales with Unicode is discouraged. If you are handling
738

characters with codes greater than 127, you should either use UTF-8 and
739

Unicode, or use locales, but not try to mix the two.
740
.P
.P
741
An internal set of tables is created in the default C locale when PCRE is
PCRE contains an internal set of tables that are used when the final argument
742
built. This is used when the final argument of \fBpcre_compile()\fP is NULL,
of \fBpcre_compile()\fP is NULL. These are sufficient for many applications.
743
and is sufficient for many applications. An alternative set of tables can,
Normally, the internal tables recognize only ASCII characters. However, when
744
however, be supplied. These may be created in a different locale from the
PCRE is built, it is possible to cause the internal tables to be rebuilt in the
745
default. As more and more applications change to using Unicode, the need for
default "C" locale of the local system, which may cause them to be different.
746
this locale support is expected to die away.
.P |
747 |
|
The internal tables can always be overridden by tables supplied by the |
748 |
|
application that calls PCRE. These may be created in a different locale from |
749 |
|
the default. As more and more applications change to using Unicode, the need |
750 |
|
for this locale support is expected to die away. |
751 |
.P |
.P |
752 |
External tables are built by calling the \fBpcre_maketables()\fP function, |
External tables are built by calling the \fBpcre_maketables()\fP function, |
753 |
which has no arguments, in the relevant locale. The result can then be passed |
which has no arguments, in the relevant locale. The result can then be passed |
760 |
tables = pcre_maketables(); |
tables = pcre_maketables(); |
761 |
re = pcre_compile(..., tables); |
re = pcre_compile(..., tables); |
762 |
.sp |
.sp |
763 |
|
The locale name "fr_FR" is used on Linux and other Unix-like systems; if you |
764 |
|
are using Windows, the name for the French locale is "french". |
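As a rough illustration of what building tables "in the relevant locale" involves, the sketch below fills a pair of 256-entry tables, indexed by character value, using the standard C library classification functions. Those functions consult the current LC_CTYPE locale, so calling setlocale() beforehand changes the result. The names char_tables, build_tables, caseless_equal and the CT_ flag bits are hypothetical and chosen for this example only; this is not PCRE's actual table layout or the implementation of pcre_maketables().

```c
#include <ctype.h>

/* Hypothetical flag bits; these names are illustrative, not PCRE's. */
enum { CT_LETTER = 0x01, CT_DIGIT = 0x02, CT_SPACE = 0x04 };

/* Hypothetical table layout: one entry per possible byte value. */
typedef struct {
    unsigned char lower[256];  /* lower-case equivalent of each byte */
    unsigned char flags[256];  /* character-class bits for each byte */
} char_tables;

/* Fill the tables by asking the C library, which consults the current
   LC_CTYPE locale. Call setlocale() first to select another locale. */
static void build_tables(char_tables *t)
{
    for (int c = 0; c < 256; c++) {
        t->lower[c] = (unsigned char)tolower(c);
        t->flags[c] = (unsigned char)((isalpha(c) ? CT_LETTER : 0) |
                                      (isdigit(c) ? CT_DIGIT  : 0) |
                                      (isspace(c) ? CT_SPACE  : 0));
    }
}

/* A caseless comparison of two bytes is then a pair of table lookups. */
static int caseless_equal(const char_tables *t,
                          unsigned char a, unsigned char b)
{
    return t->lower[a] == t->lower[b];
}
```

In the default "C" locale only ASCII letters are classified as letters by these tables; after a successful setlocale() call for a locale such as French, accented characters in a single-byte encoding may be classified as letters as well, depending on the system.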
765 |
|
.P |
766 |
When \fBpcre_maketables()\fP runs, the tables are built in memory that is |
When \fBpcre_maketables()\fP runs, the tables are built in memory that is |
767 |
obtained via \fBpcre_malloc\fP. It is the caller's responsibility to ensure |
obtained via \fBpcre_malloc\fP. It is the caller's responsibility to ensure |
768 |
that the memory containing the tables remains available for as long as it is |
that the memory containing the tables remains available for as long as it is |