Diff of /code/trunk/doc/pcrepattern.3

revision 1403 by ph10, Tue Nov 12 17:44:07 2013 UTC revision 1404 by ph10, Tue Nov 19 15:36:57 2013 UTC
# Line 90  table. Line 90  table.
90  .SS "Disabling auto-possessification"  .SS "Disabling auto-possessification"
91  .rs  .rs
92  .sp  .sp
93  If a pattern starts with (*NO_AUTO_POSSESS), it has the same effect as setting  If a pattern starts with (*NO_AUTO_POSSESS), it has the same effect as setting
94  the PCRE_NO_AUTO_POSSESS option at compile time. This stops PCRE from making  the PCRE_NO_AUTO_POSSESS option at compile time. This stops PCRE from making
95  quantifiers possessive when what follows cannot match the repeated item. For  quantifiers possessive when what follows cannot match the repeated item. For
96  example, by default a+b is treated as a++b. For more details, see the  example, by default a+b is treated as a++b. For more details, see the
# Line 317  one of the following escape sequences th Line 317  one of the following escape sequences th
317    \en        linefeed (hex 0A)    \en        linefeed (hex 0A)
318    \er        carriage return (hex 0D)    \er        carriage return (hex 0D)
319    \et        tab (hex 09)    \et        tab (hex 09)
320    \e0dd      character with octal code 0dd    \e0dd      character with octal code 0dd
321    \eddd      character with octal code ddd, or back reference    \eddd      character with octal code ddd, or back reference
322    \eo{ddd..} character with octal code ddd..    \eo{ddd..} character with octal code ddd..
323    \exhh      character with hex code hh    \exhh      character with hex code hh
324    \ex{hhh..} character with hex code hhh.. (non-JavaScript mode)    \ex{hhh..} character with hex code hhh.. (non-JavaScript mode)
325    \euhhhh    character with hex code hhhh (JavaScript mode only)    \euhhhh    character with hex code hhhh (JavaScript mode only)
# Line 346  specifies two binary zeros followed by a Line 346  specifies two binary zeros followed by a
346  sure you supply two digits after the initial zero if the pattern character that  sure you supply two digits after the initial zero if the pattern character that
347  follows is itself an octal digit.  follows is itself an octal digit.
348  .P  .P
349  The escape \eo must be followed by a sequence of octal digits, enclosed in  The escape \eo must be followed by a sequence of octal digits, enclosed in
350  braces. An error occurs if this is not the case. This escape is a recent  braces. An error occurs if this is not the case. This escape is a recent
351  addition to Perl; it provides a way of specifying character code points as octal  addition to Perl; it provides a way of specifying character code points as octal
352  numbers greater than 0777, and it also allows octal numbers and back references  numbers greater than 0777, and it also allows octal numbers and back references
# Line 435  limited to certain values, as follows: Line 435  limited to certain values, as follows:
435    32-bit UTF-32 mode    less than 0x10ffff and a valid codepoint    32-bit UTF-32 mode    less than 0x10ffff and a valid codepoint
436  .sp  .sp
437  Invalid Unicode codepoints are the range 0xd800 to 0xdfff (the so-called  Invalid Unicode codepoints are the range 0xd800 to 0xdfff (the so-called
438  "surrogate" codepoints), and 0xffef.  "surrogate" codepoints), and 0xffef.
439  .  .
440  .  .
441  .SS "Escape sequences in character classes"  .SS "Escape sequences in character classes"
# Line 535  For compatibility with Perl, \es did not Line 535  For compatibility with Perl, \es did not
535  11), which made it different from the POSIX "space" class. However, Perl  11), which made it different from the POSIX "space" class. However, Perl
536  added VT at release 5.18, and PCRE followed suit at release 8.34. The default  added VT at release 5.18, and PCRE followed suit at release 8.34. The default
537  \es characters are now HT (9), LF (10), VT (11), FF (12), CR (13), and space  \es characters are now HT (9), LF (10), VT (11), FF (12), CR (13), and space
538  (32), which are defined as white space in the "C" locale. This list may vary if  (32), which are defined as white space in the "C" locale. This list may vary if
539  locale-specific matching is taking place; in particular, in some locales the  locale-specific matching is taking place; in particular, in some locales the
540  "non-breaking space" character (\exA0) is recognized as white space.  "non-breaking space" character (\exA0) is recognized as white space.
541  .P  .P
542  A "word" character is an underscore or any character that is a letter or digit.  A "word" character is an underscore or any character that is a letter or digit.
# Line 1257  The minus (hyphen) character can be used Line 1257  The minus (hyphen) character can be used
1257  character class. For example, [d-m] matches any letter between d and m,  character class. For example, [d-m] matches any letter between d and m,
1258  inclusive. If a minus character is required in a class, it must be escaped with  inclusive. If a minus character is required in a class, it must be escaped with
1259  a backslash or appear in a position where it cannot be interpreted as  a backslash or appear in a position where it cannot be interpreted as
1260  indicating a range, typically as the first or last character in the class, or  indicating a range, typically as the first or last character in the class, or
1261  immediately after a range. For example, [b-d-z] matches letters in the range b  immediately after a range. For example, [b-d-z] matches letters in the range b
1262  to d, a hyphen character, or z.  to d, a hyphen character, or z.
1263  .P  .P
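
Note on the hunk above: the hyphen rule for character classes can be checked with a short sketch. It uses Python's standard re module rather than PCRE itself, an assumption made only to keep the example self-contained. Python's engine treats a hyphen in these positions the same way for the cases shown, but it is not the PCRE implementation. The helper name matching_chars is hypothetical.

```python
import re


def matching_chars(char_class, candidates):
    """Return the candidate characters that the given class matches."""
    pattern = re.compile(char_class)
    return [c for c in candidates if pattern.fullmatch(c)]


candidates = "abcdemyz-"

# Hyphen immediately after a range: b-d is the range, "-" and "z" are literals.
after_range = matching_chars(r"[b-d-z]", candidates)

# Hyphen escaped with a backslash, or placed first or last in the class.
escaped = matching_chars(r"[a\-z]", candidates)
first = matching_chars(r"[-az]", candidates)
last = matching_chars(r"[az-]", candidates)

# An unescaped hyphen between two characters still denotes a range.
plain_range = matching_chars(r"[d-m]", candidates)
```

In each of the first four classes the hyphen is matched as a literal character; only in [d-m] does it act as a range operator.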
# Line 1376  other sequences, as follows: Line 1376  other sequences, as follows:
1376    [:upper:]  becomes  \ep{Lu}    [:upper:]  becomes  \ep{Lu}
1377    [:word:]   becomes  \ep{Xwd}    [:word:]   becomes  \ep{Xwd}
1378  .sp  .sp
1379  Negated versions, such as [:^alpha:] use \eP instead of \ep. Three other POSIX  Negated versions, such as [:^alpha:] use \eP instead of \ep. Three other POSIX
1380  classes are handled specially in UCP mode:  classes are handled specially in UCP mode:
1381  .TP 10  .TP 10
1382  [:graph:]  [:graph:]
1383  This matches characters that have glyphs that mark the page when printed. In  This matches characters that have glyphs that mark the page when printed. In
1384  Unicode property terms, it matches all characters with the L, M, N, P, S, or Cf  Unicode property terms, it matches all characters with the L, M, N, P, S, or Cf
1385  properties, except for:  properties, except for:
1386  .sp  .sp
1387    U+061C           Arabic Letter Mark    U+061C           Arabic Letter Mark
1388    U+180E           Mongolian Vowel Separator    U+180E           Mongolian Vowel Separator
1389    U+2066 - U+2069  Various "isolate"s    U+2066 - U+2069  Various "isolate"s
1390  .sp  .sp
1391  .TP 10  .TP 10
1392  [:print:]  [:print:]
1393  This matches the same characters as [:graph:] plus space characters that are  This matches the same characters as [:graph:] plus space characters that are
1394  not controls, that is, characters with the Zs property.  not controls, that is, characters with the Zs property.
1395  .TP 10  .TP 10
1396  [:punct:]  [:punct:]
# Line 1619  conditions, Line 1619  conditions,
1619  .\"  .\"
1620  can be made by name as well as by number.  can be made by name as well as by number.
1621  .P  .P
1622  Names consist of up to 32 alphanumeric characters and underscores, but must  Names consist of up to 32 alphanumeric characters and underscores, but must
1623  start with a non-digit. Named capturing parentheses are still allocated numbers  start with a non-digit. Named capturing parentheses are still allocated numbers
1624  as well as names, exactly as if the names were not present. The PCRE API  as well as names, exactly as if the names were not present. The PCRE API
1625  provides function calls for extracting the name-to-number translation table  provides function calls for extracting the name-to-number translation table
# Line 1650  for the first (and in this example, the Line 1650  for the first (and in this example, the
1650  matched. This saves searching to find which numbered subpattern it was.  matched. This saves searching to find which numbered subpattern it was.
1651  .P  .P
1652  If you make a back reference to a non-unique named subpattern from elsewhere in  If you make a back reference to a non-unique named subpattern from elsewhere in
1653  the pattern, the subpatterns to which the name refers are checked in the order  the pattern, the subpatterns to which the name refers are checked in the order
1654  in which they appear in the overall pattern. The first one that is set is used  in which they appear in the overall pattern. The first one that is set is used
1655  for the reference. For example, this pattern matches both "foofoo" and  for the reference. For example, this pattern matches both "foofoo" and
1656  "barbar" but not "foobar" or "barfoo":  "barbar" but not "foobar" or "barfoo":
1657  .sp  .sp
1658    (?:(?<n>foo)|(?<n>bar))\k<n>    (?:(?<n>foo)|(?<n>bar))\ek<n>
1659  .sp  .sp
1660  .P  .P
1661  If you make a subroutine call to a non-unique named subpattern, the one that  If you make a subroutine call to a non-unique named subpattern, the one that
# Line 2356  This makes the fragment independent of t Line 2356  This makes the fragment independent of t
2356  .sp  .sp
2357  Perl uses the syntax (?(<name>)...) or (?('name')...) to test for a used  Perl uses the syntax (?(<name>)...) or (?('name')...) to test for a used
2358  subpattern by name. For compatibility with earlier versions of PCRE, which had  subpattern by name. For compatibility with earlier versions of PCRE, which had
2359  this facility before Perl, the syntax (?(name)...) is also recognized.  this facility before Perl, the syntax (?(name)...) is also recognized.
2360  .P  .P
2361  Rewriting the above example to use a named subpattern gives this:  Rewriting the above example to use a named subpattern gives this:
2362  .sp  .sp
