958 |
the above patterns match "SUNDAY" as well as "Saturday". |
the above patterns match "SUNDAY" as well as "Saturday". |
959 |
. |
. |
960 |
. |
. |
961 |
|
.SH "DUPLICATE SUBPATTERN NUMBERS" |
962 |
|
.rs |
963 |
|
.sp |
964 |
|
Perl 5.10 introduced a feature whereby each alternative in a subpattern uses |
965 |
|
the same numbers for its capturing parentheses. Such a subpattern starts with |
966 |
|
(?| and is itself a non-capturing subpattern. For example, consider this |
967 |
|
pattern: |
968 |
|
.sp |
969 |
|
(?|(Sat)ur|(Sun))day |
970 |
|
.sp |
971 |
|
Because the two alternatives are inside a (?| group, both sets of capturing |
972 |
|
parentheses are numbered one. Thus, when the pattern matches, you can look |
973 |
|
at captured substring number one, whichever alternative matched. This construct |
974 |
|
is useful when you want to capture part, but not all, of one of a number of |
975 |
|
alternatives. Inside a (?| group, parentheses are numbered as usual, but the |
976 |
|
number is reset at the start of each branch. The numbers of any capturing |
977 |
|
buffers that follow the subpattern start after the highest number used in any |
978 |
|
branch. The following example is taken from the Perl documentation. |
979 |
|
The numbers underneath show in which buffer the captured content will be |
980 |
|
stored. |
981 |
|
.sp |
982 |
|
# before  ---------------branch-reset----------- after |
983 |
|
/ ( a )  (?| x ( y ) z | (p (q) r) | (t) u (v) ) ( z ) /x |
984 |
|
# 1            2         2  3        2     3     4 |
985 |
|
.sp |
986 |
|
A backreference or a recursive call to a numbered subpattern always refers to |
987 |
|
the first one in the pattern with the given number. |
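The numbering rule described above can be illustrated with a short Python sketch. It is not part of PCRE and does not call any PCRE function: the name branch_reset_numbers is hypothetical, and the scanner assumes a pattern that contains only plain capturing parentheses, (?: groups, and (?| groups (no character classes, named groups, or comments). It returns the number that each capturing parenthesis would receive, in left-to-right order.

```python
def branch_reset_numbers(pattern):
    """Return the capture number of each capturing "(" in left-to-right order.

    Illustrative only: handles plain "(", "(?:", and "(?|" groups.
    Any other "(?" construct is treated as non-capturing.
    """
    numbers = []   # number assigned to each capturing "(" as it is met
    highest = 0    # last number handed out on the current path
    stack = []     # one entry per open group; None for ordinary groups
    i = 0
    while i < len(pattern):
        ch = pattern[i]
        if ch == "\\":
            i += 2  # skip an escaped character
            continue
        if ch == "(":
            if pattern.startswith("(?|", i):
                # Branch-reset group: remember the count on entry and the
                # highest count reached by any branch seen so far.
                stack.append({"start": highest, "max": highest})
                i += 3  # step past "(?|" so its "|" is not read as a branch
                continue
            if pattern.startswith("(?", i):
                stack.append(None)  # treated as non-capturing
            else:
                highest += 1
                numbers.append(highest)
                stack.append(None)
        elif ch == "|" and stack and stack[-1] is not None:
            # New branch directly inside a (?| group: reset the counter.
            frame = stack[-1]
            frame["max"] = max(frame["max"], highest)
            highest = frame["start"]
        elif ch == ")" and stack:
            frame = stack.pop()
            if frame is not None:
                # Groups after (?| ... ) continue from the highest number
                # used in any branch.
                highest = max(frame["max"], highest)
        i += 1
    return numbers


print(branch_reset_numbers("(?|(Sat)ur|(Sun))day"))
# [1, 1]
print(branch_reset_numbers(
    "( a ) (?| x ( y ) z | (p (q) r) | (t) u (v) ) ( z )"))
# [1, 2, 2, 3, 2, 3, 4]
```

The second call reproduces the row of numbers shown under the example taken from the Perl documentation.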
988 |
|
.P |
989 |
|
An alternative approach to using this "branch reset" feature is to use |
990 |
|
duplicate named subpatterns, as described in the next section. |
991 |
|
. |
992 |
|
. |
993 |
.SH "NAMED SUBPATTERNS" |
.SH "NAMED SUBPATTERNS" |
994 |
.rs |
.rs |
995 |
.sp |
.sp |
1039 |
(?<DN>Sat)(?:urday)? |
(?<DN>Sat)(?:urday)? |
1040 |
.sp |
.sp |
1041 |
There are five capturing substrings, but only one is ever set after a match. |
There are five capturing substrings, but only one is ever set after a match. |
1042 |
|
(An alternative way of solving this problem is to use a "branch reset" |
1043 |
|
subpattern, as described in the previous section.) |
1044 |
|
.P |
1045 |
The convenience function for extracting the data by name returns the substring |
The convenience function for extracting the data by name returns the substring |
1046 |
for the first (and in this example, the only) subpattern of that name that |
for the first (and in this example, the only) subpattern of that name that |
1047 |
matched. This saves searching to find which numbered subpattern it was. If you |
matched. This saves searching to find which numbered subpattern it was. If you |
1933 |
.rs |
.rs |
1934 |
.sp |
.sp |
1935 |
.nf |
.nf |
1936 |
Last updated: 29 May 2007 |
Last updated: 11 June 2007 |
1937 |
Copyright (c) 1997-2007 University of Cambridge. |
Copyright (c) 1997-2007 University of Cambridge. |
1938 |
.fi |
.fi |