1376 |
</b><br> |
</b><br> |
1377 |
<P> |
<P> |
1378 |
The subject string is passed to <b>pcre_exec()</b> as a pointer in |
The subject string is passed to <b>pcre_exec()</b> as a pointer in |
1379 |
<i>subject</i>, a length in <i>length</i>, and a starting byte offset in |
<i>subject</i>, a length (in bytes) in <i>length</i>, and a starting byte offset |
1380 |
<i>startoffset</i>. In UTF-8 mode, the byte offset must point to the start of a |
in <i>startoffset</i>. In UTF-8 mode, the byte offset must point to the start of |
1381 |
UTF-8 character. Unlike the pattern string, the subject may contain binary zero |
a UTF-8 character. Unlike the pattern string, the subject may contain binary |
1382 |
bytes. When the starting offset is zero, the search for a match starts at the |
zero bytes. When the starting offset is zero, the search for a match starts at |
1383 |
beginning of the subject, and this is by far the most common case. |
the beginning of the subject, and this is by far the most common case. |
1384 |
</P> |
</P> |
1385 |
<P> |
<P> |
1386 |
A non-zero starting offset is useful when searching for another match in the |
A non-zero starting offset is useful when searching for another match in the |
1418 |
kinds of parenthesized subpattern that do not cause substrings to be captured. |
kinds of parenthesized subpattern that do not cause substrings to be captured. |
1419 |
</P> |
</P> |
1420 |
<P> |
<P> |
1421 |
Captured substrings are returned to the caller via a vector of integer offsets |
Captured substrings are returned to the caller via a vector of integers whose |
1422 |
whose address is passed in <i>ovector</i>. The number of elements in the vector |
address is passed in <i>ovector</i>. The number of elements in the vector is |
1423 |
is passed in <i>ovecsize</i>, which must be a non-negative number. <b>Note</b>: |
passed in <i>ovecsize</i>, which must be a non-negative number. <b>Note</b>: this |
1424 |
this argument is NOT the size of <i>ovector</i> in bytes. |
argument is NOT the size of <i>ovector</i> in bytes. |
1425 |
</P> |
</P> |
1426 |
<P> |
<P> |
1427 |
The first two-thirds of the vector is used to pass back captured substrings, |
The first two-thirds of the vector is used to pass back captured substrings, |
1428 |
each substring using a pair of integers. The remaining third of the vector is |
each substring using a pair of integers. The remaining third of the vector is |
1429 |
used as workspace by <b>pcre_exec()</b> while matching capturing subpatterns, |
used as workspace by <b>pcre_exec()</b> while matching capturing subpatterns, |
1430 |
and is not available for passing back information. The length passed in |
and is not available for passing back information. The number passed in |
1431 |
<i>ovecsize</i> should always be a multiple of three. If it is not, it is |
<i>ovecsize</i> should always be a multiple of three. If it is not, it is |
1432 |
rounded down. |
rounded down. |
1433 |
</P> |
</P> |
1434 |
<P> |
<P> |
1435 |
When a match is successful, information about captured substrings is returned |
When a match is successful, information about captured substrings is returned |
1436 |
in pairs of integers, starting at the beginning of <i>ovector</i>, and |
in pairs of integers, starting at the beginning of <i>ovector</i>, and |
1437 |
continuing up to two-thirds of its length at the most. The first element of a |
continuing up to two-thirds of its length at the most. The first element of |
1438 |
pair is set to the offset of the first character in a substring, and the second |
each pair is set to the byte offset of the first character in a substring, and |
1439 |
is set to the offset of the first character after the end of a substring. The |
the second is set to the byte offset of the first character after the end of a |
1440 |
first pair, <i>ovector[0]</i> and <i>ovector[1]</i>, identify the portion of the |
substring. <b>Note</b>: these values are always byte offsets, even in UTF-8 |
1441 |
subject string matched by the entire pattern. The next pair is used for the |
mode. They are not character counts. |
1442 |
first capturing subpattern, and so on. The value returned by <b>pcre_exec()</b> |
</P> |
1443 |
is one more than the highest numbered pair that has been set. For example, if |
<P> |
1444 |
two substrings have been captured, the returned value is 3. If there are no |
The first pair of integers, <i>ovector[0]</i> and <i>ovector[1]</i>, identify the |
1445 |
capturing subpatterns, the return value from a successful match is 1, |
portion of the subject string matched by the entire pattern. The next pair is |
1446 |
indicating that just the first pair of offsets has been set. |
used for the first capturing subpattern, and so on. The value returned by |
1447 |
|
<b>pcre_exec()</b> is one more than the highest numbered pair that has been set. |
1448 |
|
For example, if two substrings have been captured, the returned value is 3. If |
1449 |
|
there are no capturing subpatterns, the return value from a successful match is |
1450 |
|
1, indicating that just the first pair of offsets has been set. |
1451 |
</P> |
</P> |
1452 |
<P> |
<P> |
1453 |
If a capturing subpattern is matched repeatedly, it is the last portion of the |
If a capturing subpattern is matched repeatedly, it is the last portion of the |
1456 |
<P> |
<P> |
1457 |
If the vector is too small to hold all the captured substring offsets, it is |
If the vector is too small to hold all the captured substring offsets, it is |
1458 |
used as far as possible (up to two-thirds of its length), and the function |
used as far as possible (up to two-thirds of its length), and the function |
1459 |
returns a value of zero. In particular, if the substring offsets are not of |
returns a value of zero. If the substring offsets are not of interest, |
1460 |
interest, <b>pcre_exec()</b> may be called with <i>ovector</i> passed as NULL and |
<b>pcre_exec()</b> may be called with <i>ovector</i> passed as NULL and |
1461 |
<i>ovecsize</i> as zero. However, if the pattern contains back references and |
<i>ovecsize</i> as zero. However, if the pattern contains back references and |
1462 |
the <i>ovector</i> is not big enough to remember the related substrings, PCRE |
the <i>ovector</i> is not big enough to remember the related substrings, PCRE |
1463 |
has to get additional memory for use during matching. Thus it is usually |
has to get additional memory for use during matching. Thus it is usually |
1976 |
</P> |
</P> |
1977 |
<br><a name="SEC22" href="#TOC1">REVISION</a><br> |
<br><a name="SEC22" href="#TOC1">REVISION</a><br> |
1978 |
<P> |
<P> |
1979 |
Last updated: 12 April 2008 |
Last updated: 24 August 2008 |
1980 |
<br> |
<br> |
1981 |
Copyright © 1997-2008 University of Cambridge. |
Copyright © 1997-2008 University of Cambridge. |
1982 |
<br> |
<br> |