--- code/trunk/doc/html/pcrecallout.html 2007/02/24 21:40:24 71
+++ code/trunk/doc/html/pcrecallout.html 2007/03/06 12:27:42 99
@@ -3,12 +3,22 @@
pcrecallout specification
-This HTML document has been generated automatically from the original man page.
-If there is any nonsense in it, please consult the man page, in case the
-conversion went wrong.
+

pcrecallout man page

+

+Return to the PCRE index page.
+

+

+This page is part of the PCRE HTML documentation. It was generated automatically
+from the original man page. If there is any nonsense in it, please consult the
+man page, in case the conversion went wrong.
+


PCRE CALLOUTS

@@ -26,18 +36,48 @@
function is to be called. Different callout points can be identified by putting a number less than 256 after the letter C. The default value is zero. For example, this pattern has two callout points:
-

-

-  (?C1)\dabc(?C2)def
-
-

-

-During matching, when PCRE reaches a callout point (and pcre_callout is
-set), the external function is called. Its only argument is a pointer to a
-pcre_callout block. This contains the following variables:
+  (?C1)\deabc(?C2)def
+
+If the PCRE_AUTO_CALLOUT option bit is set when pcre_compile() is called,
+PCRE automatically inserts callouts, all with number 255, before each item in
+the pattern. For example, if PCRE_AUTO_CALLOUT is used with the pattern
+

+  A(\d{2}|--)
+
+it is processed as if it were
+
+
+(?C255)A(?C255)((?C255)\d{2}(?C255)|(?C255)-(?C255)-(?C255))(?C255)
+
+
+Notice that there is a callout before and after each parenthesis and
+alternation bar. Automatic callouts can be used for tracking the progress of
+pattern matching. The
+pcretest
+command has an option that sets automatic callouts; when it is used, the output
+indicates how the pattern is matched. This is useful information when you are
+trying to optimize the performance of a particular pattern.

+
MISSING CALLOUTS

+You should be aware that, because of optimizations in the way PCRE matches
+patterns, callouts sometimes do not happen. For example, if the pattern is
+

+  ab(?C4)cd
+
+PCRE knows that any matching string must contain the letter "d". If the subject
+string is "abyz", the lack of "d" means that matching doesn't ever start, and
+the callout is never reached. However, with "abyd", though the result is still
+no match, the callout is obeyed.
+

+
THE CALLOUT INTERFACE
+

+During matching, when PCRE reaches a callout point, the external function
+defined by pcre_callout is called (if it is set). This applies to both
+the pcre_exec() and the pcre_dfa_exec() matching functions. The
+only argument to the callout function is a pointer to a pcre_callout
+block. This structure contains the following fields:

   int          version;
   int          callout_number;
@@ -49,61 +89,89 @@
   int          capture_top;
   int          capture_last;
   void        *callout_data;
-
-

-

+  int          pattern_position;
+  int          next_item_length;
+
The version field is an integer containing the version number of the
-block format. The current version is zero. The version number may change in
-future if additional fields are added, but the intention is never to remove any
-of the existing fields.
+block format. The initial version was 0; the current version is 1. The version
+number will change again in future if additional fields are added, but the
+intention is never to remove any of the existing fields.

The callout_number field contains the number of the callout, as compiled
-into the pattern (that is, the number after ?C).
+into the pattern (that is, the number after ?C for manual callouts, and 255 for
+automatically generated callouts).

The offset_vector field is a pointer to the vector of offsets that was
-passed by the caller to pcre_exec(). The contents can be inspected in
-order to extract substrings that have been matched so far, in the same way as
-for extracting substrings after a match has completed.
+passed by the caller to pcre_exec() or pcre_dfa_exec(). When
+pcre_exec() is used, the contents can be inspected in order to extract
+substrings that have been matched so far, in the same way as for extracting
+substrings after a match has completed. For pcre_dfa_exec() this field is
+not useful.

-The subject and subject_length fields contain copies the values
+The subject and subject_length fields contain copies of the values
that were passed to pcre_exec().

The start_match field contains the offset within the subject at which the current match attempt started. If the pattern is not anchored, the callout
-function may be called several times for different starting points.
+function may be called several times from the same point in the pattern for
+different starting points in the subject.

The current_position field contains the offset within the subject of the current match pointer.

-The capture_top field contains one more than the number of the highest
-numbered captured substring so far. If no substrings have been captured,
-the value of capture_top is one.
+When the pcre_exec() function is used, the capture_top field
+contains one more than the number of the highest numbered captured substring so
+far. If no substrings have been captured, the value of capture_top is
+one. This is always the case when pcre_dfa_exec() is used, because it
+does not support captured substrings.

The capture_last field contains the number of the most recently captured
-substring.
+substring. If no substrings have been captured, its value is -1. This is always
+the case when pcre_dfa_exec() is used.

The callout_data field contains a value that is passed to
-pcre_exec() by the caller specifically so that it can be passed back in
-callouts. It is passed in the pcre_callout field of the pcre_extra
-data structure. If no such data was passed, the value of callout_data in
-a pcre_callout block is NULL. There is a description of the
-pcre_extra structure in the pcreapi documentation.
-

-
RETURN VALUES
-

-The callout function returns an integer. If the value is zero, matching
-proceeds as normal. If the value is greater than zero, matching fails at the
-current point, but backtracking to test other possibilities goes ahead, just as
-if a lookahead assertion had failed. If the value is less than zero, the match
-is abandoned, and pcre_exec() returns the value.
+pcre_exec() or pcre_dfa_exec() specifically so that it can be
+passed back in callouts. It is passed in the pcre_callout field of the
+pcre_extra data structure. If no such data was passed, the value of
+callout_data in a pcre_callout block is NULL. There is a
+description of the pcre_extra structure in the
+pcreapi
+documentation.
+

+

+The pattern_position field is present from version 1 of the
+pcre_callout structure. It contains the offset to the next item to be
+matched in the pattern string.
+

+

+The next_item_length field is present from version 1 of the
+pcre_callout structure. It contains the length of the next item to be
+matched in the pattern string. When the callout immediately precedes an
+alternation bar, a closing parenthesis, or the end of the pattern, the length
+is zero. When the callout precedes an opening parenthesis, the length is that
+of the entire subpattern.
+

+

+The pattern_position and next_item_length fields are intended to
+help in distinguishing between different automatic callouts, which all have the
+same callout number. However, they are set for all callouts.
+

+
RETURN VALUES
+

+The external callout function returns an integer to PCRE. If the value is zero,
+matching proceeds as normal. If the value is greater than zero, matching fails
+at the current point, but the testing of other matching possibilities goes
+ahead, just as if a lookahead assertion had failed. If the value is less than
+zero, the match is abandoned, and pcre_exec() (or pcre_dfa_exec())
+returns the negative value.

Negative values should normally be chosen from the set of PCRE_ERROR_xxx
@@ -111,7 +179,21 @@
The error number PCRE_ERROR_CALLOUT is reserved for use by callout functions; it will never be used by PCRE itself.

+
AUTHOR

-Last updated: 21 January 2003
+Philip Hazel
+
+University Computing Service
+
+Cambridge CB2 3QH, England.
+
+

+
REVISION
+

+Last updated: 06 March 2007
+
+Copyright © 1997-2007 University of Cambridge.
-Copyright © 1997-2003 University of Cambridge.
+

+Return to the PCRE index page.
+
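
Illustrative sketch (not part of the change set): the new revision documents two version 1 fields, pattern_position and next_item_length, and restates the return-value rules for callout functions. The C below is a minimal sketch of how a callout might use them. It is not PCRE source. To compile without pcre.h it uses a stand-in struct, demo_callout_block, whose field names follow the documentation; the types of the five fields that fall between the diff hunks are assumed. The names demo_context, demo_next_item, demo_callout and DEMO_ERROR_CALLOUT are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Stand-in for the documented pcre_callout block (an assumption made so the
   sketch builds without pcre.h). Field order follows the documentation. */
typedef struct demo_callout_block {
  int          version;
  int          callout_number;
  int         *offset_vector;      /* type assumed; elided by the diff hunk */
  const char  *subject;            /* type assumed; elided by the diff hunk */
  int          subject_length;     /* type assumed; elided by the diff hunk */
  int          start_match;        /* type assumed; elided by the diff hunk */
  int          current_position;   /* type assumed; elided by the diff hunk */
  int          capture_top;
  int          capture_last;
  void        *callout_data;
  int          pattern_position;   /* present from version 1 */
  int          next_item_length;   /* present from version 1 */
} demo_callout_block;

/* Hypothetical per-match context handed back through callout_data. */
typedef struct demo_context {
  const char *pattern;    /* the pattern string that was compiled */
  int         max_calls;  /* abandon the match after this many callouts */
  int         calls;      /* callouts seen so far */
} demo_context;

/* Placeholder negative code; real code would use PCRE_ERROR_CALLOUT
   from pcre.h instead of this hypothetical macro. */
#define DEMO_ERROR_CALLOUT (-9)

/* Copy the text of the next pattern item into buf (truncating if needed).
   Returns the number of characters copied, or -1 if the block predates
   version 1 and so does not carry the two position fields. */
int demo_next_item(const demo_callout_block *cb, const char *pattern,
                   char *buf, size_t bufsize)
{
  size_t len;
  assert(cb != NULL && pattern != NULL && buf != NULL && bufsize > 0);
  if (cb->version < 1 || cb->next_item_length < 0) return -1;
  len = (size_t)cb->next_item_length;
  if (len >= bufsize) len = bufsize - 1;
  memcpy(buf, pattern + cb->pattern_position, len);
  buf[len] = '\0';
  return (int)len;
}

/* Callout with the documented return convention:
   0 proceeds as normal, > 0 fails at the current point but lets other
   possibilities be tried, < 0 abandons the match. */
int demo_callout(demo_callout_block *cb)
{
  demo_context *ctx = (demo_context *)cb->callout_data;
  if (ctx == NULL) return 0;               /* no callout data was passed */
  if (++ctx->calls > ctx->max_calls)
    return DEMO_ERROR_CALLOUT;             /* negative: abandon the match */
  if (cb->callout_number == 2)
    return 1;                              /* positive: fail at this point */
  return 0;                                /* zero: carry on matching */
}
```

In a real program, a function of this shape taking the library's own callout block type would be assigned to pcre_callout, and the context pointer would travel through the pcre_extra structure as the documentation describes. Checking version before reading the two position fields keeps the callout safe against a version 0 block.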