--- code/branches/pcre16/doc/html/pcrejit.html 2011/12/12 12:15:17 800 +++ code/branches/pcre16/doc/html/pcrejit.html 2011/12/12 16:23:37 801 @@ -20,10 +20,11 @@
@@ -57,11 +58,17 @@ fails.
-A program can tell if JIT support is available by calling pcre_config() -with the PCRE_CONFIG_JIT option. The result is 1 when JIT is available, and 0 -otherwise. However, a simple program does not need to check this in order to -use JIT. The API is implemented in a way that falls back to the ordinary PCRE -code if JIT is not available. +A program that is linked with PCRE 8.20 or later can tell if JIT support is +available by calling pcre_config() with the PCRE_CONFIG_JIT option. The +result is 1 when JIT is available, and 0 otherwise. However, a simple program +does not need to check this in order to use JIT. The API is implemented in a +way that falls back to the ordinary PCRE code if JIT is not available. ++
+If your program may sometimes be linked with versions of PCRE that are older +than 8.20, but you want to use JIT when it is available, you can test +the values of PCRE_MAJOR and PCRE_MINOR, or the existence of a JIT macro such +as PCRE_CONFIG_JIT, for compile-time control of your code.
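A minimal sketch of the compile-time test described in this hunk, combined with the run-time pcre_config() query. The names DEMO_HAVE_JIT_API and jit_available() are hypothetical and are not part of PCRE. The `#include <pcre.h>` line is left commented out as an assumption, so that the sketch builds on a machine without PCRE installed; in that case it takes the "no JIT" branch. A real program would enable the include and link with the PCRE library.

```c
/* #include <pcre.h> */  /* enable in a real program; defines PCRE_MAJOR etc. */

/* DEMO_HAVE_JIT_API is a hypothetical name used only in this sketch. */
#if defined(PCRE_CONFIG_JIT)
#  define DEMO_HAVE_JIT_API 1
#elif defined(PCRE_MAJOR) && defined(PCRE_MINOR)
#  if PCRE_MAJOR > 8 || (PCRE_MAJOR == 8 && PCRE_MINOR >= 20)
#    define DEMO_HAVE_JIT_API 1
#  else
#    define DEMO_HAVE_JIT_API 0
#  endif
#else
#  define DEMO_HAVE_JIT_API 0
#endif

/* Returns 1 if the linked PCRE reports JIT support, 0 otherwise. */
int jit_available(void)
{
#if DEMO_HAVE_JIT_API
    int jit = 0;
    /* pcre_config() returns 0 on success and stores the answer in jit. */
    if (pcre_config(PCRE_CONFIG_JIT, &jit) != 0)
        return 0;
    return jit != 0;
#else
    /* Headers older than 8.20 (or no PCRE headers at all): no JIT API. */
    return 0;
#endif
}
```

The compile-time test only decides whether the JIT-related names may appear in the source. Whether JIT was actually built into the library is still a run-time question, which is why the sketch asks pcre_config() as well.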
@@ -75,6 +82,21 @@ no longer needed instead of just freeing it yourself. This ensures that any JIT data is also freed. +For a program that may be linked with pre-8.20 versions of PCRE, you can insert +
+ #ifndef PCRE_STUDY_JIT_COMPILE + #define PCRE_STUDY_JIT_COMPILE 0 + #endif ++so that no option is passed to pcre_study(), and then use something like +this to free the study data: +
+ #ifdef PCRE_CONFIG_JIT + pcre_free_study(study_ptr); + #else + pcre_free(study_ptr); + #endif +In some circumstances you may need to call additional functions. These are described in the section entitled "Controlling the JIT stack" @@ -116,12 +138,8 @@
The unsupported pattern items are:
- \C match a single byte; not supported in UTF-8 mode + \C match a single byte; not supported in UTF-8 mode (?Cn) callouts - (?(<name>)... conditional test on setting of a named subpattern - (?(R)... conditional test on whole pattern recursion - (?(Rn)... conditional test on recursion, by number - (?(R&name)... conditional test on recursion, by name (*COMMIT) ) (*MARK) ) (*PRUNE) ) the backtracking control verbs @@ -167,7 +185,10 @@ By default, it uses 32K on the machine stack. However, some large or complicated patterns need more than this. The error PCRE_ERROR_JIT_STACKLIMIT is given when there is not enough stack. Three functions are provided for -managing blocks of memory for use as JIT stacks. +managing blocks of memory for use as JIT stacks. There is further discussion +about the use of JIT stacks in the section entitled +"JIT stack FAQ" +below.-
The pcre_jit_stack_alloc() function creates a JIT stack. Its arguments @@ -234,8 +255,86 @@ and pcre_assign_jit_stack() does nothing unless the extra argument is non-NULL and points to a pcre_extra block that is the result of a successful study with PCRE_STUDY_JIT_COMPILE. ++
JIT STACK FAQ
+(1) Why do we need JIT stacks? ++
+PCRE (and JIT) is a recursive, depth-first engine, so it needs a stack where +the local data of the current node is pushed before checking its child nodes. +Allocating real machine stack on some platforms is difficult. For example, the +stack chain needs to be updated every time if we extend the stack on PowerPC. +Although it is possible, its updating time overhead decreases performance. So +we do the recursion in memory. +
+(2) Why don't we simply allocate blocks of memory with malloc()? ++
+Modern operating systems have a nice feature: they can reserve an address space +instead of allocating memory. We can safely allocate memory pages inside this +address space, so the stack could grow without moving memory data (this is +important because of pointers). Thus we can allocate 1M address space, and use +only a single memory page (usually 4K) if that is enough. However, we can still +grow up to 1M anytime if needed. +
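The reserve-then-commit idea in answer (2) can be sketched on a POSIX system with mmap() and mprotect(). This is an illustration of the general technique only, not PCRE's actual JIT stack code; the demo_stack type and its functions are hypothetical names, and the use of MAP_ANONYMOUS assumes a platform that provides it (Linux, the BSDs, macOS).

```c
#define _DEFAULT_SOURCE          /* for MAP_ANONYMOUS on glibc */
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

typedef struct {
    char  *base;       /* start of the reserved range; never moves */
    size_t reserved;   /* bytes of address space reserved */
    size_t committed;  /* bytes currently readable and writable */
} demo_stack;

/* Reserve address space only: PROT_NONE pages are not yet usable. */
static int demo_stack_reserve(demo_stack *s, size_t max_bytes)
{
    void *p;
    assert(s != NULL);
    p = mmap(NULL, max_bytes, PROT_NONE,
             MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED)
        return -1;
    s->base = p;
    s->reserved = max_bytes;
    s->committed = 0;
    return 0;
}

/* Make the first new_bytes usable, rounded up to whole pages. */
static int demo_stack_grow(demo_stack *s, size_t new_bytes)
{
    size_t page, want;
    assert(s != NULL && s->base != NULL);
    page = (size_t)sysconf(_SC_PAGESIZE);
    want = (new_bytes + page - 1) / page * page;
    if (want > s->reserved)
        return -1;               /* would exceed the reservation */
    if (want <= s->committed)
        return 0;                /* already large enough */
    if (mprotect(s->base, want, PROT_READ | PROT_WRITE) != 0)
        return -1;
    s->committed = want;
    return 0;
}

/* Give the whole range back to the operating system. */
static void demo_stack_free(demo_stack *s)
{
    assert(s != NULL);
    if (s->base != NULL)
        munmap(s->base, s->reserved);
    s->base = NULL;
    s->reserved = s->committed = 0;
}
```

Because growing only changes page protections inside the reserved range, `base` stays the same, so pointers into the stack remain valid. A malloc()/realloc() based stack cannot promise this, since realloc() may move the block.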
+(3) Who "owns" a JIT stack? ++
+The owner of the stack is the user program, not the JIT studied pattern or +anything else. The user program must ensure that if a stack is used by +pcre_exec() (that is, it is assigned to the pattern currently running), +that stack is not used by any other thread (to avoid overwriting the same +memory area). The best practice for multithreaded programs is to allocate a +stack for each thread, and return this stack through the JIT callback function. +
+(4) When should a JIT stack be freed? ++
+You can free a JIT stack at any time, as long as it will not be used by +pcre_exec() again. When you assign the stack to a pattern, only a pointer +is set. There is no reference counting or any other magic. You can free the +patterns and stacks in any order, anytime. Just do not call +pcre_exec() with a pattern pointing to an already freed stack, as that +will cause SEGFAULT. (Also, do not free a stack currently used by +pcre_exec() in another thread). You can also replace the stack for a +pattern at any time. You can even free the previous stack before assigning a +replacement. +
+(5) Should I allocate/free a stack every time before/after calling +pcre_exec()? ++
+No, because this is too costly in terms of resources. However, you could +implement some clever idea which releases the stack if it is not used in, let's +say, two minutes. The JIT callback can help to achieve this without keeping a +list of the currently JIT studied patterns. +
+(6) OK, the stack is for long term memory allocation. But what happens if a +pattern causes stack overflow with a stack of 1M? Is that 1M kept until the +stack is freed? ++
+Especially on embedded systems, it might be a good idea to release +memory sometimes without freeing the stack. There is no API for this at the +moment. Probably a function call which returns with the currently allocated +memory for any stack and another which allows releasing memory (shrinking the +stack) would be a good idea if someone needs this. +
+(7) This is too much of a headache. Isn't there any better solution for JIT +stack handling? +-
+No, thanks to Windows. If POSIX threads were used everywhere, we could throw +out this complicated API.
This is a single-threaded example that specifies a JIT stack without using a callback. @@ -260,22 +359,22 @@
+Philip Hazel (FAQ by Zoltan Herczeg)
University Computing Service
Cambridge CB2 3QH, England.
-Last updated: 19 October 2011
+Last updated: 26 November 2011
Copyright © 1997-2011 University of Cambridge.