Diff of /code/trunk/doc/pcrejit.3

revision 683 by ph10, Tue Sep 6 10:37:15 2011 UTC revision 761 by ph10, Tue Nov 22 12:24:26 2011 UTC
# Line 28  JIT. The support is limited to the follo Line 28  JIT. The support is limited to the follo
28    ARM v5, v7, and Thumb2    ARM v5, v7, and Thumb2
29    Intel x86 32-bit and 64-bit    Intel x86 32-bit and 64-bit
30    MIPS 32-bit    MIPS 32-bit
31    Power PC 32-bit and 64-bit    Power PC 32-bit and 64-bit (experimental)
32  .sp  .sp
33  If --enable-jit is set on an unsupported platform, compilation fails.  The Power PC support is designated as experimental because it has not been
34    fully tested. If --enable-jit is set on an unsupported platform, compilation
35    fails.
36  .P  .P
37  A program can tell if JIT support is available by calling \fBpcre_config()\fP  A program can tell if JIT support is available by calling \fBpcre_config()\fP
38  with the PCRE_CONFIG_JIT option. The result is 1 when JIT is available, and 0  with the PCRE_CONFIG_JIT option. The result is 1 when JIT is available, and 0
# Line 47  You have to do two things to make use of Line 49  You have to do two things to make use of
49    (1) Call \fBpcre_study()\fP with the PCRE_STUDY_JIT_COMPILE option for    (1) Call \fBpcre_study()\fP with the PCRE_STUDY_JIT_COMPILE option for
50        each compiled pattern, and pass the resulting \fBpcre_extra\fP block to        each compiled pattern, and pass the resulting \fBpcre_extra\fP block to
51        \fBpcre_exec()\fP.        \fBpcre_exec()\fP.
52    .sp
53    (2) Use \fBpcre_free_study()\fP to free the \fBpcre_extra\fP block when it is    (2) Use \fBpcre_free_study()\fP to free the \fBpcre_extra\fP block when it is
54        no longer needed instead of just freeing it yourself. This        no longer needed instead of just freeing it yourself. This
55        ensures that any JIT data is also freed.        ensures that any JIT data is also freed.
56  .sp  .sp
57  In some circumstances you may need to call additional functions. These are  In some circumstances you may need to call additional functions. These are
# Line 75  interpretive code. Line 77  interpretive code.
77  If the JIT compiler finds an unsupported item, no JIT data is generated. You  If the JIT compiler finds an unsupported item, no JIT data is generated. You
78  can find out if JIT execution is available after studying a pattern by calling  can find out if JIT execution is available after studying a pattern by calling
79  \fBpcre_fullinfo()\fP with the PCRE_INFO_JIT option. A result of 1 means that  \fBpcre_fullinfo()\fP with the PCRE_INFO_JIT option. A result of 1 means that
80  JIT compilationw was successful. A result of 0 means that JIT support is not  JIT compilation was successful. A result of 0 means that JIT support is not
81  available, or the pattern was not studied with PCRE_STUDY_JIT_COMPILE, or the  available, or the pattern was not studied with PCRE_STUDY_JIT_COMPILE, or the
82  JIT compiler was not able to handle the pattern.  JIT compiler was not able to handle the pattern.
83    .P
84    Once a pattern has been studied, with or without JIT, it can be used as many
85    times as you like for matching different subject strings.
86  .  .
87  .  .
88  .SH "UNSUPPORTED OPTIONS AND PATTERN ITEMS"  .SH "UNSUPPORTED OPTIONS AND PATTERN ITEMS"
# Line 90  supported. Line 95  supported.
95  .P  .P
96  The unsupported pattern items are:  The unsupported pattern items are:
97  .sp  .sp
98    \eC            match a single byte, even in UTF-8 mode    \eC            match a single byte; not supported in UTF-8 mode
99    (?Cn)          callouts    (?Cn)          callouts
   (?(<name>)...  conditional test on setting of a named subpattern  
   (?(R)...       conditional test on whole pattern recursion  
   (?(Rn)...      conditional test on recursion, by number  
   (?(R&name)...  conditional test on recursion, by name  
100    (*COMMIT)      )    (*COMMIT)      )
101    (*MARK)        )    (*MARK)        )
102    (*PRUNE)       ) the backtracking control verbs    (*PRUNE)       ) the backtracking control verbs
# Line 131  execution. Line 132  execution.
132  .rs  .rs
133  .sp  .sp
134  The code that is generated by the JIT compiler is architecture-specific, and is  The code that is generated by the JIT compiler is architecture-specific, and is
135  also position dependent. For those reasons it cannot be saved and restored like  also position dependent. For those reasons it cannot be saved (in a file or
136  the bytecode and other data of a compiled pattern. You should be able run  database) and restored later like the bytecode and other data of a compiled
137  \fBpcre_study()\fP on a saved and restored pattern, and thereby recreate the  pattern. Saving and restoring compiled patterns is not something many people
138  JIT data, but because JIT compilation uses significant resources, it is  do. More detail about this facility is given in the
139  probably not worth doing this.  .\" HREF
140    \fBpcreprecompile\fP
141    .\"
142    documentation. It should be possible to run \fBpcre_study()\fP on a saved and
143    restored pattern, and thereby recreate the JIT data, but because JIT
144    compilation uses significant resources, it is probably not worth doing this;
145    you might as well recompile the original pattern.
146  .  .
147  .  .
148  .\" HTML <a name="stackcontrol"></a>  .\" HTML <a name="stackcontrol"></a>
# Line 146  When the compiled JIT code runs, it need Line 153  When the compiled JIT code runs, it need
153  By default, it uses 32K on the machine stack. However, some large or  By default, it uses 32K on the machine stack. However, some large or
154  complicated patterns need more than this. The error PCRE_ERROR_JIT_STACKLIMIT  complicated patterns need more than this. The error PCRE_ERROR_JIT_STACKLIMIT
155  is given when there is not enough stack. Three functions are provided for  is given when there is not enough stack. Three functions are provided for
156  managing blocks of memory for use as JIT stacks.  managing blocks of memory for use as JIT stacks. There is further discussion
157    about the use of JIT stacks in the section entitled
158    .\" HTML <a href="#stackfaq">
159    .\" </a>
160    "JIT stack FAQ"
161    .\"
162    below.
163  .P  .P
164  The \fBpcre_jit_stack_alloc()\fP function creates a JIT stack. Its arguments  The \fBpcre_jit_stack_alloc()\fP function creates a JIT stack. Its arguments
165  are a starting size and a maximum size, and it returns an opaque value  are a starting size and a maximum size, and it returns a pointer to an opaque
166  of type \fBpcre_jit_stack\fP that represents a JIT stack, or NULL if there is  structure of type \fBpcre_jit_stack\fP, or NULL if there is an error. The
167  an error. The \fBpcre_jit_stack_free()\fP function can be used to free a stack  \fBpcre_jit_stack_free()\fP function can be used to free a stack that is no
168  that is no longer needed. (For the technically minded: the address space is  longer needed. (For the technically minded: the address space is allocated by
169  allocated by mmap or VirtualAlloc.)  mmap or VirtualAlloc.)
170  .P  .P
171  JIT uses far less memory for recursion than the interpretive code,  JIT uses far less memory for recursion than the interpretive code,
172  and a maximum stack size of 512K to 1M should be more than enough for any  and a maximum stack size of 512K to 1M should be more than enough for any
173  pattern.  pattern.
174  .P  .P
# Line 197  This is a suggestion for how a typical m Line 210  This is a suggestion for how a typical m
210  .sp  .sp
211    During thread initialization    During thread initialization
212      thread_local_var = pcre_jit_stack_alloc(...)      thread_local_var = pcre_jit_stack_alloc(...)
213    .sp
214    During thread exit    During thread exit
215      pcre_jit_stack_free(thread_local_var)      pcre_jit_stack_free(thread_local_var)
216    .sp
217    Use a one-line callback function    Use a one-line callback function
218      return thread_local_var      return thread_local_var
219  .sp  .sp
# Line 210  is non-NULL and points to a \fBpcre_extr Line 223  is non-NULL and points to a \fBpcre_extr
223  successful study with PCRE_STUDY_JIT_COMPILE.  successful study with PCRE_STUDY_JIT_COMPILE.
224  .  .
225  .  .
226    .\" HTML <a name="stackfaq"></a>
227    .SH "JIT STACK FAQ"
228    .rs
229    .sp
230    (1) Why do we need JIT stacks?
231    .sp
232    PCRE (and JIT) is a recursive, depth-first engine, so it needs a stack where
233    the local data of the current node is pushed before checking its child nodes.
234    Allocating real machine stack on some platforms is difficult. For example, the
235    stack chain needs to be updated every time we extend the stack on PowerPC.
236    Although this is possible, the overhead of updating it reduces performance. So
237    we do the recursion in memory.
238    .P
239    (2) Why don't we simply allocate blocks of memory with \fBmalloc()\fP?
240    .sp
241    Modern operating systems have a nice feature: they can reserve an address space
242    instead of allocating memory. We can safely allocate memory pages inside this
243    address space, so the stack could grow without moving memory data (this is
244    important because of pointers). Thus we can allocate 1M address space, and use
245    only a single memory page (usually 4K) if that is enough. However, we can still
246    grow up to 1M anytime if needed.
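
The reserve-then-commit idea can be sketched with POSIX calls as shown below. This illustrates the general technique only, not the code PCRE itself uses: the region type and the region_* function names are hypothetical, and the sketch assumes a Linux-like system where mmap() accepts MAP_ANONYMOUS.

```c
#define _DEFAULT_SOURCE /* for MAP_ANONYMOUS with strict -std= on glibc */
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical growable region: a fixed base address, the size of the
   reserved address space, and the number of bytes currently usable. */
typedef struct {
  unsigned char *base;
  size_t reserved;
  size_t committed;
} region;

/* Reserve address space only; no page is readable or writable yet. */
int region_reserve(region *r, size_t reserved)
{
  void *p = mmap(NULL, reserved, PROT_NONE,
                 MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
  if (p == MAP_FAILED) return -1;
  r->base = p;
  r->reserved = reserved;
  r->committed = 0;
  return 0;
}

/* Make the first new_size bytes usable, rounded up to whole pages.
   The base address never changes, so existing pointers stay valid. */
int region_grow(region *r, size_t new_size)
{
  size_t page = (size_t)sysconf(_SC_PAGESIZE);
  size_t rounded = (new_size + page - 1) / page * page;
  if (rounded > r->reserved) return -1;   /* beyond the reservation */
  if (rounded <= r->committed) return 0;  /* already large enough */
  if (mprotect(r->base, rounded, PROT_READ | PROT_WRITE) != 0) return -1;
  r->committed = rounded;
  return 0;
}

/* Give the whole reservation back to the operating system. */
void region_release(region *r)
{
  munmap(r->base, r->reserved);
  r->base = NULL;
  r->reserved = 0;
  r->committed = 0;
}
```

Reserving 1M and then calling region_grow(&r, 1) makes only a single page usable. A later region_grow() call extends the usable part in place, which is the property that a plain malloc() and realloc() scheme cannot guarantee.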
247    .P
248    (3) Who "owns" a JIT stack?
249    .sp
250    The owner of the stack is the user program, not the JIT studied pattern or
251    anything else. The user program must ensure that while a stack is in use by
252    \fBpcre_exec()\fP (that is, it is assigned to the pattern currently running),
253    that stack is not used by any other thread (to avoid overwriting the same
254    memory area). The best practice for multithreaded programs is to allocate a
255    stack for each thread, and return this stack through the JIT callback function.
256    .P
257    (4) When should a JIT stack be freed?
258    .sp
259    You can free a JIT stack at any time, as long as it will not be used by
260    \fBpcre_exec()\fP again. When you assign the stack to a pattern, only a pointer
261    is set. There is no reference counting or any other magic. You can free the
262    patterns and stacks in any order, anytime. Just \fIdo not\fP call
263    \fBpcre_exec()\fP with a pattern pointing to an already freed stack, as that
264    is likely to cause a segmentation fault. (Also, do not free a stack currently
265    used by \fBpcre_exec()\fP in another thread.) You can also replace the stack for a
266    pattern at any time. You can even free the previous stack before assigning a
267    replacement.
268    .P
269    (5) Should I allocate/free a stack every time before/after calling
270    \fBpcre_exec()\fP?
271    .sp
272    No, because this is too costly in terms of resources. However, you could
273    implement some clever scheme which releases the stack if it has not been used
274    for, say, two minutes. The JIT callback can help to achieve this without keeping a
275    list of the currently JIT studied patterns.
276    .P
277    (6) OK, the stack is for long term memory allocation. But what happens if a
278    pattern causes stack overflow with a stack of 1M? Is that 1M kept until the
279    stack is freed?
280    .sp
281    Especially on embedded systems, it might be a good idea to release
282    memory sometimes without freeing the stack. There is no API for this at the
283    moment. A function that returns the memory currently allocated for a stack,
284    and another that releases memory (shrinking the stack), would probably be a
285    good idea if someone needs this.
286    .P
287    (7) This is too much of a headache. Isn't there any better solution for JIT
288    stack handling?
289    .sp
290    No, thanks to Windows. If POSIX threads were used everywhere, we could throw
291    out this complicated API.
292    .
293    .
294  .SH "EXAMPLE CODE"  .SH "EXAMPLE CODE"
295  .rs  .rs
296  .sp  .sp
297  This is a single-threaded example that specifies a JIT stack without using a  This is a single-threaded example that specifies a JIT stack without using a
298  callback.  callback.
299  .sp  .sp
300    int rc;    int rc;
301    int ovector[30];    int ovector[30];
# Line 232  callback. Line 313  callback.
313    /* Check results */    /* Check results */
314    pcre_free(re);    pcre_free(re);
315    pcre_free_study(extra);    pcre_free_study(extra);
316      pcre_jit_stack_free(jit_stack);
317  .sp  .sp
318  .  .
319  .  .
# Line 245  callback. Line 327  callback.
327  .rs  .rs
328  .sp  .sp
329  .nf  .nf
330  Philip Hazel  Philip Hazel (FAQ by Zoltan Herczeg)
331  University Computing Service  University Computing Service
332  Cambridge CB2 3QH, England.  Cambridge CB2 3QH, England.
333  .fi  .fi
# Line 255  Cambridge CB2 3QH, England. Line 337  Cambridge CB2 3QH, England.
337  .rs  .rs
338  .sp  .sp
339  .nf  .nf
340  Last updated: 06 September 2011  Last updated: 22 November 2011
341  Copyright (c) 1997-2011 University of Cambridge.  Copyright (c) 1997-2011 University of Cambridge.
342  .fi  .fi
