Diff of /code/trunk/maint/README

revision 181 by ph10, Wed Jun 13 14:55:18 2007 UTC
revision 350 by ph10, Wed Jul 2 19:18:41 2008 UTC
# Line 16  also contains some notes for maintainers Line 16  also contains some notes for maintainers
16  Files in the maint directory  Files in the maint directory
17  ----------------------------  ----------------------------
18    
19    ----------------- This file is now OBSOLETE and no longer used ----------------
20  Builducptable    A Perl script that creates the contents of the ucptable.h file  Builducptable    A Perl script that creates the contents of the ucptable.h file
21                   from two Unicode data files, which themselves are downloaded                   from two Unicode data files, which themselves are downloaded
22                   from the Unicode web site. Run this script in the "maint"                   from the Unicode web site. Run this script in the "maint"
23                   directory.                   directory.
24    ----------------- This file is now OBSOLETE and no longer used ----------------
25    
26  ManyConfigTests  A shell script that runs "configure, make, test" a number of  ManyConfigTests  A shell script that runs "configure, make, test" a number of
27                   times with different configuration settings.                   times with different configuration settings.
28    
29  Unicode.tables   The files in this directory, Scripts.txt and UnicodeData.txt,  MultiStage2.py   A Python script that generates the file pcre_ucd.c from three
30                   were downloaded from the Unicode web site. They contain                   Unicode data tables, which are themselves downloaded from the
31                   information about Unicode characters and scripts.                   Unicode web site. Run this script in the "maint" directory.
32                     The generated file contains the tables for a 2-stage lookup
33  ucptest.c        A short C program for testing the Unicode property functions                   of Unicode properties.
34                   in pcre_ucp_searchfuncs.c, mainly useful after rebuilding the  
35                   Unicode property table. Compile and run this in the "maint"  Unicode.tables   The files in this directory, DerivedGeneralCategory.txt,
36                   directory.                   Scripts.txt and UnicodeData.txt, were downloaded from the
37                     Unicode web site. They contain information about Unicode
38                     characters and scripts.
39    
40    ucptest.c        A short C program for testing the Unicode property macros
41                     that do lookups in the pcre_ucd.c data, mainly useful after
42                     rebuilding the Unicode property table. Compile and run this in
43                     the "maint" directory.
44    
45  ucptestdata      A directory containing two files, testinput1 and testoutput1,  ucptestdata      A directory containing two files, testinput1 and testoutput1,
46                   to use in conjunction with the ucptest program.                   to use in conjunction with the ucptest program.
# Line 49  Updating to a new Unicode release Line 58  Updating to a new Unicode release
58  ---------------------------------  ---------------------------------
59    
60  When there is a new release of Unicode, the files in Unicode.tables must be  When there is a new release of Unicode, the files in Unicode.tables must be
61  refreshed from the web site, and the Buildupctable script can then be run to  refreshed from the web site, and the MultiStage2.py script can then be run to
62  generate a new version of ucptable.h. The ucptest program can be used to check  generate a new version of pcre_ucd.c. The ucptest program can be used to check
63  that the resulting table works properly, using the data files in ucptestdata to  that the resulting table works properly, using the data files in ucptestdata to
64  check a number of test characters.  check a number of test characters.
65    
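As a rough illustration of the 2-stage lookup mentioned in the MultiStage2.py
entry, the following Python sketch compresses a flat per-character table into
two stages and then looks a character up again. It is not taken from
MultiStage2.py or pcre_ucd.c: the function names, the block size of 128, and
the toy property values are all assumptions made for this example.

```python
# Illustrative sketch only; names and block size are hypothetical and are
# not the ones used by MultiStage2.py or pcre_ucd.c.
BLOCK_SIZE = 128  # assumed block size


def build_two_stage(table, block_size=BLOCK_SIZE):
    """Compress a flat table (one entry per code point) into two stages.

    stage1 maps a block number to the index of a unique block.
    stage2 holds the unique blocks, concatenated.
    """
    if len(table) % block_size != 0:
        raise ValueError("table length must be a multiple of block_size")
    stage1 = []
    stage2 = []
    seen = {}  # block contents -> index of that block within stage2
    for start in range(0, len(table), block_size):
        block = tuple(table[start:start + block_size])
        if block not in seen:
            seen[block] = len(stage2) // block_size
            stage2.extend(block)
        stage1.append(seen[block])
    return stage1, stage2


def lookup(stage1, stage2, code_point, block_size=BLOCK_SIZE):
    """Return the table entry for code_point using the two stages."""
    block = stage1[code_point // block_size]
    return stage2[block * block_size + code_point % block_size]
```

Identical blocks are stored only once, so a table in which long runs of
characters share the same property value shrinks considerably, at the cost of
one extra array access per lookup.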
# Line 63  distribution for a new release. Line 72  distribution for a new release.
72    
73  . Ensure that the version number and version date are correct in configure.ac,  . Ensure that the version number and version date are correct in configure.ac,
74    ChangeLog, and NEWS.    ChangeLog, and NEWS.
75    
76    . If new build options have been added, ensure that they are added to the CMake
77      files as well as to the autoconf files.
78    
79  . Run ./autogen.sh to ensure everything is up-to-date.  . Run ./autogen.sh to ensure everything is up-to-date.
80    
# Line 113  Making a PCRE release Line 125  Making a PCRE release
125    
126  Run PrepareRelease and commit the files that it changes (by removing trailing  Run PrepareRelease and commit the files that it changes (by removing trailing
127  spaces). Then run "make distcheck" to create the tarballs and the zipball.  spaces). Then run "make distcheck" to create the tarballs and the zipball.
128    Double-check with "svn status", then create an SVN tagged copy:
129    
130      svn copy svn://vcs.exim.org/pcre/code/trunk \
131               svn://vcs.exim.org/pcre/code/tags/pcre-7.x
132    
133  Don't forget to update Freshmeat when the new release is out, and to tell  Don't forget to update Freshmeat when the new release is out, and to tell
134  webmaster@pcre.org and the mailing list.  webmaster@pcre.org and the mailing list.
# Line 217  others are relatively new. Line 233  others are relatively new.
233    to switch this dynamically. It would have to be specified when PCRE was    to switch this dynamically. It would have to be specified when PCRE was
234    compiled. PCRE would then call a function every time it wanted a character.    compiled. PCRE would then call a function every time it wanted a character.
235    
 . There are new (*PRUNE) facilities in Perl 5.10, some of which it might be  
   relatively easy to implement.  
   
236  . Wild thought: the ability to compile from PCRE's internal byte code to a real  . Wild thought: the ability to compile from PCRE's internal byte code to a real
237    FSM and a very fast (third) matcher to process the result. There would be    FSM and a very fast (third) matcher to process the result. There would be
238    even more restrictions than for pcre_dfa_exec(), however. This is not easy.    even more restrictions than for pcre_dfa_exec(), however. This is not easy.
# Line 233  others are relatively new. Line 246  others are relatively new.
246    
247  . Someone suggested --disable-callout to save code space when callouts are  . Someone suggested --disable-callout to save code space when callouts are
248    never wanted. This seems rather marginal.    never wanted. This seems rather marginal.
249    
250  . "Cut" as described in Jeffrey Friedl's book, p364: \v and \V. The definitions  . Check names that consist entirely of digits: PCRE allows, but do Perl and
251    aren't yet clear enough for me. \v flushes saved states so that no    Python, etc?
   backtracking to anything earlier can happen; \V says "no more bumpalong", but  
   does it fail the current match? As described in the book, these aren't really  
   "cut" as in Prolog, are they? NOTE: (a) PCRE once had "cut", but it was  
   removed when atomic groups were introduced. (b) Perl 5.10 has some (*PRUNE)  
   features --  
   
 . These are the Perl 5.10 backtracking control features (all of which are  
   described as "experimental" -- some of them "very experimental") that it  
   might be easy to add to PCRE. They all succeed when encountered, but act as  
   follows when backtracking:  
   
   (*PRUNE)  fail this match attempt, but still bumpalong  
   (*SKIP)   fail this match attempt, bumpalong to current match point  
   (*THEN)   fail this branch, try next branch at same level or fail if none  
   (*COMMIT) fail this match attempt, suppress bumpalong  
   (*FAIL)   fail and backtrack (same as (?!) and that can be optimized)  
   (*F)      synonym for (*FAIL)  
   (*ACCEPT) behave as if end of pattern reached ("very experimental")  
   
   Some of these can have arguments (*PRUNE:NAME) but I'm not sure whether they  
   make sense in the PCRE context.  
252    
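For the question about names that consist entirely of digits, one quick way to
see what Python does is to try compiling such a pattern. The sketch below
assumes Python 3 and its standard re module; the helper name
accepts_group_name is made up for the example, and it says nothing about
Perl's behaviour.

```python
import re


def accepts_group_name(name):
    """Return True if Python's re module accepts name as a group name."""
    try:
        # (?P<name>...) is Python's syntax for a named group.
        re.compile("(?P<%s>a)" % name)
    except re.error:
        return False
    return True


if __name__ == "__main__":
    for candidate in ("123", "n123"):
        print(candidate, accepts_group_name(candidate))
```

Python requires group names to be valid identifiers, so an all-digit name such
as "123" is rejected there, while "n123" is accepted.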
253  Philip Hazel  Philip Hazel
254  Email local part: ph10  Email local part: ph10
255  Email domain: cam.ac.uk  Email domain: cam.ac.uk
256  Last updated: 13 June 2007  Last updated: 02 July 2008

Legend:
Removed from v.181  
changed lines
  Added in v.350
