Diff of /code/trunk/doc/pcregrep.txt

revision 653 by ph10, Sat Jan 15 11:31:39 2011 UTC revision 654 by ph10, Tue Aug 2 11:00:40 2011 UTC
# Line 49  DESCRIPTION Line 49  DESCRIPTION
49         What defines a line  boundary  is  controlled  by  the  -N  (--newline)         What defines a line  boundary  is  controlled  by  the  -N  (--newline)
50         option.         option.
51    
52         Patterns  are  limited  to  8K  or  BUFSIZ characters, whichever is the         The amount of memory used for buffering files that are being scanned is
53         greater.  BUFSIZ is defined in <stdio.h>. When there is more  than  one         controlled by a parameter that can be set by the --buffer-size  option.
54         pattern (specified by the use of -e and/or -f), each pattern is applied         The  default  value  for  this  parameter is specified when pcregrep is
55         to each line in the order in which they are defined,  except  that  all         built, with the default default being 20K.  A  block  of  memory  three
56         the -e patterns are tried before the -f patterns.         times  this  size  is used (to allow for buffering "before" and "after"
57           lines). An error occurs if a line overflows the buffer.
58    
59           Patterns are limited to 8K or BUFSIZ bytes, whichever is  the  greater.
60           BUFSIZ  is  defined  in  <stdio.h>. When there is more than one pattern
61           (specified by the use of -e and/or -f), each pattern is applied to each
62           line  in  the  order  in which they are defined, except that all the -e
63           patterns are tried before the -f patterns.
64    
65         By  default,  as soon as one pattern matches (or fails to match when -v         By default, as soon as one pattern matches (or fails to match  when  -v
66         is used), no further patterns are considered. However, if --colour  (or         is  used), no further patterns are considered. However, if --colour (or
67         --color) is used to colour the matching substrings, or if --only-match-         --color) is used to colour the matching substrings, or if --only-match-
68         ing, --file-offsets, or --line-offsets is used to output only the  part         ing,  --file-offsets, or --line-offsets is used to output only the part
69         of  the  line  that  matched (either shown literally, or as an offset),         of the line that matched (either shown literally,  or  as  an  offset),
70         scanning resumes immediately  following  the  match,  so  that  further         scanning  resumes  immediately  following  the  match,  so that further
71         matches  on the same line can be found. If there are multiple patterns,         matches on the same line can be found. If there are multiple  patterns,
72         they are all tried on the remainder of the line, but patterns that fol-         they are all tried on the remainder of the line, but patterns that fol-
73         low the one that matched are not tried on the earlier part of the line.         low the one that matched are not tried on the earlier part of the line.
74    
# Line 69  DESCRIPTION Line 76  DESCRIPTION
76         in which multiple patterns are specified can affect the output when one         in which multiple patterns are specified can affect the output when one
77         of the above options is used.         of the above options is used.
78    
79         Patterns  that can match an empty string are accepted, but empty string         Patterns that can match an empty string are accepted, but empty  string
80         matches   are   never   recognized.   An   example   is   the   pattern         matches   are   never   recognized.   An   example   is   the   pattern
81         "(super)?(man)?",  in  which  all components are optional. This pattern         "(super)?(man)?", in which all components are  optional.  This  pattern
82         finds all occurrences of both "super" and  "man";  the  output  differs         finds  all  occurrences  of  both "super" and "man"; the output differs
83         from  matching  with  "super|man" when only the matching substrings are         from matching with "super|man" when only the  matching  substrings  are
84         being shown.         being shown.
85    
86         If the LC_ALL or LC_CTYPE environment variable is  set,  pcregrep  uses         If  the  LC_ALL  or LC_CTYPE environment variable is set, pcregrep uses
87         the  value to set a locale when calling the PCRE library.  The --locale         the value to set a locale when calling the PCRE library.  The  --locale
88         option can be used to override this.         option can be used to override this.
89    
90    
91  SUPPORT FOR COMPRESSED FILES  SUPPORT FOR COMPRESSED FILES
92    
93         It is possible to compile pcregrep so that it uses libz  or  libbz2  to         It  is  possible  to compile pcregrep so that it uses libz or libbz2 to
94         read  files  whose names end in .gz or .bz2, respectively. You can find         read files whose names end in .gz or .bz2, respectively. You  can  find
95         out whether your binary has support for one or both of these file types         out whether your binary has support for one or both of these file types
96         by running it with the --help option. If the appropriate support is not         by running it with the --help option. If the appropriate support is not
97         present, files are treated as plain text. The standard input is  always         present,  files are treated as plain text. The standard input is always
98         so treated.         so treated.
99    
100    
101  OPTIONS  OPTIONS
102    
103         The  order  in  which some of the options appear can affect the output.         The order in which some of the options appear can  affect  the  output.
104         For example, both the -h and -l options affect  the  printing  of  file         For  example,  both  the  -h and -l options affect the printing of file
105         names.  Whichever  comes later in the command line will be the one that         names. Whichever comes later in the command line will be the  one  that
106         takes effect.         takes  effect.  Numerical values for options may be followed by K or M,
107           to signify multiplication by 1024 or 1024*1024 respectively.
108    
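Illustrative note (not part of either revision): the K and M suffixes and the "three times this size" rule described in revision 654 can be sketched in a few lines of Python. The helper name parse_size is hypothetical and this is not pcregrep's own code; it only restates the arithmetic given in the text, assuming the suffix is a single trailing upper-case letter.

```python
def parse_size(text):
    """Convert a value such as '20K' or '1M' to a number of bytes.

    Hypothetical helper for illustration; it mirrors the documented
    rule (K = 1024, M = 1024*1024), not pcregrep's actual parser.
    """
    multipliers = {"K": 1024, "M": 1024 * 1024}
    suffix = text[-1:]
    if suffix in multipliers:
        return int(text[:-1]) * multipliers[suffix]
    return int(text)


# The documented default buffer parameter is 20K.
buffer_size = parse_size("20K")

# A block three times that size is allocated, to leave room for
# "before" and "after" context lines.
total_block = 3 * buffer_size

print(buffer_size, total_block)
```

Under these assumptions the default 20K parameter corresponds to 20480 bytes, and the allocated block to 61440 bytes.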
109         --        This terminate the list of options. It is useful if the  next         --        This terminates the list of options. It is useful if the next
110                   item  on  the command line starts with a hyphen but is not an                   item  on  the command line starts with a hyphen but is not an
111                   option. This allows for the processing of patterns and  file-                   option. This allows for the processing of patterns and  file-
112                   names that start with hyphens.                   names that start with hyphens.
# Line 123  OPTIONS Line 131  OPTIONS
131                   pcregrep guarantees to have up to 8K of preceding text avail-                   pcregrep guarantees to have up to 8K of preceding text avail-
132                   able for context output.                   able for context output.
133    
134           --buffer-size=number
135                     Set  the  parameter that controls how much memory is used for
136                     buffering files that are being scanned.
137    
138         -C number, --context=number         -C number, --context=number
139                   Output  number  lines  of  context both before and after each                   Output number lines of context both  before  and  after  each
140                   matching line.  This is equivalent to setting both -A and  -B                   matching  line.  This is equivalent to setting both -A and -B
141                   to the same value.                   to the same value.
142    
143         -c, --count         -c, --count
144                   Do  not output individual lines from the files that are being                   Do not output individual lines from the files that are  being
145                   scanned; instead output the number of lines that would other-                   scanned; instead output the number of lines that would other-
146                   wise  have  been  shown. If no lines are selected, the number                   wise have been shown. If no lines are  selected,  the  number
147                   zero is output. If several files are  are  being  scanned,  a                   zero  is  output.  If  several files are are being scanned, a
148                   count  is  output  for each of them. However, if the --files-                   count is output for each of them. However,  if  the  --files-
149                   with-matches option is also  used,  only  those  files  whose                   with-matches  option  is  also  used,  only those files whose
150                   counts are greater than zero are listed. When -c is used, the                   counts are greater than zero are listed. When -c is used, the
151                   -A, -B, and -C options are ignored.                   -A, -B, and -C options are ignored.
152    
153         --colour, --color         --colour, --color
154                   If this option is given without any data, it is equivalent to                   If this option is given without any data, it is equivalent to
155                   "--colour=auto".   If  data  is required, it must be given in                   "--colour=auto".  If data is required, it must  be  given  in
156                   the same shell item, separated by an equals sign.                   the same shell item, separated by an equals sign.
157    
158         --colour=value, --color=value         --colour=value, --color=value
159                   This option specifies under what circumstances the parts of a                   This option specifies under what circumstances the parts of a
160                   line that matched a pattern should be coloured in the output.                   line that matched a pattern should be coloured in the output.
161                   By default, the output is not coloured. The value  (which  is                   By  default,  the output is not coloured. The value (which is
162                   optional,  see above) may be "never", "always", or "auto". In                   optional, see above) may be "never", "always", or "auto".  In
163                   the latter case, colouring happens only if the standard  out-                   the  latter case, colouring happens only if the standard out-
164                   put  is connected to a terminal. More resources are used when                   put is connected to a terminal. More resources are used  when
165                   colouring is enabled, because pcregrep has to search for  all                   colouring  is enabled, because pcregrep has to search for all
166                   possible  matches in a line, not just one, in order to colour                   possible matches in a line, not just one, in order to  colour
167                   them all.                   them all.
168    
169                   The colour that is used can be specified by setting the envi-                   The colour that is used can be specified by setting the envi-
170                   ronment variable PCREGREP_COLOUR or PCREGREP_COLOR. The value                   ronment variable PCREGREP_COLOUR or PCREGREP_COLOR. The value
171                   of this variable should be a string of two numbers, separated                   of this variable should be a string of two numbers, separated
172                   by  a  semicolon.  They  are copied directly into the control                   by a semicolon. They are copied  directly  into  the  control
173                   string for setting colour  on  a  terminal,  so  it  is  your                   string  for  setting  colour  on  a  terminal,  so it is your
174                   responsibility  to ensure that they make sense. If neither of                   responsibility to ensure that they make sense. If neither  of
175                   the environment variables is  set,  the  default  is  "1;31",                   the  environment  variables  is  set,  the default is "1;31",
176                   which gives red.                   which gives red.
177    
178         -D action, --devices=action         -D action, --devices=action
179                   If  an  input  path  is  not  a  regular file or a directory,                   If an input path is  not  a  regular  file  or  a  directory,
180                   "action" specifies how it is to be  processed.  Valid  values                   "action"  specifies  how  it is to be processed. Valid values
181                   are "read" (the default) or "skip" (silently skip the path).                   are "read" (the default) or "skip" (silently skip the path).
182    
183         -d action, --directories=action         -d action, --directories=action
184                   If an input path is a directory, "action" specifies how it is                   If an input path is a directory, "action" specifies how it is
185                   to be processed.  Valid  values  are  "read"  (the  default),                   to  be  processed.   Valid  values  are "read" (the default),
186                   "recurse"  (equivalent to the -r option), or "skip" (silently                   "recurse" (equivalent to the -r option), or "skip"  (silently
187                   skip the path). In the default case, directories are read  as                   skip  the path). In the default case, directories are read as
188                   if  they  were  ordinary files. In some operating systems the                   if they were ordinary files. In some  operating  systems  the
189                   effect of reading a directory like this is an immediate  end-                   effect  of reading a directory like this is an immediate end-
190                   of-file.                   of-file.
191    
192         -e pattern, --regex=pattern, --regexp=pattern         -e pattern, --regex=pattern, --regexp=pattern
193                   Specify a pattern to be matched. This option can be used mul-                   Specify a pattern to be matched. This option can be used mul-
194                   tiple times in order to specify several patterns. It can also                   tiple times in order to specify several patterns. It can also
195                   be  used  as a way of specifying a single pattern that starts                   be used as a way of specifying a single pattern  that  starts
196                   with a hyphen. When -e is used, no argument pattern is  taken                   with  a hyphen. When -e is used, no argument pattern is taken
197                   from  the  command  line;  all  arguments are treated as file                   from the command line; all  arguments  are  treated  as  file
198                   names. There is an overall maximum of 100 patterns. They  are                   names.  There is an overall maximum of 100 patterns. They are
199                   applied  to  each line in the order in which they are defined                   applied to each line in the order in which they  are  defined
200                   until one matches (or fails to match if -v is used). If -f is                   until one matches (or fails to match if -v is used). If -f is
201                   used  with  -e,  the command line patterns are matched first,                   used with -e, the command line patterns  are  matched  first,
202                   followed by the patterns from the file,  independent  of  the                   followed  by  the  patterns from the file, independent of the
203                   order  in which these options are specified. Note that multi-                   order in which these options are specified. Note that  multi-
204                   ple use of -e is not the same as a single pattern with alter-                   ple use of -e is not the same as a single pattern with alter-
205                   natives. For example, X|Y finds the first character in a line                   natives. For example, X|Y finds the first character in a line
206                   that is X or Y, whereas if the two patterns are  given  sepa-                   that  is  X or Y, whereas if the two patterns are given sepa-
207                   rately, pcregrep finds X if it is present, even if it follows                   rately, pcregrep finds X if it is present, even if it follows
208                   Y in the line. It finds Y only if there is no X in the  line.                   Y  in the line. It finds Y only if there is no X in the line.
209                   This  really  matters  only  if  you are using -o to show the                   This really matters only if you are  using  -o  to  show  the
210                   part(s) of the line that matched.                   part(s) of the line that matched.
211    
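Illustrative note (not part of either revision): the difference between a single pattern X|Y and two separate -e patterns can be mimicked with Python's re module. This is only a sketch; Python's regex engine stands in for PCRE, the function names are hypothetical, and only the first reported match on a line is modelled.

```python
import re


def first_match_alternation(line):
    """One pattern with alternatives: the leftmost X or Y wins."""
    m = re.search(r"X|Y", line)
    return m.group(0) if m else None


def first_match_separate(line, patterns=("X", "Y")):
    """Separate patterns: each is tried in order on the whole line,
    so an earlier pattern wins even if its match lies further right."""
    for pattern in patterns:
        m = re.search(pattern, line)
        if m:
            return m.group(0)
    return None


line = "a Y then X"
print(first_match_alternation(line))  # Y comes first in the line
print(first_match_separate(line))     # X is tried first as a pattern
```

For the line "a Y then X", the alternation reports Y, while the two separate patterns report X, which is the behaviour the -e description attributes to pcregrep when -o is in use.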
212         --exclude=pattern         --exclude=pattern
213                   When pcregrep is searching the files in a directory as a con-                   When pcregrep is searching the files in a directory as a con-
214                   sequence  of  the  -r  (recursive search) option, any regular                   sequence of the -r (recursive  search)  option,  any  regular
215                   files whose names match the pattern are excluded. Subdirecto-                   files whose names match the pattern are excluded. Subdirecto-
216                   ries  are  not  excluded  by  this  option; they are searched                   ries are not excluded  by  this  option;  they  are  searched
217                   recursively, subject to the --exclude-dir  and  --include_dir                   recursively,  subject  to the --exclude-dir and --include_dir
218                   options.  The  pattern  is  a PCRE regular expression, and is                   options. The pattern is a PCRE  regular  expression,  and  is
219                   matched against the final component of the file name (not the                   matched against the final component of the file name (not the
220                   entire  path).  If  a  file  name  matches both --include and                   entire path). If a  file  name  matches  both  --include  and
221                   --exclude, it is excluded.  There is no short form  for  this                   --exclude,  it  is excluded.  There is no short form for this
222                   option.                   option.
223    
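Illustrative note (not part of either revision): the stated rules for --include and --exclude (match against the final path component only; a name matching both is excluded) can be sketched as follows. The function file_selected is hypothetical, and Python's re module again stands in for PCRE.

```python
import os
import re


def file_selected(path, include=None, exclude=None):
    """Decide whether a regular file would be scanned during a
    recursive search, following the documented precedence."""
    # Only the final component of the path is tested.
    name = os.path.basename(path)
    # Exclusion takes priority when both patterns match.
    if exclude is not None and re.search(exclude, name):
        return False
    # With an include pattern, non-matching names are skipped.
    if include is not None and not re.search(include, name):
        return False
    return True


print(file_selected("src/main.c", include=r"\.c$"))
print(file_selected("src/main.c", include=r"\.c$", exclude=r"^main"))
```

The second call returns False because the name main.c matches both patterns, and exclusion wins.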
224         --exclude-dir=pattern         --exclude-dir=pattern
225                   When  pcregrep  is searching the contents of a directory as a                   When pcregrep is searching the contents of a directory  as  a
226                   consequence of the -r (recursive search) option,  any  subdi-                   consequence  of  the -r (recursive search) option, any subdi-
227                   rectories  whose  names match the pattern are excluded. (Note                   rectories whose names match the pattern are  excluded.  (Note
228                   that the --exclude option does  not  affect  subdirectories.)                   that  the  --exclude  option does not affect subdirectories.)
229                   The  pattern  is  a  PCRE  regular expression, and is matched                   The pattern is a PCRE  regular  expression,  and  is  matched
230                   against the final component  of  the  name  (not  the  entire                   against  the  final  component  of  the  name (not the entire
231                   path).  If a subdirectory name matches both --include-dir and                   path). If a subdirectory name matches both --include-dir  and
232                   --exclude-dir, it is excluded. There is  no  short  form  for                   --exclude-dir,  it  is  excluded.  There is no short form for
233                   this option.                   this option.
234    
235         -F, --fixed-strings         -F, --fixed-strings
236                   Interpret  each pattern as a list of fixed strings, separated                   Interpret each pattern as a list of fixed strings,  separated
237                   by newlines, instead of  as  a  regular  expression.  The  -w                   by  newlines,  instead  of  as  a  regular expression. The -w
238                   (match  as  a  word) and -x (match whole line) options can be                   (match as a word) and -x (match whole line)  options  can  be
239                   used with -F. They apply to each of the fixed strings. A line                   used with -F. They apply to each of the fixed strings. A line
240                   is selected if any of the fixed strings are found in it (sub-                   is selected if any of the fixed strings are found in it (sub-
241                   ject to -w or -x, if present).                   ject to -w or -x, if present).
242    
243         -f filename, --file=filename         -f filename, --file=filename
244                   Read a number of patterns from the file, one  per  line,  and                   Read  a  number  of patterns from the file, one per line, and
245                   match  them against each line of input. A data line is output                   match them against each line of input. A data line is  output
246                   if any of the patterns match it. The filename can be given as                   if any of the patterns match it. The filename can be given as
247                   "-" to refer to the standard input. When -f is used, patterns                   "-" to refer to the standard input. When -f is used, patterns
248                   specified on the command line using -e may also  be  present;                   specified  on  the command line using -e may also be present;
249                   they are tested before the file's patterns. However, no other                   they are tested before the file's patterns. However, no other
250                   pattern is taken from the command  line;  all  arguments  are                   pattern  is  taken  from  the command line; all arguments are
251                   treated  as  file  names.  There is an overall maximum of 100                   treated as file names. There is an  overall  maximum  of  100
252                   patterns. Trailing white space is removed from each line, and                   patterns. Trailing white space is removed from each line, and
253                   blank  lines  are ignored. An empty file contains no patterns                   blank lines are ignored. An empty file contains  no  patterns
254                   and therefore matches nothing. See also  the  comments  about                   and  therefore  matches  nothing. See also the comments about
255                   multiple  patterns  versus a single pattern with alternatives                   multiple patterns versus a single pattern  with  alternatives
256                   in the description of -e above.                   in the description of -e above.
257    
258         --file-offsets         --file-offsets
259                   Instead of showing lines or parts of lines that  match,  show                   Instead  of  showing lines or parts of lines that match, show
260                   each  match  as  an  offset  from the start of the file and a                   each match as an offset from the start  of  the  file  and  a
261                   length, separated by a comma. In this  mode,  no  context  is                   length,  separated  by  a  comma. In this mode, no context is
262                   shown.  That  is,  the -A, -B, and -C options are ignored. If                   shown. That is, the -A, -B, and -C options  are  ignored.  If
263                   there is more than one match in a line, each of them is shown                   there is more than one match in a line, each of them is shown
264                   separately.  This  option  is mutually exclusive with --line-                   separately. This option is mutually  exclusive  with  --line-
265                   offsets and --only-matching.                   offsets and --only-matching.
266    
267         -H, --with-filename         -H, --with-filename
268                   Force the inclusion of the filename at the  start  of  output                   Force  the  inclusion  of the filename at the start of output
269                   lines  when searching a single file. By default, the filename                   lines when searching a single file. By default, the  filename
270                   is not shown in this case. For matching lines,  the  filename                   is  not  shown in this case. For matching lines, the filename
271                   is followed by a colon; for context lines, a hyphen separator                   is followed by a colon; for context lines, a hyphen separator
272                   is used. If a line number is also being  output,  it  follows                   is  used.  If  a line number is also being output, it follows
273                   the file name.                   the file name.
274    
275         -h, --no-filename         -h, --no-filename
276                   Suppress  the output filenames when searching multiple files.                   Suppress the output filenames when searching multiple  files.
277                   By default, filenames  are  shown  when  multiple  files  are                   By  default,  filenames  are  shown  when  multiple files are
278                   searched.  For  matching lines, the filename is followed by a                   searched. For matching lines, the filename is followed  by  a
279                   colon; for context lines, a hyphen separator is used.   If  a                   colon;  for  context lines, a hyphen separator is used.  If a
280                   line number is also being output, it follows the file name.                   line number is also being output, it follows the file name.
281    
282         --help    Output  a  help  message, giving brief details of the command         --help    Output a help message, giving brief details  of  the  command
283                   options and file type support, and then exit.                   options and file type support, and then exit.
284    
285         -i, --ignore-case         -i, --ignore-case
# Line 277  OPTIONS Line 289  OPTIONS
289                   When pcregrep is searching the files in a directory as a con-                   When pcregrep is searching the files in a directory as a con-
290                   sequence of the -r (recursive search) option, only those reg-                   sequence of the -r (recursive search) option, only those reg-
291                   ular files whose names match the pattern are included. Subdi-                   ular files whose names match the pattern are included. Subdi-
292                   rectories  are always included and searched recursively, sub-                   rectories are always included and searched recursively,  sub-
293                   ject to the --include-dir and --exclude-dir options. The pat-                   ject to the --include-dir and --exclude-dir options. The pat-
294                   tern is a PCRE regular expression, and is matched against the                   tern is a PCRE regular expression, and is matched against the
295                   final component of the file name (not the entire path). If  a                   final  component of the file name (not the entire path). If a
296                   file  name  matches  both  --include  and  --exclude,  it  is                   file  name  matches  both  --include  and  --exclude,  it  is
297                   excluded. There is no short form for this option.                   excluded. There is no short form for this option.
298    
299         --include-dir=pattern         --include-dir=pattern
300                   When pcregrep is searching the contents of a directory  as  a                   When  pcregrep  is searching the contents of a directory as a
301                   consequence  of  the -r (recursive search) option, only those                   consequence of the -r (recursive search) option,  only  those
302                   subdirectories whose names match the  pattern  are  included.                   subdirectories  whose  names  match the pattern are included.
303                   (Note  that  the --include option does not affect subdirecto-                   (Note that the --include option does not  affect  subdirecto-
304                   ries.) The pattern is  a  PCRE  regular  expression,  and  is                   ries.)  The  pattern  is  a  PCRE  regular expression, and is
305                   matched  against  the  final  component  of the name (not the                   matched against the final component  of  the  name  (not  the
306                   entire path). If a subdirectory name matches both  --include-                   entire  path). If a subdirectory name matches both --include-
307                   dir and --exclude-dir, it is excluded. There is no short form                   dir and --exclude-dir, it is excluded. There is no short form
308                   for this option.                   for this option.
309    
310         -L, --files-without-match         -L, --files-without-match
311                   Instead of outputting lines from the files, just  output  the                   Instead  of  outputting lines from the files, just output the
312                   names  of  the files that do not contain any lines that would                   names of the files that do not contain any lines  that  would
313                   have been output. Each file name is output once, on  a  sepa-                   have  been  output. Each file name is output once, on a sepa-
314                   rate line.                   rate line.
315    
316         -l, --files-with-matches         -l, --files-with-matches
317                   Instead  of  outputting lines from the files, just output the                   Instead of outputting lines from the files, just  output  the
318                   names of the files containing lines that would have been out-                   names of the files containing lines that would have been out-
319                   put.  Each  file  name  is  output  once, on a separate line.                   put. Each file name is  output  once,  on  a  separate  line.
320                   Searching normally stops as soon as a matching line is  found                   Searching  normally stops as soon as a matching line is found
321                   in  a  file.  However, if the -c (count) option is also used,                   in a file. However, if the -c (count) option  is  also  used,
322                   matching continues in order to obtain the correct count,  and                   matching  continues in order to obtain the correct count, and
323                   those  files  that  have  at least one match are listed along                   those files that have at least one  match  are  listed  along
324                   with their counts. Using this option with -c is a way of sup-                   with their counts. Using this option with -c is a way of sup-
325                   pressing the listing of files with no matches.                   pressing the listing of files with no matches.
326    
# Line 318  OPTIONS Line 330  OPTIONS
330                   input)" is used. There is no short form for this option.                   input)" is used. There is no short form for this option.
331    
332         --line-buffered         --line-buffered
333                   When  this  option is given, input is read and processed line                   When this option is given, input is read and  processed  line
334                   by line, and the output  is  flushed  after  each  write.  By                   by  line,  and  the  output  is  flushed after each write. By
335                   default,  input  is read in large chunks, unless pcregrep can                   default, input is read in large chunks, unless  pcregrep  can
336                   determine that it is reading from a terminal (which  is  cur-                   determine  that  it is reading from a terminal (which is cur-
337                   rently  possible only in Unix environments). Output to termi-                   rently possible only in Unix environments). Output to  termi-
338                   nal is normally automatically flushed by the  operating  sys-                   nal  is  normally automatically flushed by the operating sys-
339                   tem.  This  option  can be useful when the input or output is                   tem. This option can be useful when the input  or  output  is
340                   attached to a pipe and you do not want pcregrep to buffer  up                   attached  to a pipe and you do not want pcregrep to buffer up
341                   large  amounts  of data. However, its use will affect perfor-                   large amounts of data. However, its use will  affect  perfor-
342                   mance, and the -M (multiline) option ceases to work.                   mance, and the -M (multiline) option ceases to work.
343    
344         --line-offsets         --line-offsets
345                   Instead of showing lines or parts of lines that  match,  show                   Instead  of  showing lines or parts of lines that match, show
346                   each match as a line number, the offset from the start of the                   each match as a line number, the offset from the start of the
347                   line, and a length. The line number is terminated by a  colon                   line,  and a length. The line number is terminated by a colon
348                   (as  usual; see the -n option), and the offset and length are                   (as usual; see the -n option), and the offset and length  are
349                   separated by a comma. In this  mode,  no  context  is  shown.                   separated  by  a  comma.  In  this mode, no context is shown.
350                   That  is, the -A, -B, and -C options are ignored. If there is                   That is, the -A, -B, and -C options are ignored. If there  is
351                   more than one match in a line, each of them  is  shown  sepa-                   more  than  one  match in a line, each of them is shown sepa-
352                   rately. This option is mutually exclusive with --file-offsets                   rately. This option is mutually exclusive with --file-offsets
353                   and --only-matching.                   and --only-matching.
354    
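The output format described for --line-offsets can be illustrated with a short Python sketch. It is not pcregrep's code: it uses Python's re module in place of PCRE, the function name line_offsets is hypothetical, lines are split on "\n" only, and offsets are assumed to count from zero.

```python
import re


def line_offsets(pattern, text):
    """Return one "line:offset,length" string per match, in order."""
    regex = re.compile(pattern)
    results = []
    # Line numbers start at 1, as with the -n option.
    for number, line in enumerate(text.split("\n"), start=1):
        # Every match in a line is reported separately.
        for m in regex.finditer(line):
            # Assumption: the offset is zero-based from the start of the line.
            results.append(f"{number}:{m.start()},{m.end() - m.start()}")
    return results


if __name__ == "__main__":
    for item in line_offsets(r"\d+", "ab12\nno\n7 and 345"):
        print(item)
```

A line with two matches yields two entries with the same line number, and no context lines are ever produced.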
355         --locale=locale-name         --locale=locale-name
356                   This option specifies a locale to be used for pattern  match-                   This  option specifies a locale to be used for pattern match-
357                   ing.  It  overrides the value in the LC_ALL or LC_CTYPE envi-                   ing. It overrides the value in the LC_ALL or  LC_CTYPE  envi-
358                   ronment variables.  If  no  locale  is  specified,  the  PCRE                   ronment  variables.  If  no  locale  is  specified,  the PCRE
359                   library's  default (usually the "C" locale) is used. There is                   library's default (usually the "C" locale) is used. There  is
360                   no short form for this option.                   no short form for this option.
361    
362         --match-limit=number         --match-limit=number
363                   Processing some regular expression  patterns  can  require  a                   Processing  some  regular  expression  patterns can require a
364                   very  large amount of memory, leading in some cases to a pro-                   very large amount of memory, leading in some cases to a  pro-
365                   gram crash if not enough is available.   Other  patterns  may                   gram  crash  if  not enough is available.  Other patterns may
366                   take  a  very  long  time to search for all possible matching                   take a very long time to search  for  all  possible  matching
367                   strings. The pcre_exec() function that is called by  pcregrep                   strings.  The pcre_exec() function that is called by pcregrep
368                   to  do  the  matching  has  two parameters that can limit the                   to do the matching has two  parameters  that  can  limit  the
369                   resources that it uses.                   resources that it uses.
370    
371                   The  --match-limit  option  provides  a  means  of   limiting                   The   --match-limit  option  provides  a  means  of  limiting
372                   resource usage when processing patterns that are not going to                   resource usage when processing patterns that are not going to
373                   match, but which have a very large number of possibilities in                   match, but which have a very large number of possibilities in
374                   their  search  trees.  The  classic example is a pattern that                   their search trees. The classic example  is  a  pattern  that
375                   uses nested unlimited repeats. Internally, PCRE uses a  func-                   uses  nested unlimited repeats. Internally, PCRE uses a func-
376                   tion  called  match()  which  it  calls repeatedly (sometimes                   tion called match()  which  it  calls  repeatedly  (sometimes
377                   recursively). The limit set by --match-limit  is  imposed  on                   recursively).  The  limit  set by --match-limit is imposed on
378                   the  number  of times this function is called during a match,                   the number of times this function is called during  a  match,
379                   which has the effect of limiting the amount  of  backtracking                   which  has  the effect of limiting the amount of backtracking
380                   that can take place.                   that can take place.
381    
382                   The --recursion-limit option is similar to --match-limit, but                   The --recursion-limit option is similar to --match-limit, but
383                   instead of limiting the total number of times that match() is                   instead of limiting the total number of times that match() is
384                   called, it limits the depth of recursive calls, which in turn                   called, it limits the depth of recursive calls, which in turn
385                   limits the amount of memory that can be used.  The  recursion                   limits  the  amount of memory that can be used. The recursion
386                   depth  is  a  smaller  number than the total number of calls,                   depth is a smaller number than the  total  number  of  calls,
387                   because not all calls to match() are recursive. This limit is                   because not all calls to match() are recursive. This limit is
388                   of use only if it is set smaller than --match-limit.                   of use only if it is set smaller than --match-limit.
389    
390                   There  are no short forms for these options. The default set-                   There are no short forms for these options. The default  set-
391                   tings are specified when the PCRE library is  compiled,  with                   tings  are  specified when the PCRE library is compiled, with
392                   the default default being 10 million.                   the default default being 10 million.
393    
394         -M, --multiline         -M, --multiline
395                   Allow  patterns to match more than one line. When this option                   Allow patterns to match more than one line. When this  option
396                   is given, patterns may usefully contain literal newline char-                   is given, patterns may usefully contain literal newline char-
397                   acters  and  internal  occurrences of ^ and $ characters. The                   acters and internal occurrences of ^ and  $  characters.  The
398                   output for a successful match may consist of  more  than  one                   output  for  a  successful match may consist of more than one
399                   line,  the last of which is the one in which the match ended.                   line, the last of which is the one in which the match  ended.
400                   If the matched string ends with a newline sequence the output                   If the matched string ends with a newline sequence the output
401                   ends at the end of that line.                   ends at the end of that line.
402    
403                   When  this option is set, the PCRE library is called in "mul-                   When this option is set, the PCRE library is called in  "mul-
404                   tiline" mode.  There is a limit to the number of  lines  that                   tiline"  mode.   There is a limit to the number of lines that
405                   can  be matched, imposed by the way that pcregrep buffers the                   can be matched, imposed by the way that pcregrep buffers  the
406                   input file as it scans it. However, pcregrep ensures that  at                   input  file as it scans it. However, pcregrep ensures that at
407                   least 8K characters or the rest of the document (whichever is                   least 8K characters or the rest of the document (whichever is
408                   the shorter) are available for forward  matching,  and  simi-                   the  shorter)  are  available for forward matching, and simi-
409                   larly the previous 8K characters (or all the previous charac-                   larly the previous 8K characters (or all the previous charac-
410                   ters, if fewer than 8K) are guaranteed to  be  available  for                   ters,  if  fewer  than 8K) are guaranteed to be available for
411                   lookbehind  assertions.  This option does not work when input                   lookbehind assertions. This option does not work  when  input
412                   is read line by line (see --line-buffered.)                   is read line by line (see --line-buffered.)
413    
414         -N newline-type, --newline=newline-type         -N newline-type, --newline=newline-type
415                   The PCRE library  supports  five  different  conventions  for                   The  PCRE  library  supports  five  different conventions for
416                   indicating  the  ends of lines. They are the single-character                   indicating the ends of lines. They are  the  single-character
417                   sequences CR (carriage return) and LF  (linefeed),  the  two-                   sequences  CR  (carriage  return) and LF (linefeed), the two-
418                   character  sequence CRLF, an "anycrlf" convention, which rec-                   character sequence CRLF, an "anycrlf" convention, which  rec-
419                   ognizes any of the preceding three types, and an  "any"  con-                   ognizes  any  of the preceding three types, and an "any" con-
420                   vention, in which any Unicode line ending sequence is assumed                   vention, in which any Unicode line ending sequence is assumed
421                   to end a line. The Unicode sequences are the three just  men-                   to  end a line. The Unicode sequences are the three just men-
422                   tioned,   plus  VT  (vertical  tab,  U+000B),  FF  (formfeed,                   tioned, plus  VT  (vertical  tab,  U+000B),  FF  (form  feed,
423                   U+000C),  NEL  (next  line,  U+0085),  LS  (line   separator,                   U+000C),   NEL  (next  line,  U+0085),  LS  (line  separator,
424                   U+2028), and PS (paragraph separator, U+2029).                   U+2028), and PS (paragraph separator, U+2029).
425    
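As an illustration of the five conventions listed above, the following Python sketch splits text into lines under each of them. The table NEWLINE_PATTERNS and the function split_lines are hypothetical names, and the regular expressions are an approximation written for Python's re module, not the logic used inside the PCRE library.

```python
import re

# Hypothetical mapping from the -N option values to Python regexes.
NEWLINE_PATTERNS = {
    "CR": r"\r",
    "LF": r"\n",
    "CRLF": r"\r\n",
    # CRLF is listed first so that it is taken as one ending, not two.
    "ANYCRLF": r"\r\n|\r|\n",
    # The three above plus VT, FF, NEL, LS and PS.
    "ANY": r"\r\n|[\r\n\x0b\x0c\x85\u2028\u2029]",
}


def split_lines(text, convention):
    """Split text into lines using the named newline convention."""
    parts = re.split(NEWLINE_PATTERNS[convention], text)
    # A trailing line ending does not start another (empty) line.
    if parts and parts[-1] == "":
        parts.pop()
    return parts


if __name__ == "__main__":
    print(split_lines("a\rb\nc\r\nd", "ANYCRLF"))
    print(split_lines("a\r\nb\n", "LF"))
```

The second call shows the kind of mismatch the text warns about: scanning CRLF data with the LF convention leaves a stray CR at the end of each line.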
426                   When  the  PCRE  library  is  built,  a  default  line-ending                   When  the  PCRE  library  is  built,  a  default  line-ending
427                   sequence  is  specified.   This  is  normally  the   standard                   sequence   is  specified.   This  is  normally  the  standard
428                   sequence for the operating system. Unless otherwise specified                   sequence for the operating system. Unless otherwise specified
429                   by this option, pcregrep uses  the  library's  default.   The                   by  this  option,  pcregrep  uses the library's default.  The
430                   possible values for this option are CR, LF, CRLF, ANYCRLF, or                   possible values for this option are CR, LF, CRLF, ANYCRLF, or
431                   ANY. This makes it possible to use  pcregrep  on  files  that                   ANY.  This  makes  it  possible to use pcregrep on files that
432                   have  come  from  other environments without having to modify                   have come from other environments without  having  to  modify
433                   their line endings. If the data that is  being  scanned  does                   their  line  endings.  If the data that is being scanned does
434                   not  agree  with  the convention set by this option, pcregrep                   not agree with the convention set by  this  option,  pcregrep
435                   may behave in strange ways.                   may behave in strange ways.
436    
437         -n, --line-number         -n, --line-number
438                   Precede each output line by its line number in the file, fol-                   Precede each output line by its line number in the file, fol-
439                   lowed  by  a colon for matching lines or a hyphen for context                   lowed by a colon for matching lines or a hyphen  for  context
440                   lines. If the filename is also being output, it precedes  the                   lines.  If the filename is also being output, it precedes the
441                   line number. This option is forced if --line-offsets is used.                   line number. This option is forced if --line-offsets is used.
442    
443         -o, --only-matching         -o, --only-matching
444                   Show only the part of the line that matched a pattern instead                   Show only the part of the line that matched a pattern instead
445                   of the whole line. In this mode, no context  is  shown.  That                   of  the  whole  line. In this mode, no context is shown. That
446                   is,  the -A, -B, and -C options are ignored. If there is more                   is, the -A, -B, and -C options are ignored. If there is  more
447                   than one match in a line, each of them is  shown  separately.                   than  one  match in a line, each of them is shown separately.
448                   If  -o  is combined with -v (invert the sense of the match to                   If -o is combined with -v (invert the sense of the  match  to
449                   find non-matching lines), no output  is  generated,  but  the                   find  non-matching  lines),  no  output is generated, but the
450                   return  code  is set appropriately. If the matched portion of                   return code is set appropriately. If the matched  portion  of
451                   the line is empty, nothing is output unless the file name  or                   the  line is empty, nothing is output unless the file name or
452                   line  number  are being printed, in which case they are shown                   line number are being printed, in which case they  are  shown
453                   on an otherwise empty line. This option is mutually exclusive                   on an otherwise empty line. This option is mutually exclusive
454                   with --file-offsets and --line-offsets.                   with --file-offsets and --line-offsets.
455    
456         -onumber, --only-matching=number         -onumber, --only-matching=number
457                   Show  only  the  part  of the line that matched the capturing                   Show only the part of the line  that  matched  the  capturing
458                   parentheses of the given number. Up to 32 capturing parenthe-                   parentheses of the given number. Up to 32 capturing parenthe-
459                   ses are supported. Because these options can be given without                   ses are supported. Because these options can be given without
460                   an argument (see above), if an argument is present,  it  must                   an  argument  (see above), if an argument is present, it must
461                   be  given in the same shell item, for example, -o3 or --only-                   be given in the same shell item, for example, -o3 or  --only-
462                   matching=2. The comments  given  for  the  non-argument  case                   matching=2.  The  comments  given  for  the non-argument case
463                   above  also  apply  to  this case. If the specified capturing                   above also apply to this case.  If  the  specified  capturing
464                   parentheses do not exist in the pattern, or were not  set  in                   parentheses  do  not exist in the pattern, or were not set in
465                   the  match,  nothing  is  output unless the file name or line                   the match, nothing is output unless the  file  name  or  line
466                   number are being printed.                   number are being printed.
467    
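The behaviour of -o with and without a capturing parentheses number can be sketched in Python as follows. This is an approximation that uses Python's re module in place of PCRE; the function name only_matching is hypothetical, and the case where a file name or line number is also being printed is not modelled.

```python
import re


def only_matching(pattern, line, group=0):
    """Return the text that -o (group 0) or -onumber would show for a line."""
    regex = re.compile(pattern)
    # Capturing parentheses that do not exist in the pattern: no output.
    if group > regex.groups:
        return []
    shown = []
    # Each match in the line is shown separately.
    for m in regex.finditer(line):
        part = m.group(group)
        # Unset groups (None) and empty matches produce nothing.
        if part:
            shown.append(part)
    return shown


if __name__ == "__main__":
    print(only_matching(r"(\d+)-(\d+)", "10-20 30-40", 2))
    print(only_matching(r"\d+", "a1b22"))
```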
468         -q, --quiet         -q, --quiet
469                   Work quietly, that is, display nothing except error messages.                   Work quietly, that is, display nothing except error messages.
470                   The  exit  status  indicates  whether or not any matches were                   The exit status indicates whether or  not  any  matches  were
471                   found.                   found.
472    
473         -r, --recursive         -r, --recursive
474                   If any given path is a directory, recursively scan the  files                   If  any given path is a directory, recursively scan the files
475                   it  contains, taking note of any --include and --exclude set-                   it contains, taking note of any --include and --exclude  set-
476                   tings. By default, a directory is read as a normal  file;  in                   tings.  By  default, a directory is read as a normal file; in
477                   some  operating  systems this gives an immediate end-of-file.                   some operating systems this gives an  immediate  end-of-file.
478                   This option is a shorthand  for  setting  the  -d  option  to                   This  option  is  a  shorthand  for  setting the -d option to
479                   "recurse".                   "recurse".
480    
481         --recursion-limit=number         --recursion-limit=number
482                   See --match-limit above.                   See --match-limit above.
483    
484         -s, --no-messages         -s, --no-messages
485                   Suppress  error  messages  about  non-existent  or unreadable                   Suppress error  messages  about  non-existent  or  unreadable
486                   files. Such files are quietly skipped.  However,  the  return                   files.  Such  files  are quietly skipped. However, the return
487                   code is still 2, even if matches were found in other files.                   code is still 2, even if matches were found in other files.
488    
489         -u, --utf-8         -u, --utf-8
490                   Operate  in UTF-8 mode. This option is available only if PCRE                   Operate in UTF-8 mode. This option is available only if  PCRE
491                   has been compiled with UTF-8 support. Both patterns and  sub-                   has  been compiled with UTF-8 support. Both patterns and sub-
492                   ject lines must be valid strings of UTF-8 characters.                   ject lines must be valid strings of UTF-8 characters.
493    
494         -V, --version         -V, --version
495                   Write  the  version  numbers of pcregrep and the PCRE library                   Write the version numbers of pcregrep and  the  PCRE  library
496                   that is being used to the standard error stream.                   that is being used to the standard error stream.
497    
498         -v, --invert-match         -v, --invert-match
499                   Invert the sense of the match, so that  lines  which  do  not                   Invert  the  sense  of  the match, so that lines which do not
500                   match any of the patterns are the ones that are found.                   match any of the patterns are the ones that are found.
501    
502         -w, --word-regex, --word-regexp         -w, --word-regex, --word-regexp
# Line 492  OPTIONS Line 504  OPTIONS
504                   lent to having \b at the start and end of the pattern.                   lent to having \b at the start and end of the pattern.
505    
506         -x, --line-regex, --line-regexp         -x, --line-regex, --line-regexp
507                   Force the patterns to be anchored (each must  start  matching                   Force  the  patterns to be anchored (each must start matching
508                   at  the beginning of a line) and in addition, require them to                   at the beginning of a line) and in addition, require them  to
509                   match entire lines. This is equivalent  to  having  ^  and  $                   match  entire  lines.  This  is  equivalent to having ^ and $
510                   characters at the start and end of each alternative branch in                   characters at the start and end of each alternative branch in
511                   every pattern.                   every pattern.
512    
513    
514  ENVIRONMENT VARIABLES  ENVIRONMENT VARIABLES
515    
516         The environment variables LC_ALL and LC_CTYPE  are  examined,  in  that         The  environment  variables  LC_ALL  and LC_CTYPE are examined, in that
517         order,  for  a  locale.  The first one that is set is used. This can be         order, for a locale. The first one that is set is  used.  This  can  be
518         overridden by the --locale option.  If  no  locale  is  set,  the  PCRE         overridden  by  the  --locale  option.  If  no  locale is set, the PCRE
519         library's default (usually the "C" locale) is used.         library's default (usually the "C" locale) is used.
520    
521    
522  NEWLINES  NEWLINES
523    
524         The  -N (--newline) option allows pcregrep to scan files with different         The -N (--newline) option allows pcregrep to scan files with  different
525         newline conventions from the default.  However,  the  setting  of  this         newline  conventions  from  the  default.  However, the setting of this
526         option  does not affect the way in which pcregrep writes information to         option does not affect the way in which pcregrep writes information  to
527         the standard error and output streams. It uses the  string  "\n"  in  C         the  standard  error  and  output streams. It uses the string "\n" in C
528         printf()  calls  to  indicate newlines, relying on the C I/O library to         printf() calls to indicate newlines, relying on the C  I/O  library  to
529         convert this to an appropriate sequence if the  output  is  sent  to  a         convert  this  to  an  appropriate  sequence if the output is sent to a
530         file.         file.
531    
532    
533  OPTIONS COMPATIBILITY  OPTIONS COMPATIBILITY
534    
535         Many  of the short and long forms of pcregrep's options are the same as         Many of the short and long forms of pcregrep's options are the same  as
536         in the GNU grep program (version 2.5.4). Any long option  of  the  form         in  the  GNU  grep program (version 2.5.4). Any long option of the form
537         --xxx-regexp  (GNU  terminology) is also available as --xxx-regex (PCRE         --xxx-regexp (GNU terminology) is also available as  --xxx-regex  (PCRE
538         terminology). However, the --file-offsets,  --include-dir,  --line-off-         terminology).  However,  the --file-offsets, --include-dir, --line-off-
539         sets, --locale, --match-limit, -M, --multiline, -N, --newline, --recur-         sets, --locale, --match-limit, -M, --multiline, -N, --newline, --recur-
540         sion-limit, -u, and --utf-8 options are specific to pcregrep, as is the         sion-limit, -u, and --utf-8 options are specific to pcregrep, as is the
541         use of the --only-matching option with a capturing parentheses number.         use of the --only-matching option with a capturing parentheses number.
542    
543         Although  most  of the common options work the same way, a few are dif-         Although most of the common options work the same way, a few  are  dif-
544         ferent in pcregrep. For example, the --include option's argument  is  a         ferent  in  pcregrep. For example, the --include option's argument is a
545         glob  for  GNU grep, but a regular expression for pcregrep. If both the         glob for GNU grep, but a regular expression for pcregrep. If  both  the
546         -c and -l options are given, GNU grep lists only  file  names,  without         -c  and  -l  options are given, GNU grep lists only file names, without
547         counts, but pcregrep gives the counts.         counts, but pcregrep gives the counts.
548    
549    
550  OPTIONS WITH DATA  OPTIONS WITH DATA
551    
552         There are four different ways in which an option with data can be spec-         There are four different ways in which an option with data can be spec-
553         ified.  If a short form option is used, the  data  may  follow  immedi-         ified.   If  a  short  form option is used, the data may follow immedi-
554         ately, or (with one exception) in the next command line item. For exam-         ately, or (with one exception) in the next command line item. For exam-
555         ple:         ple:
556    
557           -f/some/file           -f/some/file
558           -f /some/file           -f /some/file
559    
560         The exception is the -o option, which may appear with or without  data.         The  exception is the -o option, which may appear with or without data.
561         Because  of this, if data is present, it must follow immediately in the         Because of this, if data is present, it must follow immediately in  the
562         same item, for example -o3.         same item, for example -o3.
563    
564         If a long form option is used, the data may appear in the same  command         If  a long form option is used, the data may appear in the same command
565         line  item,  separated by an equals character, or (with two exceptions)         line item, separated by an equals character, or (with  two  exceptions)
566         it may appear in the next command line item. For example:         it may appear in the next command line item. For example:
567    
568           --file=/some/file           --file=/some/file
569           --file /some/file           --file /some/file
570    
571         Note, however, that if you want to supply a file name beginning with  ~         Note,  however, that if you want to supply a file name beginning with ~
572         as  data  in  a  shell  command,  and have the shell expand ~ to a home         as data in a shell command, and have the  shell  expand  ~  to  a  home
573         directory, you must separate the file name from the option, because the         directory, you must separate the file name from the option, because the
574         shell does not treat ~ specially unless it is at the start of an item.         shell does not treat ~ specially unless it is at the start of an item.
575    
576         The  exceptions  to the above are the --colour (or --color) and --only-         The exceptions to the above are the --colour (or --color)  and  --only-
577         matching options, for which the data  is  optional.  If  one  of  these         matching  options,  for  which  the  data  is optional. If one of these
578         options  does  have  data, it must be given in the first form, using an         options does have data, it must be given in the first  form,  using  an
579         equals character. Otherwise pcregrep will assume that it has no data.         equals character. Otherwise pcregrep will assume that it has no data.
580    
581    
582  MATCHING ERRORS  MATCHING ERRORS
583    
584         It is possible to supply a regular expression that takes  a  very  long         It  is  possible  to supply a regular expression that takes a very long
585         time  to  fail  to  match certain lines. Such patterns normally involve         time to fail to match certain lines.  Such  patterns  normally  involve
586         nested indefinite repeats, for example: (a+)*\d when matched against  a         nested  indefinite repeats, for example: (a+)*\d when matched against a
587         line  of  a's  with  no  final  digit. The PCRE matching function has a         line of a's with no final digit.  The  PCRE  matching  function  has  a
588         resource limit that causes it to abort in these circumstances. If  this         resource  limit that causes it to abort in these circumstances. If this
589         happens, pcregrep outputs an error message and the line that caused the         happens, pcregrep outputs an error message and the line that caused the
590         problem to the standard error stream. If there are more  than  20  such         problem  to  the  standard error stream. If there are more than 20 such
591         errors, pcregrep gives up.         errors, pcregrep gives up.
592    
593         The  --match-limit  option  of  pcregrep can be used to set the overall         The --match-limit option of pcregrep can be used  to  set  the  overall
594         resource limit; there is a second option called --recursion-limit  that         resource  limit; there is a second option called --recursion-limit that
595         sets  a limit on the amount of memory (usually stack) that is used (see         sets a limit on the amount of memory (usually stack) that is used  (see
596         the discussion of these options above).         the discussion of these options above).
597    
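To give a feel for why a pattern such as (a+)*\d is expensive on a line of a's, and how a limit on calls cuts the search short, here is a toy backtracking matcher in Python. It is hard-coded for that one pattern, tries a match only at the start of the subject, and is not PCRE's match() function; the names count_match_calls and MatchLimitExceeded are hypothetical.

```python
class MatchLimitExceeded(Exception):
    """Raised when the toy matcher is called more often than allowed."""


def count_match_calls(subject, limit):
    """Match (a+)*\\d at the start of subject; return (matched, calls)."""
    calls = 0

    def match(pos):
        nonlocal calls
        calls += 1
        if calls > limit:
            raise MatchLimitExceeded(f"more than {limit} calls")
        # Find how far a run of a's extends from this position.
        end = pos
        while end < len(subject) and subject[end] == "a":
            end += 1
        # Try every possible length for one more (a+) group, longest first.
        for stop in range(end, pos, -1):
            if match(stop):
                return True
        # No further group: the pattern now requires a digit.
        return pos < len(subject) and subject[pos].isdigit()

    return match(0), calls


if __name__ == "__main__":
    print(count_match_calls("aaa1", 100))
    print(count_match_calls("a" * 10, 10**6))
    try:
        count_match_calls("a" * 30, 10**5)
    except MatchLimitExceeded as err:
        print("gave up:", err)
```

In this toy, each extra "a" in a non-matching subject doubles the number of calls, so a call limit turns an impractically long search into a prompt error.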
598    
599  DIAGNOSTICS  DIAGNOSTICS
600    
601         Exit status is 0 if any matches were found, 1 if no matches were found,         Exit status is 0 if any matches were found, 1 if no matches were found,
602         and  2 for syntax errors and non-existent or inacessible files (even if         and 2 for syntax errors, overlong lines, non-existent  or  inaccessible
603         matches were found in other files) or too many matching  errors.  Using         files  (even if matches were found in other files) or too many matching
604         the  -s  option to suppress error messages about inaccessble files does         errors. Using the -s option to suppress error messages about inaccessi-
605         not affect the return code.         ble files does not affect the return code.
606    
607    
608  SEE ALSO  SEE ALSO
# Line 607  AUTHOR Line 619  AUTHOR
619    
620  REVISION  REVISION
621    
622         Last updated: 14 January 2011         Last updated: 30 July 2011
623         Copyright (c) 1997-2011 University of Cambridge.         Copyright (c) 1997-2011 University of Cambridge.
