Diff of /code/trunk/doc/pcregrep.txt

revision 1193 by ph10, Sat Mar 31 18:09:26 2012 UTC revision 1194 by ph10, Wed Oct 31 17:42:29 2012 UTC
# Line 26  DESCRIPTION Line 26  DESCRIPTION
26         with  slashes,  as  is common in Perl scripts), they are interpreted as         with  slashes,  as  is common in Perl scripts), they are interpreted as
27         part of the pattern. Quotes can of course be used to  delimit  patterns         part of the pattern. Quotes can of course be used to  delimit  patterns
28         on  the  command  line  because  they are interpreted by the shell, and         on  the  command  line  because  they are interpreted by the shell, and
29         indeed they are required if a pattern contains  white  space  or  shell         indeed quotes are required if a pattern contains white space  or  shell
30         metacharacters.         metacharacters.
31    
32         The  first  argument that follows any option settings is treated as the         The  first  argument that follows any option settings is treated as the
# Line 56  DESCRIPTION Line 56  DESCRIPTION
56         times  this  size  is used (to allow for buffering "before" and "after"         times  this  size  is used (to allow for buffering "before" and "after"
57         lines). An error occurs if a line overflows the buffer.         lines). An error occurs if a line overflows the buffer.
58    
59         Patterns are limited to 8K or BUFSIZ bytes, whichever is  the  greater.         Patterns can be no longer than 8K or BUFSIZ  bytes,  whichever  is  the
60         BUFSIZ  is  defined  in  <stdio.h>. When there is more than one pattern         greater.   BUFSIZ  is defined in <stdio.h>. When there is more than one
61         (specified by the use of -e and/or -f), each pattern is applied to each         pattern (specified by the use of -e and/or -f), each pattern is applied
62         line  in  the  order  in which they are defined, except that all the -e         to  each  line  in the order in which they are defined, except that all
63         patterns are tried before the -f patterns.         the -e patterns are tried before the -f patterns.
64    
65         By default, as soon as one pattern matches (or fails to match  when  -v         By default, as soon as one pattern matches a line, no further  patterns
66         is  used), no further patterns are considered. However, if --colour (or         are considered. However, if --colour (or --color) is used to colour the
67         --color) is used to colour the matching substrings, or if --only-match-         matching substrings, or if --only-matching, --file-offsets, or  --line-
68         ing,  --file-offsets, or --line-offsets is used to output only the part         offsets  is  used  to  output  only  the  part of the line that matched
69         of the line that matched (either shown literally,  or  as  an  offset),         (either shown literally, or as an offset), scanning resumes immediately
70         scanning  resumes  immediately  following  the  match,  so that further         following  the  match,  so that further matches on the same line can be
71         matches on the same line can be found. If there are multiple  patterns,         found. If there are multiple  patterns,  they  are  all  tried  on  the
72         they are all tried on the remainder of the line, but patterns that fol-         remainder  of  the  line, but patterns that follow the one that matched
73         low the one that matched are not tried on the earlier part of the line.         are not tried on the earlier part of the line.
74    
75         This is the same behaviour as GNU grep, but it does mean that the order         This behaviour means that the order  in  which  multiple  patterns  are
76         in which multiple patterns are specified can affect the output when one         specified  can affect the output when one of the above options is used.
77         of the above options is used.         This is no longer the same behaviour as GNU grep, which now manages  to
78           display  earlier  matches  for  later  patterns (as long as there is no
79           overlap).
80    
81         Patterns that can match an empty string are accepted, but empty  string         Patterns that can match an empty string are accepted, but empty  string
82         matches   are   never   recognized.   An   example   is   the   pattern         matches   are   never   recognized.   An   example   is   the   pattern
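
The scanning rule changed in the hunk above (resume after each match, try the patterns in order, never revisit the earlier part of the line) can be illustrated with a small Python sketch. This is not pcregrep's code: the function name `only_matching` is hypothetical, Python's `re` stands in for PCRE, and the model assumes that at each resume point the first pattern in the list that matches anywhere in the remainder wins.

```python
import re


def only_matching(line, patterns):
    """Return the substrings an --only-matching style scan would report.

    Assumed model (a simplification, not pcregrep's implementation):
    at each resume point the patterns are tried in the given order and
    the first one that matches somewhere in the remainder is reported.
    """
    compiled = [re.compile(p) for p in patterns]
    found = []
    pos = 0
    while pos <= len(line):
        match = None
        for regex in compiled:
            candidate = regex.search(line, pos)
            # Simplification: an empty match counts as "no match" for
            # this pattern, since empty matches are never recognized.
            if candidate and candidate.end() > candidate.start():
                match = candidate
                break
        if match is None:
            break
        found.append(match.group(0))
        # Scanning resumes immediately following the match, so text
        # before this point is never examined again.
        pos = match.end()
    return found


if __name__ == "__main__":
    line = "Y then X"
    print(only_matching(line, ["X", "Y"]))  # X is tried first
    print(only_matching(line, ["Y", "X"]))  # Y is tried first
    print(only_matching(line, ["X|Y"]))     # one pattern with alternatives
```

Under this model, giving X before Y reports only X for the line "Y then X", because the scan resumes after X and the earlier Y is never revisited. Giving Y first, or using the single pattern X|Y, reports both.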
# Line 112  OPTIONS Line 114  OPTIONS
114         The order in which some of the options appear can  affect  the  output.         The order in which some of the options appear can  affect  the  output.
115         For  example,  both  the  -h and -l options affect the printing of file         For  example,  both  the  -h and -l options affect the printing of file
116         names. Whichever comes later in the command line will be the  one  that         names. Whichever comes later in the command line will be the  one  that
117         takes  effect.  Numerical values for options may be followed by K or M,         takes  effect.  Similarly,  except  where  noted below, if an option is
118         to signify multiplication by 1024 or 1024*1024 respectively.         given twice, the later setting is used. Numerical  values  for  options
119           may  be  followed  by  K  or  M,  to  signify multiplication by 1024 or
120           1024*1024 respectively.
121    
122         --        This terminates the list of options. It is useful if the next         --        This terminates the list of options. It is useful if the next
123                   item  on  the command line starts with a hyphen but is not an                   item  on  the command line starts with a hyphen but is not an
# Line 208  OPTIONS Line 212  OPTIONS
212    
213         -d action, --directories=action         -d action, --directories=action
214                   If an input path is a directory, "action" specifies how it is                   If an input path is a directory, "action" specifies how it is
215                   to be processed.  Valid  values  are  "read"  (the  default),                   to be processed.  Valid values are  "read"  (the  default  in
216                   "recurse"  (equivalent to the -r option), or "skip" (silently                   non-Windows  environments,  for compatibility with GNU grep),
217                   skip the path). In the default case, directories are read  as                   "recurse" (equivalent to the -r option), or "skip"  (silently
218                   if  they  were  ordinary files. In some operating systems the                   skip  the  path, the default in Windows environments). In the
219                   effect of reading a directory like this is an immediate  end-                   "read" case, directories are read as if  they  were  ordinary
220                   of-file.                   files.  In  some  operating  systems  the effect of reading a
221                     directory like this is an immediate end-of-file; in others it
222                     may provoke an error.
223    
224         -e pattern, --regex=pattern, --regexp=pattern         -e pattern, --regex=pattern, --regexp=pattern
225                   Specify a pattern to be matched. This option can be used mul-                   Specify a pattern to be matched. This option can be used mul-
# Line 221  OPTIONS Line 227  OPTIONS
227                   be  used  as a way of specifying a single pattern that starts                   be  used  as a way of specifying a single pattern that starts
228                   with a hyphen. When -e is used, no argument pattern is  taken                   with a hyphen. When -e is used, no argument pattern is  taken
229                   from  the  command  line;  all  arguments are treated as file                   from  the  command  line;  all  arguments are treated as file
230                   names. There is an overall maximum of 100 patterns. They  are                   names. There is no limit to the number of patterns. They  are
231                   applied  to  each line in the order in which they are defined                   applied  to  each line in the order in which they are defined
232                   until one matches (or fails to match if -v is used). If -f is                   until one matches.
233                   used  with  -e,  the command line patterns are matched first,  
234                   followed by the patterns from the file,  independent  of  the                   If -f is used with -e, the command line patterns are  matched
235                   order  in which these options are specified. Note that multi-                   first, followed by the patterns from the file(s), independent
236                   ple use of -e is not the same as a single pattern with alter-                   of the order in which these options are specified. Note  that
237                   natives. For example, X|Y finds the first character in a line                   multiple  use  of -e is not the same as a single pattern with
238                   that is X or Y, whereas if the two patterns are  given  sepa-                   alternatives. For example, X|Y finds the first character in a
239                   rately, pcregrep finds X if it is present, even if it follows                   line  that  is  X or Y, whereas if the two patterns are given
240                   Y in the line. It finds Y only if there is no X in the  line.                   separately, with X first, pcregrep finds X if it is  present,
241                   This  really  matters  only  if  you are using -o to show the                   even if it follows Y in the line. It finds Y only if there is
242                   part(s) of the line that matched.                   no X in the line. This matters only if you are  using  -o  or
243                     --colo(u)r to show the part(s) of the line that matched.
244    
245         --exclude=pattern         --exclude=pattern
246                   When pcregrep is searching the files in a directory as a con-                   Files (but not directories) whose names match the pattern are
247                   sequence  of  the  -r  (recursive search) option, any regular                   skipped without being processed. This applies to  all  files,
248                   files whose names match the pattern are excluded. Subdirecto-                   whether  listed  on  the  command line, obtained from --file-
249                   ries  are  not  excluded  by  this  option; they are searched                   list, or by scanning a directory. The pattern is a PCRE regu-
250                   recursively, subject to the --exclude-dir  and  --include_dir                   lar expression, and is matched against the final component of
251                   options.  The  pattern  is  a PCRE regular expression, and is                   the file name, not the  entire  path.  The  -F,  -w,  and  -x
252                   matched against the final component of the file name (not the                   options do not apply to this pattern. The option may be given
253                   entire  path).  If  a  file  name  matches both --include and                   any number of times in order to specify multiple patterns. If
254                   --exclude, it is excluded.  There is no short form  for  this                   a  file  name matches both an --include and an --exclude pat-
255                   option.                   tern, it is excluded. There is no short form for this option.
256    
257           --exclude-from=filename
258                     Treat each non-empty line of the file  as  the  data  for  an
259                     --exclude option. What constitutes a newline when reading the
260                     file is the operating system's default. The --newline  option
261                     has  no  effect on this option. This option may be given more
262                     than once in order to specify a number of files to read.
263    
264         --exclude-dir=pattern         --exclude-dir=pattern
265                   When  pcregrep  is searching the contents of a directory as a                   Directories whose names match the pattern are skipped without
266                   consequence of the -r (recursive search) option,  any  subdi-                   being  processed,  whatever  the  setting  of the --recursive
267                   rectories  whose  names match the pattern are excluded. (Note                   option. This applies to all directories,  whether  listed  on
268                   that the --exclude option does  not  affect  subdirectories.)                   the command line, obtained from --file-list, or by scanning a
269                   The  pattern  is  a  PCRE  regular expression, and is matched                   parent directory. The pattern is a PCRE  regular  expression,
270                   against the final component  of  the  name  (not  the  entire                   and  is  matched against the final component of the directory
271                   path).  If a subdirectory name matches both --include-dir and                   name, not the entire path. The -F, -w, and -x options do  not
272                   --exclude-dir, it is excluded. There is  no  short  form  for                   apply  to this pattern. The option may be given any number of
273                   this option.                   times in order to specify more than one pattern. If a  direc-
274                     tory  matches  both  --include-dir  and  --exclude-dir, it is
275                     excluded. There is no short form for this option.
276    
277         -F, --fixed-strings         -F, --fixed-strings
278                   Interpret  each pattern as a list of fixed strings, separated                   Interpret each data-matching  pattern  as  a  list  of  fixed
279                   by newlines, instead of  as  a  regular  expression.  The  -w                   strings,  separated  by  newlines,  instead  of  as a regular
280                   (match  as  a  word) and -x (match whole line) options can be                   expression. What constitutes a newline for  this  purpose  is
281                   used with -F. They apply to each of the fixed strings. A line                   controlled  by the --newline option. The -w (match as a word)
282                   is selected if any of the fixed strings are found in it (sub-                   and -x (match whole line) options can be used with -F.   They
283                   ject to -w or -x, if present).                   apply to each of the fixed strings. A line is selected if any
284                     of the fixed strings are found in it (subject to -w or -x, if
285                     present).  This  option applies only to the patterns that are
286                     matched against the contents of files; it does not  apply  to
287                     patterns  specified  by  any  of  the  --include or --exclude
288                     options.
289    
290         -f filename, --file=filename         -f filename, --file=filename
291                   Read a number of patterns from the file, one  per  line,  and                   Read patterns from the file, one per  line,  and  match  them
292                   match  them against each line of input. A data line is output                   against  each  line of input. What constitutes a newline when
293                   if any of the patterns match it. The filename can be given as                   reading the file  is  the  operating  system's  default.  The
294                   "-" to refer to the standard input. When -f is used, patterns                   --newline option has no effect on this option. Trailing white
295                   specified on the command line using -e may also  be  present;                   space is removed from each line, and blank lines are ignored.
296                   they are tested before the file's patterns. However, no other                   An  empty  file  contains  no  patterns and therefore matches
297                   pattern is taken from the command  line;  all  arguments  are                   nothing. See also the comments about multiple patterns versus
298                   treated  as  the  names  of paths to be searched. There is an                   a  single  pattern with alternatives in the description of -e
299                   overall maximum of 100  patterns.  Trailing  white  space  is                   above.
300                   removed from each line, and blank lines are ignored. An empty  
301                   file contains no patterns and therefore matches nothing.  See                   If this option is given more than  once,  all  the  specified
302                   also  the  comments  about  multiple patterns versus a single                   files  are read. A data line is output if any of the patterns
303                   pattern with alternatives in the description of -e above.                   match it. A filename can be given as  "-"  to  refer  to  the
304                     standard  input.  When  -f is used, patterns specified on the
305                     command line using -e may also be present;  they  are  tested
306                     before  the  file's  patterns.  However,  no other pattern is
307                     taken from the command line; all arguments are treated as the
308                     names of paths to be searched.
309    
310         --file-list=filename         --file-list=filename
311                   Read a list of files to be searched from the given file,  one                   Read  a  list  of  files  and/or  directories  that are to be
312                   per line. Trailing white space is removed from each line, and                   scanned from the given file, one  per  line.  Trailing  white
313                   blank lines are ignored. These files are searched before  any                   space is removed from each line, and blank lines are ignored.
314                   others  that  may be listed on the command line. The filename                   These paths are processed before any that are listed  on  the
315                   can be given as "-" to refer to the standard input. If --file                   command  line.  The  filename can be given as "-" to refer to
316                   and  --file-list are both specified as "-", patterns are read                   the standard input.  If --file and --file-list are both spec-
317                   first. This is useful only when the standard input is a  ter-                   ified  as  "-",  patterns are read first. This is useful only
318                   minal,  from  which  further lines (the list of files) can be                   when the standard input is a  terminal,  from  which  further
319                   read after an end-of-file indication.                   lines  (the  list  of files) can be read after an end-of-file
320                     indication. If this option is given more than once,  all  the
321                     specified files are read.
322    
323         --file-offsets         --file-offsets
324                   Instead of showing lines or parts of lines that  match,  show                   Instead  of  showing lines or parts of lines that match, show
325                   each  match  as  an  offset  from the start of the file and a                   each match as an offset from the start  of  the  file  and  a
326                   length, separated by a comma. In this  mode,  no  context  is                   length,  separated  by  a  comma. In this mode, no context is
327                   shown.  That  is,  the -A, -B, and -C options are ignored. If                   shown. That is, the -A, -B, and -C options  are  ignored.  If
328                   there is more than one match in a line, each of them is shown                   there is more than one match in a line, each of them is shown
329                   separately.  This  option  is mutually exclusive with --line-                   separately. This option is mutually  exclusive  with  --line-
330                   offsets and --only-matching.                   offsets and --only-matching.
331    
332         -H, --with-filename         -H, --with-filename
333                   Force the inclusion of the filename at the  start  of  output                   Force  the  inclusion  of the filename at the start of output
334                   lines  when searching a single file. By default, the filename                   lines when searching a single file. By default, the  filename
335                   is not shown in this case. For matching lines,  the  filename                   is  not  shown in this case. For matching lines, the filename
336                   is followed by a colon; for context lines, a hyphen separator                   is followed by a colon; for context lines, a hyphen separator
337                   is used. If a line number is also being  output,  it  follows                   is  used.  If  a line number is also being output, it follows
338                   the file name.                   the file name.
339    
340         -h, --no-filename         -h, --no-filename
341                   Suppress  the output filenames when searching multiple files.                   Suppress the output filenames when searching multiple  files.
342                   By default, filenames  are  shown  when  multiple  files  are                   By  default,  filenames  are  shown  when  multiple files are
343                   searched.  For  matching lines, the filename is followed by a                   searched. For matching lines, the filename is followed  by  a
344                   colon; for context lines, a hyphen separator is used.   If  a                   colon;  for  context lines, a hyphen separator is used.  If a
345                   line number is also being output, it follows the file name.                   line number is also being output, it follows the file name.
346    
347         --help    Output  a  help  message, giving brief details of the command         --help    Output a help message, giving brief details  of  the  command
348                   options and file type support, and then exit.                   options  and  file type support, and then exit. Anything else
349                     on the command line is ignored.
350    
351         -I        Treat binary files as never matching. This is  equivalent  to         -I        Treat binary files as never matching. This is  equivalent  to
352                   --binary-files=without-match.                   --binary-files=without-match.
# Line 326  OPTIONS Line 355  OPTIONS
355                   Ignore upper/lower case distinctions during comparisons.                   Ignore upper/lower case distinctions during comparisons.
356    
357         --include=pattern         --include=pattern
358                   When pcregrep is searching the files in a directory as a con-                   If  any --include patterns are specified, the only files that
359                   sequence of the -r (recursive search) option, only those reg-                   are processed are those that match one of the  patterns  (and
360                   ular files whose names match the pattern are included. Subdi-                   do  not  match  an  --exclude  pattern). This option does not
361                   rectories are always included and searched recursively,  sub-                   affect directories, but it  applies  to  all  files,  whether
362                   ject to the --include-dir and --exclude-dir options. The pat-                   listed  on the command line, obtained from --file-list, or by
363                   tern is a PCRE regular expression, and is matched against the                   scanning a directory. The pattern is a PCRE  regular  expres-
364                   final  component of the file name (not the entire path). If a                   sion,  and is matched against the final component of the file
365                   file  name  matches  both  --include  and  --exclude,  it  is                   name, not the entire path. The -F, -w, and -x options do  not
366                   excluded. There is no short form for this option.                   apply  to this pattern. The option may be given any number of
367                     times. If a file  name  matches  both  an  --include  and  an
368                     --exclude  pattern,  it  is excluded.  There is no short form
369                     for this option.
370    
371           --include-from=filename
372                     Treat each non-empty line of the file  as  the  data  for  an
373                     --include option. What constitutes a newline for this purpose
374                     is the operating system's default. The --newline  option  has
375                     no effect on this option. This option may be given any number
376                     of times; all the files are read.
377    
378         --include-dir=pattern         --include-dir=pattern
379                   When  pcregrep  is searching the contents of a directory as a                   If any --include-dir patterns are specified, the only  direc-
380                   consequence of the -r (recursive search) option,  only  those                   tories  that  are  processed  are those that match one of the
381                   subdirectories  whose  names  match the pattern are included.                   patterns (and do not match an  --exclude-dir  pattern).  This
382                   (Note that the --include option does not  affect  subdirecto-                   applies  to  all  directories,  whether listed on the command
383                   ries.)  The  pattern  is  a  PCRE  regular expression, and is                   line, obtained from --file-list,  or  by  scanning  a  parent
384                   matched against the final component  of  the  name  (not  the                   directory.  The  pattern is a PCRE regular expression, and is
385                   entire  path). If a subdirectory name matches both --include-                   matched against the final component of  the  directory  name,
386                   dir and --exclude-dir, it is excluded. There is no short form                   not  the entire path. The -F, -w, and -x options do not apply
387                   for this option.                   to this pattern. The option may be given any number of times.
388                     If  a directory matches both --include-dir and --exclude-dir,
389                     it is excluded. There is no short form for this option.
390    
391         -L, --files-without-match         -L, --files-without-match
392                   Instead  of  outputting lines from the files, just output the                   Instead of outputting lines from the files, just  output  the
393                   names of the files that do not contain any lines  that  would                   names  of  the files that do not contain any lines that would
394                   have  been  output. Each file name is output once, on a sepa-                   have been output. Each file name is output once, on  a  sepa-
395                   rate line.                   rate line.
396    
397         -l, --files-with-matches         -l, --files-with-matches
398                   Instead of outputting lines from the files, just  output  the                   Instead  of  outputting lines from the files, just output the
399                   names of the files containing lines that would have been out-                   names of the files containing lines that would have been out-
400                   put. Each file name is  output  once,  on  a  separate  line.                   put.  Each  file  name  is  output  once, on a separate line.
401                   Searching  normally stops as soon as a matching line is found                   Searching normally stops as soon as a matching line is  found
402                   in a file. However, if the -c (count) option  is  also  used,                   in  a  file.  However, if the -c (count) option is also used,
403                   matching  continues in order to obtain the correct count, and                   matching continues in order to obtain the correct count,  and
404                   those files that have at least one  match  are  listed  along                   those  files  that  have  at least one match are listed along
405                   with their counts. Using this option with -c is a way of sup-                   with their counts. Using this option with -c is a way of sup-
406                   pressing the listing of files with no matches.                   pressing the listing of files with no matches.
407    
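
The revised --include and --exclude wording in the hunk above amounts to a small decision rule for files. The Python sketch below illustrates it under stated assumptions: `file_is_processed` is a hypothetical helper, Python's `re` stands in for PCRE, and the patterns are assumed to be unanchored searches against the final component of the file name.

```python
import os
import re


def file_is_processed(path, include=(), exclude=()):
    """Decide whether a file would be processed under the r1194 wording.

    Assumptions: patterns are unanchored regular expressions, and only
    the final component of the path is tested, never the entire path.
    """
    name = os.path.basename(path)
    # A name that matches any --exclude pattern is skipped, even if it
    # also matches an --include pattern.
    if any(re.search(p, name) for p in exclude):
        return False
    # If any --include patterns were given, the name must match one.
    if include:
        return any(re.search(p, name) for p in include)
    # With no --include patterns, every non-excluded file is processed.
    return True


if __name__ == "__main__":
    for path in ["src/main.c", "src/main.h", "src/test_main.c"]:
        print(path, file_is_processed(path, [r"\.c$"], [r"^test_"]))
```

Checking the exclusions first makes the "it is excluded" rule for names that match both kinds of pattern hold whatever order the options are given in.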
# Line 370  OPTIONS Line 411  OPTIONS
411                   input)" is used. There is no short form for this option.                   input)" is used. There is no short form for this option.
412    
413         --line-buffered         --line-buffered
414                   When this option is given, input is read and  processed  line                   When  this  option is given, input is read and processed line
415                   by  line,  and  the  output  is  flushed after each write. By                   by line, and the output  is  flushed  after  each  write.  By
416                   default, input is read in large chunks, unless  pcregrep  can                   default,  input  is read in large chunks, unless pcregrep can
417                   determine  that  it is reading from a terminal (which is cur-                   determine that it is reading from a terminal (which  is  cur-
418                   rently possible only in Unix environments). Output to  termi-                   rently  possible  only  in Unix-like environments). Output to
419                   nal  is  normally automatically flushed by the operating sys-                   terminal is normally automatically flushed by  the  operating
420                   tem. This option can be useful when the input  or  output  is                   system. This option can be useful when the input or output is
421                   attached  to a pipe and you do not want pcregrep to buffer up                   attached to a pipe and you do not want pcregrep to buffer  up
422                   large amounts of data. However, its use will  affect  perfor-                   large  amounts  of data. However, its use will affect perfor-
423                   mance, and the -M (multiline) option ceases to work.                   mance, and the -M (multiline) option ceases to work.
424    
425         --line-offsets         --line-offsets
426                   Instead  of  showing lines or parts of lines that match, show                   Instead of showing lines or parts of lines that  match,  show
427                   each match as a line number, the offset from the start of the                   each match as a line number, the offset from the start of the
428                   line,  and a length. The line number is terminated by a colon                   line, and a length. The line number is terminated by a  colon
429                   (as usual; see the -n option), and the offset and length  are                   (as  usual; see the -n option), and the offset and length are
430                   separated  by  a  comma.  In  this mode, no context is shown.                   separated by a comma. In this  mode,  no  context  is  shown.
431                   That is, the -A, -B, and -C options are ignored. If there  is                   That  is, the -A, -B, and -C options are ignored. If there is
432                   more  than  one  match in a line, each of them is shown sepa-                   more than one match in a line, each of them  is  shown  sepa-
433                   rately. This option is mutually exclusive with --file-offsets                   rately. This option is mutually exclusive with --file-offsets
434                   and --only-matching.                   and --only-matching.
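
The --line-offsets output format described above can be sketched in Python. This is only an illustration that uses Python's re module in place of PCRE. The helper name line_offsets is hypothetical, and the sketch assumes that offsets are counted from zero, which the text above does not state.

```python
import re


def line_offsets(pattern, text):
    """Report each match as 'line:offset,length', one string per match."""
    regex = re.compile(pattern)
    results = []
    for line_number, line in enumerate(text.split("\n"), start=1):
        # Every match in a line is reported separately.
        for match in regex.finditer(line):
            length = match.end() - match.start()
            results.append(f"{line_number}:{match.start()},{length}")
    return results


sample = "xxab\nabbb ab\nnone"
for entry in line_offsets(r"ab+", sample):
    print(entry)
```

The second sample line contains two matches, so it produces two entries, and the third line produces none.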
435    
436         --locale=locale-name         --locale=locale-name
437                   This  option specifies a locale to be used for pattern match-                   This option specifies a locale to be used for pattern  match-
438                   ing. It overrides the value in the LC_ALL or  LC_CTYPE  envi-                   ing.  It  overrides the value in the LC_ALL or LC_CTYPE envi-
439                   ronment  variables.  If  no  locale  is  specified,  the PCRE                   ronment variables.  If  no  locale  is  specified,  the  PCRE
440                   library's default (usually the "C" locale) is used. There  is                   library's  default (usually the "C" locale) is used. There is
441                   no short form for this option.                   no short form for this option.
442    
443         --match-limit=number         --match-limit=number
444                   Processing  some  regular  expression  patterns can require a                   Processing some regular expression  patterns  can  require  a
445                   very large amount of memory, leading in some cases to a  pro-                   very  large amount of memory, leading in some cases to a pro-
446                   gram  crash  if  not enough is available.  Other patterns may                   gram crash if not enough is available.   Other  patterns  may
447                   take a very long time to search  for  all  possible  matching                   take  a  very  long  time to search for all possible matching
448                   strings.  The pcre_exec() function that is called by pcregrep                   strings. The pcre_exec() function that is called by  pcregrep
449                   to do the matching has two  parameters  that  can  limit  the                   to  do  the  matching  has  two parameters that can limit the
450                   resources that it uses.                   resources that it uses.
451    
452                   The   --match-limit  option  provides  a  means  of  limiting                   The  --match-limit  option  provides  a  means  of   limiting
453                   resource usage when processing patterns that are not going to                   resource usage when processing patterns that are not going to
454                   match, but which have a very large number of possibilities in                   match, but which have a very large number of possibilities in
455                   their search trees. The classic example  is  a  pattern  that                   their  search  trees.  The  classic example is a pattern that
456                   uses  nested unlimited repeats. Internally, PCRE uses a func-                   uses nested unlimited repeats. Internally, PCRE uses a  func-
457                   tion called match()  which  it  calls  repeatedly  (sometimes                   tion  called  match()  which  it  calls repeatedly (sometimes
458                   recursively).  The  limit  set by --match-limit is imposed on                   recursively). The limit set by --match-limit  is  imposed  on
459                   the number of times this function is called during  a  match,                   the  number  of times this function is called during a match,
460                   which  has  the effect of limiting the amount of backtracking                   which has the effect of limiting the amount  of  backtracking
461                   that can take place.                   that can take place.
462    
463                   The --recursion-limit option is similar to --match-limit, but                   The --recursion-limit option is similar to --match-limit, but
464                   instead of limiting the total number of times that match() is                   instead of limiting the total number of times that match() is
465                   called, it limits the depth of recursive calls, which in turn                   called, it limits the depth of recursive calls, which in turn
466                   limits  the  amount of memory that can be used. The recursion                   limits the amount of memory that can be used.  The  recursion
467                   depth is a smaller number than the  total  number  of  calls,                   depth  is  a  smaller  number than the total number of calls,
468                   because not all calls to match() are recursive. This limit is                   because not all calls to match() are recursive. This limit is
469                   of use only if it is set smaller than --match-limit.                   of use only if it is set smaller than --match-limit.
470    
471                   There are no short forms for these options. The default  set-                   There  are no short forms for these options. The default set-
472                   tings  are  specified when the PCRE library is compiled, with                   tings are specified when the PCRE library is  compiled,  with
473                   the built-in default being 10 million.                   the built-in default being 10 million.
474    
475         -M, --multiline         -M, --multiline
476                   Allow patterns to match more than one line. When this  option                   Allow  patterns to match more than one line. When this option
477                   is given, patterns may usefully contain literal newline char-                   is given, patterns may usefully contain literal newline char-
478                   acters and internal occurrences of ^ and  $  characters.  The                   acters  and  internal  occurrences of ^ and $ characters. The
479                   output  for  a  successful match may consist of more than one                   output for a successful match may consist of  more  than  one
480                   line, the last of which is the one in which the match  ended.                   line,  the last of which is the one in which the match ended.
481                   If the matched string ends with a newline sequence the output                   If the matched string ends with a newline sequence the output
482                   ends at the end of that line.                   ends at the end of that line.
483    
484                   When this option is set, the PCRE library is called in  "mul-                   When  this option is set, the PCRE library is called in "mul-
485                   tiline"  mode.   There is a limit to the number of lines that                   tiline" mode.  There is a limit to the number of  lines  that
486                   can be matched, imposed by the way that pcregrep buffers  the                   can  be matched, imposed by the way that pcregrep buffers the
487                   input  file as it scans it. However, pcregrep ensures that at                   input file as it scans it. However, pcregrep ensures that  at
488                   least 8K characters or the rest of the document (whichever is                   least 8K characters or the rest of the document (whichever is
489                   the  shorter)  are  available for forward matching, and simi-                   the shorter) are available for forward  matching,  and  simi-
490                   larly the previous 8K characters (or all the previous charac-                   larly the previous 8K characters (or all the previous charac-
491                   ters,  if  fewer  than 8K) are guaranteed to be available for                   ters, if fewer than 8K) are guaranteed to  be  available  for
492                   lookbehind assertions. This option does not work  when  input                   lookbehind  assertions.  This option does not work when input
493                   is read line by line (see --line-buffered).                   is read line by line (see --line-buffered).
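
The effect of "multiline" mode on internal ^ and $ characters can be seen in a small Python sketch. Python's re.MULTILINE flag is assumed here to be a reasonable analogue of the PCRE mode. The sample text and pattern are invented for illustration, and nothing here reproduces pcregrep's buffering.

```python
import re

text = "alpha\nbeta\ngamma"
# A pattern with a literal newline between an internal $ and an internal ^.
pattern = r"alpha$\n^beta"

single_line = re.search(pattern, text)
multi_line = re.search(pattern, text, re.MULTILINE)

# Without multiline mode, $ cannot match before the first newline.
print(single_line)
# With multiline mode, the match spans two lines.
print(repr(multi_line.group()))
```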
494    
495         -N newline-type, --newline=newline-type         -N newline-type, --newline=newline-type
496                   The  PCRE  library  supports  five  different conventions for                   The PCRE library  supports  five  different  conventions  for
497                   indicating the ends of lines. They are  the  single-character                   indicating  the  ends of lines. They are the single-character
498                   sequences  CR  (carriage  return) and LF (linefeed), the two-                   sequences CR (carriage return) and LF  (linefeed),  the  two-
499                   character sequence CRLF, an "anycrlf" convention, which  rec-                   character  sequence CRLF, an "anycrlf" convention, which rec-
500                   ognizes  any  of the preceding three types, and an "any" con-                   ognizes any of the preceding three types, and an  "any"  con-
501                   vention, in which any Unicode line ending sequence is assumed                   vention, in which any Unicode line ending sequence is assumed
502                   to  end a line. The Unicode sequences are the three just men-                   to end a line. The Unicode sequences are the three just  men-
503                   tioned, plus  VT  (vertical  tab,  U+000B),  FF  (form  feed,                   tioned,  plus  VT  (vertical  tab,  U+000B),  FF  (form feed,
504                   U+000C),   NEL  (next  line,  U+0085),  LS  (line  separator,                   U+000C),  NEL  (next  line,  U+0085),  LS  (line   separator,
505                   U+2028), and PS (paragraph separator, U+2029).                   U+2028), and PS (paragraph separator, U+2029).
506    
507                   When  the  PCRE  library  is  built,  a  default  line-ending                   When  the  PCRE  library  is  built,  a  default  line-ending
508                   sequence   is  specified.   This  is  normally  the  standard                   sequence  is  specified.   This  is  normally  the   standard
509                   sequence for the operating system. Unless otherwise specified                   sequence for the operating system. Unless otherwise specified
510                   by  this  option,  pcregrep  uses the library's default.  The                   by this option, pcregrep uses  the  library's  default.   The
511                   possible values for this option are CR, LF, CRLF, ANYCRLF, or                   possible values for this option are CR, LF, CRLF, ANYCRLF, or
512                   ANY.  This  makes  it  possible to use pcregrep on files that                   ANY. This makes it possible to use  pcregrep  to  scan  files
513                   have come from other environments without  having  to  modify                   that have come from other environments without having to mod-
514                   their  line  endings.  If the data that is being scanned does                   ify their line endings. If the data  that  is  being  scanned
515                   not agree with the convention set by  this  option,  pcregrep                   does  not agree with the convention set by this option, pcre-
516                   may behave in strange ways.                   grep may behave in strange ways. Note that this  option  does
517                     not  apply  to  files specified by the -f, --exclude-from, or
518                     --include-from options, which are expected to use the operat-
519                     ing system's standard newline sequence.
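
The five conventions can be compared with a short Python sketch that splits text into lines under each one. The table NEWLINE_PATTERNS and the function split_lines are hypothetical names. The sketch shows what each convention accepts as a line ending and is not how PCRE implements them.

```python
import re

# One splitting pattern per convention named in the text above.
NEWLINE_PATTERNS = {
    "CR": r"\r",
    "LF": r"\n",
    "CRLF": r"\r\n",
    "ANYCRLF": r"\r\n|\r|\n",
    # CRLF is tried first so that it counts as a single line ending.
    "ANY": r"\r\n|[\r\n\x0b\x0c\x85\u2028\u2029]",
}


def split_lines(text, convention):
    """Split text into lines according to the named newline convention."""
    return re.split(NEWLINE_PATTERNS[convention], text)


dos_text = "a\r\nb\nc"
print(split_lines(dos_text, "LF"))
print(split_lines(dos_text, "ANYCRLF"))
```

Splitting CRLF data under the LF convention leaves a stray carriage return at the end of the first line, which is one way that data disagreeing with the chosen convention can lead to surprising results.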
520    
521         -n, --line-number         -n, --line-number
522                   Precede each output line by its line number in the file, fol-                   Precede each output line by its line number in the file, fol-
# Line 503  OPTIONS Line 547  OPTIONS
547         -onumber, --only-matching=number         -onumber, --only-matching=number
548                   Show  only  the  part  of the line that matched the capturing                   Show  only  the  part  of the line that matched the capturing
549                   parentheses of the given number. Up to 32 capturing parenthe-                   parentheses of the given number. Up to 32 capturing parenthe-
550                   ses are supported. Because these options can be given without                   ses are supported, and -o0 is equivalent to -o without a num-
551                   an argument (see above), if an argument is present,  it  must                   ber. Because these options can be given without  an  argument
552                   be  given in the same shell item, for example, -o3 or --only-                   (see  above),  if an argument is present, it must be given in
553                   matching=2. The comments  given  for  the  non-argument  case                   the same shell item, for example, -o3  or  --only-matching=2.
554                   above  also  apply  to  this case. If the specified capturing                   The comments given for the non-argument case above also apply
555                   parentheses do not exist in the pattern, or were not  set  in                   to this case. If the specified capturing parentheses  do  not
556                   the  match,  nothing  is  output unless the file name or line                   exist  in  the pattern, or were not set in the match, nothing
557                   number are being printed.                   is output unless the file  name  or  line  number  are  being
558                     printed.
559    
560                     If  this  option is given multiple times, multiple substrings
561                     are output, in the order the options are given. For  example,
562                     -o3 -o1 -o3 causes the substrings matched by capturing paren-
563                     theses 3 and 1 and then 3 again to  be  output.  By  default,
564                     there is no separator (but see the next option).
565    
566           --om-separator=text
567                     Specify  a  separating string for multiple occurrences of -o.
568                     The default is an empty string. Separating strings are  never
569                     coloured.
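
The ordering rule for repeated -o options, together with the separator, can be sketched in Python. The function only_matching is a hypothetical helper that handles just the first match in a line. How pcregrep places separators around unset groups is not stated above, so this sketch simply skips such groups.

```python
import re


def only_matching(pattern, line, group_numbers, separator=""):
    """Join the requested capture groups of the first match, in the order given."""
    match = re.search(pattern, line)
    if match is None:
        return None
    parts = []
    for number in group_numbers:
        # Skip groups that do not exist in the pattern or were not set.
        if number <= match.re.groups and match.group(number) is not None:
            parts.append(match.group(number))
    return separator.join(parts)


line = "id 12-34-56"
pattern = r"(\d+)-(\d+)-(\d+)"
# Mirrors -o3 -o1 -o3, first with the default empty separator.
print(only_matching(pattern, line, [3, 1, 3]))
# Mirrors the same options with a comma as the separating string.
print(only_matching(pattern, line, [3, 1, 3], separator=","))
```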
570    
571         -q, --quiet         -q, --quiet
572                   Work quietly, that is, display nothing except error messages.                   Work quietly, that is, display nothing except error messages.
573                   The  exit  status  indicates  whether or not any matches were                   The exit status indicates whether or  not  any  matches  were
574                   found.                   found.
575    
576         -r, --recursive         -r, --recursive
577                   If any given path is a directory, recursively scan the  files                   If  any given path is a directory, recursively scan the files
578                   it  contains, taking note of any --include and --exclude set-                   it contains, taking note of any --include and --exclude  set-
579                   tings. By default, a directory is read as a normal  file;  in                   tings.  By  default, a directory is read as a normal file; in
580                   some  operating  systems this gives an immediate end-of-file.                   some operating systems this gives an  immediate  end-of-file.
581                   This option is a shorthand  for  setting  the  -d  option  to                   This  option  is  a  shorthand  for  setting the -d option to
582                   "recurse".                   "recurse".
583    
584         --recursion-limit=number         --recursion-limit=number
585                   See --match-limit above.                   See --match-limit above.
586    
587         -s, --no-messages         -s, --no-messages
588                   Suppress  error  messages  about  non-existent  or unreadable                   Suppress error  messages  about  non-existent  or  unreadable
589                   files. Such files are quietly skipped.  However,  the  return                   files.  Such  files  are quietly skipped. However, the return
590                   code is still 2, even if matches were found in other files.                   code is still 2, even if matches were found in other files.
591    
592         -u, --utf-8         -u, --utf-8
593                   Operate  in UTF-8 mode. This option is available only if PCRE                   Operate in UTF-8 mode. This option is available only if  PCRE
594                   has been compiled with UTF-8 support. Both patterns and  sub-                   has been compiled with UTF-8 support. All patterns (including
595                   ject lines must be valid strings of UTF-8 characters.                   those for any --exclude and --include options) and  all  sub-
596                     ject  lines  that  are scanned must be valid strings of UTF-8
597                     characters.
598    
599         -V, --version         -V, --version
600                   Write  the  version  numbers of pcregrep and the PCRE library                   Write the version numbers of pcregrep and the PCRE library to
601                   that is being used to the standard error stream.                   the  standard output and then exit. Anything else on the com-
602                     mand line is ignored.
603    
604         -v, --invert-match         -v, --invert-match
605                   Invert the sense of the match, so that  lines  which  do  not                   Invert the sense of the match, so that  lines  which  do  not
# Line 548  OPTIONS Line 607  OPTIONS
607    
608         -w, --word-regex, --word-regexp         -w, --word-regex, --word-regexp
609                   Force the patterns to match only whole words. This is equiva-                   Force the patterns to match only whole words. This is equiva-
610                   lent to having \b at the start and end of the pattern.                   lent to having \b at the start and end of the  pattern.  This
611                     option  applies only to the patterns that are matched against
612                     the contents of files; it does not apply to  patterns  speci-
613                     fied by any of the --include or --exclude options.
614    
615         -x, --line-regex, --line-regexp         -x, --line-regex, --line-regexp
616                   Force the patterns to be anchored (each must  start  matching                   Force  the  patterns to be anchored (each must start matching
617                   at  the beginning of a line) and in addition, require them to                   at the beginning of a line) and in addition, require them  to
618                   match entire lines. This is equivalent  to  having  ^  and  $                   match  entire  lines.  This  is  equivalent to having ^ and $
619                   characters at the start and end of each alternative branch in                   characters at the start and end of each alternative branch in
620                   every pattern.                   every  pattern. This option applies only to the patterns that
621                     are matched against the contents of files; it does not  apply
622                     to  patterns  specified  by any of the --include or --exclude
623                     options.
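
The equivalences given for -w and -x can be tried out in Python by wrapping a pattern by hand. The helpers word_regex and line_regex are hypothetical. A non-capturing group is used so that every alternative branch is covered, which is an assumption about one convenient way to get the same effect and not a description of pcregrep's own code.

```python
import re


def word_regex(pattern):
    """Wrap a pattern so that it matches only whole words, as -w does."""
    return r"\b(?:" + pattern + r")\b"


def line_regex(pattern):
    """Wrap a pattern so that it must match an entire line, as -x does."""
    return r"^(?:" + pattern + r")$"


print(re.search(word_regex("cat|dog"), "hotdog"))      # no whole-word match
print(re.search(word_regex("cat|dog"), "a dog here"))  # matches "dog"
print(re.search(line_regex("cat|dog"), "dog house"))   # not the entire line
```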
624    
625    
626  ENVIRONMENT VARIABLES  ENVIRONMENT VARIABLES
# Line 569  ENVIRONMENT VARIABLES Line 634  ENVIRONMENT VARIABLES
634  NEWLINES  NEWLINES
635    
636         The  -N (--newline) option allows pcregrep to scan files with different         The  -N (--newline) option allows pcregrep to scan files with different
637         newline conventions from the default.  However,  the  setting  of  this         newline conventions from the default. Any parts of the input files that
638         option  does not affect the way in which pcregrep writes information to         are  written  to the standard output are copied identically, with what-
639         the standard error and output streams. It uses the  string  "\n"  in  C         ever newline sequences they have in the input. However, the setting  of
640         printf()  calls  to  indicate newlines, relying on the C I/O library to         this  option  does  not affect the interpretation of files specified by
641         convert this to an appropriate sequence if the  output  is  sent  to  a         the -f, --exclude-from, or --include-from options, which are assumed to
642         file.         use  the  operating  system's  standard  newline  sequence, nor does it
643           affect the way in which pcregrep writes informational messages  to  the
644           standard error and output streams. For these it uses the string "\n" to
645           indicate newlines, relying on the C I/O library to convert this  to  an
646           appropriate sequence.
647    
648    
649  OPTIONS COMPATIBILITY  OPTIONS COMPATIBILITY
# Line 583  OPTIONS COMPATIBILITY Line 652  OPTIONS COMPATIBILITY
652         in the GNU grep program. Any long option of the form --xxx-regexp  (GNU         in the GNU grep program. Any long option of the form --xxx-regexp  (GNU
653         terminology)  is also available as --xxx-regex (PCRE terminology). How-         terminology)  is also available as --xxx-regex (PCRE terminology). How-
654         ever, the --file-list, --file-offsets,  --include-dir,  --line-offsets,         ever, the --file-list, --file-offsets,  --include-dir,  --line-offsets,
655         --locale,  --match-limit,  -M, --multiline, -N, --newline, --recursion-         --locale,  --match-limit,  -M, --multiline, -N, --newline, --om-separa-
656         limit, -u, and --utf-8 options are specific to pcregrep, as is the  use         tor, --recursion-limit, -u, and --utf-8 options are specific  to  pcre-
657         of the --only-matching option with a capturing parentheses number.         grep,  as  is  the  use  of the --only-matching option with a capturing
658           parentheses number.
659         Although  most  of the common options work the same way, a few are dif-  
660         ferent in pcregrep. For example, the --include option's argument  is  a         Although most of the common options work the same way, a few  are  dif-
661         glob  for  GNU grep, but a regular expression for pcregrep. If both the         ferent  in  pcregrep. For example, the --include option's argument is a
662         -c and -l options are given, GNU grep lists only  file  names,  without         glob for GNU grep, but a regular expression for pcregrep. If  both  the
663           -c  and  -l  options are given, GNU grep lists only file names, without
664         counts, but pcregrep gives the counts.         counts, but pcregrep gives the counts.
665    
666    
667  OPTIONS WITH DATA  OPTIONS WITH DATA
668    
669         There are four different ways in which an option with data can be spec-         There are four different ways in which an option with data can be spec-
670         ified.  If a short form option is used, the  data  may  follow  immedi-         ified.   If  a  short  form option is used, the data may follow immedi-
671         ately, or (with one exception) in the next command line item. For exam-         ately, or (with one exception) in the next command line item. For exam-
672         ple:         ple:
673    
674           -f/some/file           -f/some/file
675           -f /some/file           -f /some/file
676    
677         The exception is the -o option, which may appear with or without  data.         The  exception is the -o option, which may appear with or without data.
678         Because  of this, if data is present, it must follow immediately in the         Because of this, if data is present, it must follow immediately in  the
679         same item, for example -o3.         same item, for example -o3.
680    
681         If a long form option is used, the data may appear in the same  command         If  a long form option is used, the data may appear in the same command
682         line  item,  separated by an equals character, or (with two exceptions)         line item, separated by an equals character, or (with  two  exceptions)
683         it may appear in the next command line item. For example:         it may appear in the next command line item. For example:
684    
685           --file=/some/file           --file=/some/file
686           --file /some/file           --file /some/file
687    
688         Note, however, that if you want to supply a file name beginning with  ~         Note,  however, that if you want to supply a file name beginning with ~
689         as  data  in  a  shell  command,  and have the shell expand ~ to a home         as data in a shell command, and have the  shell  expand  ~  to  a  home
690         directory, you must separate the file name from the option, because the         directory, you must separate the file name from the option, because the
691         shell does not treat ~ specially unless it is at the start of an item.         shell does not treat ~ specially unless it is at the start of an item.
692    
693         The  exceptions  to the above are the --colour (or --color) and --only-         The exceptions to the above are the --colour (or --color)  and  --only-
694         matching options, for which the data  is  optional.  If  one  of  these         matching  options,  for  which  the  data  is optional. If one of these
695         options  does  have  data, it must be given in the first form, using an         options does have data, it must be given in the first  form,  using  an
696         equals character. Otherwise pcregrep will assume that it has no data.         equals character. Otherwise pcregrep will assume that it has no data.
697    
698    
699  MATCHING ERRORS  MATCHING ERRORS
700    
701         It is possible to supply a regular expression that takes  a  very  long         It  is  possible  to supply a regular expression that takes a very long
702         time  to  fail  to  match certain lines. Such patterns normally involve         time to fail to match certain lines.  Such  patterns  normally  involve
703         nested indefinite repeats, for example: (a+)*\d when matched against  a         nested  indefinite repeats, for example: (a+)*\d when matched against a
704         line  of  a's  with  no  final  digit. The PCRE matching function has a         line of a's with no final digit.  The  PCRE  matching  function  has  a
705         resource limit that causes it to abort in these circumstances. If  this         resource  limit that causes it to abort in these circumstances. If this
706         happens, pcregrep outputs an error message and the line that caused the         happens, pcregrep outputs an error message and the line that caused the
707         problem to the standard error stream. If there are more  than  20  such         problem  to  the  standard error stream. If there are more than 20 such
708         errors, pcregrep gives up.         errors, pcregrep gives up.
709    
710         The  --match-limit  option  of  pcregrep can be used to set the overall         The --match-limit option of pcregrep can be used  to  set  the  overall
711         resource limit; there is a second option called --recursion-limit  that         resource  limit; there is a second option called --recursion-limit that
712         sets  a limit on the amount of memory (usually stack) that is used (see         sets a limit on the amount of memory (usually stack) that is used  (see
713         the discussion of these options above).         the discussion of these options above).
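
The way a call limit stops runaway backtracking can be shown with a toy matcher for the example pattern (a+)*\d. Everything in this Python sketch is hypothetical: toy_match and MatchLimitExceeded are invented names, and the naive recursion stands in for PCRE's real match() function, which is far more elaborate. The counter plays the role of --match-limit, and the recursion depth is the quantity that --recursion-limit would restrict.

```python
class MatchLimitExceeded(Exception):
    """Raised when the toy matcher makes more calls than allowed."""


def toy_match(subject, match_limit):
    """Naively match (a+)*\\d at the start of subject, counting calls."""
    calls = 0

    def match(pos):
        nonlocal calls
        calls += 1
        if calls > match_limit:
            raise MatchLimitExceeded(calls)
        # Find how far a run of a's extends from this position.
        end = pos
        while end < len(subject) and subject[end] == "a":
            end += 1
        # Try one more iteration of the group (a+), longest run first.
        for stop in range(end, pos, -1):
            if match(stop):
                return True
        # No further iterations: the next character must be a digit.
        return pos < len(subject) and subject[pos].isdigit()

    return match(0), calls


print(toy_match("aaa1", 1000))
try:
    toy_match("a" * 20, 10000)
except MatchLimitExceeded:
    print("match limit exceeded")
```

In this toy, a subject of n a's with no final digit needs 2 to the power n calls before it can report failure, so each extra character doubles the work. That growth is why a fixed limit is a practical safeguard.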
714    
715    
716  DIAGNOSTICS  DIAGNOSTICS
717    
718         Exit status is 0 if any matches were found, 1 if no matches were found,         Exit status is 0 if any matches were found, 1 if no matches were found,
719         and  2  for syntax errors, overlong lines, non-existent or inaccessible         and 2 for syntax errors, overlong lines, non-existent  or  inaccessible
720         files (even if matches were found in other files) or too many  matching         files  (even if matches were found in other files) or too many matching
721         errors. Using the -s option to suppress error messages about inaccessi-         errors. Using the -s option to suppress error messages about inaccessi-
722         ble files does not affect the return code.         ble files does not affect the return code.
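
A calling script might interpret these exit statuses as in the following Python sketch. The names EXIT_MEANINGS, describe_exit_status and run_quietly are hypothetical, and run_quietly assumes that a pcregrep executable is available on the search path.

```python
import subprocess

# Meanings taken from the exit statuses listed above.
EXIT_MEANINGS = {
    0: "at least one match was found",
    1: "no matches were found",
    2: "an error occurred (matches may still have been found in other files)",
}


def describe_exit_status(code):
    """Translate a pcregrep exit status into a short description."""
    return EXIT_MEANINGS.get(code, f"unexpected exit status {code}")


def run_quietly(pattern, paths):
    """Run pcregrep -q (assumed to be installed) and describe the result."""
    completed = subprocess.run(["pcregrep", "-q", pattern, *paths])
    return describe_exit_status(completed.returncode)


print(describe_exit_status(2))
```

Because a status of 2 is returned even when -s hides the messages, a script should not treat the absence of error output as proof that every file was scanned.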
723    
724    
725  SEE ALSO  SEE ALSO
726    
727         pcrepattern(3), pcretest(1).         pcrepattern(3), pcresyntax(3), pcretest(1).
728    
729    
730  AUTHOR  AUTHOR
# Line 666  AUTHOR Line 736  AUTHOR
736    
737  REVISION  REVISION
738    
739         Last updated: 04 March 2012         Last updated: 13 September 2012
740         Copyright (c) 1997-2012 University of Cambridge.         Copyright (c) 1997-2012 University of Cambridge.
