Diff of /code/trunk/doc/pcregrep.txt

revision 87 by nigel, Sat Feb 24 21:41:21 2007 UTC revision 96 by nigel, Fri Mar 2 13:10:43 2007 UTC
# Line 14  DESCRIPTION Line 14  DESCRIPTION
14         pcregrep  searches  files  for  character  patterns, in the same way as         pcregrep  searches  files  for  character  patterns, in the same way as
15         other grep commands do, but it uses the PCRE regular expression library         other grep commands do, but it uses the PCRE regular expression library
16         to support patterns that are compatible with the regular expressions of         to support patterns that are compatible with the regular expressions of
17         Perl 5. See pcrepattern for a full description of syntax and  semantics         Perl 5. See pcrepattern(3) for a full description of syntax and  seman-
18         of the regular expressions that PCRE supports.         tics of the regular expressions that PCRE supports.
19    
20         Patterns,  whether  supplied on the command line or in a separate file,         Patterns,  whether  supplied on the command line or in a separate file,
21         are given without delimiters. For example:         are given without delimiters. For example:
# Line 44  DESCRIPTION Line 44  DESCRIPTION
44         dard  output, and if there is more than one file, the file name is out-         dard  output, and if there is more than one file, the file name is out-
45         put at the start of each line. However,  there  are  options  that  can         put at the start of each line. However,  there  are  options  that  can
46         change how pcregrep behaves. In particular, the -M option makes it pos-         change how pcregrep behaves. In particular, the -M option makes it pos-
47         sible to search for patterns that span line boundaries.         sible to search for patterns that span line boundaries. What defines  a
48           line boundary is controlled by the -N (--newline) option.
49    
50         Patterns are limited to 8K  or  BUFSIZ  characters,  whichever  is  the         Patterns  are  limited  to  8K  or  BUFSIZ characters, whichever is the
51         greater.  BUFSIZ is defined in <stdio.h>.         greater.  BUFSIZ is defined in <stdio.h>.
52    
53         If  the  LC_ALL  or LC_CTYPE environment variable is set, pcregrep uses         If the LC_ALL or LC_CTYPE environment variable is  set,  pcregrep  uses
54         the value to set a locale when calling the PCRE library.  The  --locale         the  value to set a locale when calling the PCRE library.  The --locale
55         option can be used to override this.         option can be used to override this.
56    
57    
58  OPTIONS  OPTIONS
59    
60         --        This  terminates the list of options. It is useful if the next         --        This terminates the list of options. It is useful if the  next
61                   item on the command line starts with a hyphen but is  not  an                   item  on  the command line starts with a hyphen but is not an
62                   option.  This allows for the processing of patterns and file-                   option. This allows for the processing of patterns and  file-
63                   names that start with hyphens.                   names that start with hyphens.
64    
65         -A number, --after-context=number         -A number, --after-context=number
66                   Output number lines of context after each matching  line.  If                   Output  number  lines of context after each matching line. If
67                   filenames and/or line numbers are being output, a hyphen sep-                   filenames and/or line numbers are being output, a hyphen sep-
68                   arator is used instead of a colon for the  context  lines.  A                   arator  is  used  instead of a colon for the context lines. A
69                   line  containing  "--" is output between each group of lines,                   line containing "--" is output between each group  of  lines,
70                   unless they are in fact contiguous in  the  input  file.  The                   unless  they  are  in  fact contiguous in the input file. The
71                   value  of number is expected to be relatively small. However,                   value of number is expected to be relatively small.  However,
72                   pcregrep guarantees to have up to 8K of following text avail-                   pcregrep guarantees to have up to 8K of following text avail-
73                   able for context output.                   able for context output.
74    
75         -B number, --before-context=number         -B number, --before-context=number
76                   Output  number lines of context before each matching line. If                   Output number lines of context before each matching line.  If
77                   filenames and/or line numbers are being output, a hyphen sep-                   filenames and/or line numbers are being output, a hyphen sep-
78                   arator  is  used  instead of a colon for the context lines. A                   arator is used instead of a colon for the  context  lines.  A
79                   line containing "--" is output between each group  of  lines,                   line  containing  "--" is output between each group of lines,
80                   unless  they  are  in  fact contiguous in the input file. The                   unless they are in fact contiguous in  the  input  file.  The
81                   value of number is expected to be relatively small.  However,                   value  of number is expected to be relatively small. However,
82                   pcregrep guarantees to have up to 8K of preceding text avail-                   pcregrep guarantees to have up to 8K of preceding text avail-
83                   able for context output.                   able for context output.
84    
85         -C number, --context=number         -C number, --context=number
86                   Output number lines of context both  before  and  after  each                   Output  number  lines  of  context both before and after each
87                   matching  line.  This is equivalent to setting both -A and -B                   matching line.  This is equivalent to setting both -A and  -B
88                   to the same value.                   to the same value.
89    
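The grouping rule for -A, -B and -C (a "--" line between groups unless they are contiguous in the input) can be sketched in Python as follows. This is only an illustration of the rule as described above, not pcregrep's implementation; the function name and its parameters are hypothetical, and the colon/hyphen separators for filenames and line numbers are left out.

```python
def grep_with_context(lines, is_match, before=0, after=0):
    """Return matching lines plus context, with "--" between non-contiguous groups."""
    keep = set()
    for i, line in enumerate(lines):
        if is_match(line):
            first = max(0, i - before)
            last = min(len(lines) - 1, i + after)
            keep.update(range(first, last + 1))

    out = []
    prev = None
    for i in sorted(keep):
        # A separator is emitted only when there is a gap in the input.
        if prev is not None and i != prev + 1:
            out.append("--")
        out.append(lines[i])
        prev = i
    return out
```

Setting before and after to the same value corresponds to -C.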
90         -c, --count         -c, --count
91                   Do not output individual lines; instead just output  a  count                   Do  not  output individual lines; instead just output a count
92                   of the number of lines that would otherwise have been output.                   of the number of lines that would otherwise have been output.
93                   If several files are given, a count is  output  for  each  of                   If  several  files  are  given, a count is output for each of
94                   them. In this mode, the -A, -B, and -C options are ignored.                   them. In this mode, the -A, -B, and -C options are ignored.
95    
96         --colour, --color         --colour, --color
97                   If this option is given without any data, it is equivalent to                   If this option is given without any data, it is equivalent to
98                   "--colour=auto".  If data is required, it must  be  given  in                   "--colour=auto".   If  data  is required, it must be given in
99                   the same shell item, separated by an equals sign.                   the same shell item, separated by an equals sign.
100    
101         --colour=value, --color=value         --colour=value, --color=value
102                   This  option specifies under what circumstances the part of a                   This option specifies under what circumstances the part of  a
103                   line that matched a pattern should be coloured in the output.                   line that matched a pattern should be coloured in the output.
104                   The  value may be "never" (the default), "always", or "auto".                   The value may be "never" (the default), "always", or  "auto".
105                   In the latter case, colouring happens only  if  the  standard                   In  the  latter  case, colouring happens only if the standard
106                   output  is  connected to a terminal. The colour can be speci-                   output is connected to a terminal. The colour can  be  speci-
107                   fied by setting the environment variable  PCREGREP_COLOUR  or                   fied  by  setting the environment variable PCREGREP_COLOUR or
108                   PCREGREP_COLOR. The value of this variable should be a string                   PCREGREP_COLOR. The value of this variable should be a string
109                   of two numbers, separated by a semicolon.   They  are  copied                   of  two  numbers,  separated by a semicolon.  They are copied
110                   directly into the control string for setting colour on a ter-                   directly into the control string for setting colour on a ter-
111                   minal, so it is your responsibility to ensure that they  make                   minal,  so it is your responsibility to ensure that they make
112                   sense.  If  neither  of the environment variables is set, the                   sense. If neither of the environment variables  is  set,  the
113                   default is "1;31", which gives red.                   default is "1;31", which gives red.
114    
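As a rough sketch of how the two numbers end up in a terminal control string, the Python below builds an ANSI SGR sequence around the matched part of a line. It assumes the usual ESC[<params>m form and a plain ESC[0m reset, and it assumes PCREGREP_COLOUR is consulted before PCREGREP_COLOR; none of these details are stated in the text above, and the function names are hypothetical.

```python
import os

DEFAULT_COLOUR = "1;31"  # documented default, which gives red


def colour_params(env=None):
    """Pick the colour parameters from the environment, or use the default."""
    env = os.environ if env is None else env
    # Assumption: the British spelling is checked first.
    return env.get("PCREGREP_COLOUR") or env.get("PCREGREP_COLOR") or DEFAULT_COLOUR


def highlight(line, start, end, params):
    """Wrap line[start:end] in an SGR sequence; params is copied in unchecked."""
    # Assumption: ESC [ params m switches colour on, ESC [ 0 m resets it.
    return line[:start] + "\x1b[" + params + "m" + line[start:end] + "\x1b[0m" + line[end:]
```

Because the value is copied verbatim, a malformed setting such as "red" would produce a sequence the terminal does not understand, which is why the text above leaves validation to the user.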
115         -D action, --devices=action         -D action, --devices=action
116                   If an input path is  not  a  regular  file  or  a  directory,                   If  an  input  path  is  not  a  regular file or a directory,
117                   "action"  specifies  how  it is to be processed. Valid values                   "action" specifies how it is to be  processed.  Valid  values
118                   are "read" (the default) or "skip" (silently skip the  path).                   are  "read" (the default) or "skip" (silently skip the path).
119    
120         -d action, --directories=action         -d action, --directories=action
121                   If an input path is a directory, "action" specifies how it is                   If an input path is a directory, "action" specifies how it is
122                   to be processed.  Valid  values  are  "read"  (the  default),                   to  be  processed.   Valid  values  are "read" (the default),
123                   "recurse"  (equivalent to the -r option), or "skip" (silently                   "recurse" (equivalent to the -r option), or "skip"  (silently
124                   skip the path). In the default case, directories are read  as                   skip  the path). In the default case, directories are read as
125                   if  they  were  ordinary files. In some operating systems the                   if they were ordinary files. In some  operating  systems  the
126                   effect of reading a directory like this is an immediate  end-                   effect  of reading a directory like this is an immediate end-
127                   of-file.                   of-file.
128    
129         -e pattern, --regex=pattern,         -e pattern, --regex=pattern,
130                   --regexp=pattern Specify a pattern to be matched. This option                   --regexp=pattern Specify a pattern to be matched. This option
131                   can be used multiple times in order to specify  several  pat-                   can  be  used multiple times in order to specify several pat-
132                   terns.  It  can  also be used as a way of specifying a single                   terns. It can also be used as a way of  specifying  a  single
133                   pattern that starts with a hyphen. When -e is used, no  argu-                   pattern  that starts with a hyphen. When -e is used, no argu-
134                   ment  pattern  is  taken from the command line; all arguments                   ment pattern is taken from the command  line;  all  arguments
135                   are treated as file names. There is an overall maximum of 100                   are treated as file names. There is an overall maximum of 100
136                   patterns. They are applied to each line in the order in which                   patterns. They are applied to each line in the order in which
137                   they are defined until one matches (or fails to match  if  -v                   they  are  defined until one matches (or fails to match if -v
138                   is  used).  If  -f is used with -e, the command line patterns                   is used). If -f is used with -e, the  command  line  patterns
139                   are matched first, followed by the patterns  from  the  file,                   are  matched  first,  followed by the patterns from the file,
140                   independent  of  the  order in which these options are speci-                   independent of the order in which these  options  are  speci-
141                   fied. Note that multiple use of -e is not the same as a  sin-                   fied.  Note that multiple use of -e is not the same as a sin-
142                   gle  pattern  with  alternatives.  For example, X|Y finds the                   gle pattern with alternatives. For  example,  X|Y  finds  the
143                   first character in a line that is X or Y, whereas if the  two                   first  character in a line that is X or Y, whereas if the two
144                   patterns  are  given  separately,  pcregrep  finds X if it is                   patterns are given separately, pcregrep  finds  X  if  it  is
145                   present, even if it follows Y in the line. It finds Y only if                   present, even if it follows Y in the line. It finds Y only if
146                   there  is  no  X in the line. This really matters only if you                   there is no X in the line. This really matters  only  if  you
147                   are using -o to show the portion of the line that matched.                   are using -o to show the portion of the line that matched.
148    
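The difference between one pattern with alternatives and several -e patterns can be illustrated with Python's re module standing in for PCRE (an assumption made for the sake of a runnable example; the helper names are hypothetical). The second function tries the patterns in the order given and stops at the first one that matches, as described above.

```python
import re


def match_alternation(pattern, line):
    """Single pattern: report the leftmost match in the line."""
    m = re.search(pattern, line)
    return m.group(0) if m else None


def match_separate(patterns, line):
    """Several patterns: report the match of the first pattern that matches at all."""
    for p in patterns:
        m = re.search(p, line)
        if m:
            return m.group(0)
    return None
```

For the line "aYbX", the single pattern X|Y reports Y (the first such character in the line), while the separate patterns X and Y report X, because X is tried first and is present.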
149         --exclude=pattern         --exclude=pattern
150                   When pcregrep is searching the files in a directory as a con-                   When pcregrep is searching the files in a directory as a con-
151                   sequence of the -r (recursive search) option, any files whose                   sequence of the -r (recursive search) option, any files whose
152                   names match the pattern are excluded. The pattern is  a  PCRE                   names  match  the pattern are excluded. The pattern is a PCRE
153                   regular expression. If a file name matches both --include and                   regular expression. If a file name matches both --include and
154                   --exclude, it is excluded. There is no short  form  for  this                   --exclude,  it  is  excluded. There is no short form for this
155                   option.                   option.
156    
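The precedence between --include and --exclude amounts to a small predicate. The sketch below uses Python's re module in place of PCRE and assumes an unanchored match against the file name; both are assumptions, and the function name is hypothetical.

```python
import re


def is_searched(filename, include=None, exclude=None):
    """Decide whether a file found during a recursive search is scanned."""
    # Exclusion wins when a name matches both patterns.
    if exclude is not None and re.search(exclude, filename):
        return False
    # With an include pattern, only matching names are scanned.
    if include is not None and not re.search(include, filename):
        return False
    return True
```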
157         -F, --fixed-strings         -F, --fixed-strings
158                   Interpret  each pattern as a list of fixed strings, separated                   Interpret each pattern as a list of fixed strings,  separated
159                   by newlines, instead of  as  a  regular  expression.  The  -w                   by  newlines,  instead  of  as  a  regular expression. The -w
160                   (match  as  a  word) and -x (match whole line) options can be                   (match as a word) and -x (match whole line)  options  can  be
161                   used with -F. They apply to each of the fixed strings. A line                   used with -F. They apply to each of the fixed strings. A line
162                   is selected if any of the fixed strings are found in it (sub-                   is selected if any of the fixed strings are found in it (sub-
163                   ject to -w or -x, if present).                   ject to -w or -x, if present).
164    
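A minimal Python sketch of the -F selection rule follows. It assumes that -w can be approximated by requiring word boundaries on both sides of the string and that -x means the whole line equals the string; the source does not spell out either detail, and the function name is hypothetical.

```python
import re


def fixed_strings_match(pattern, line, word=False, whole_line=False):
    """Return True if any newline-separated fixed string selects the line."""
    for s in pattern.split("\n"):
        if whole_line:
            if line == s:
                return True
        elif word:
            # re.escape keeps characters such as "." literal.
            if re.search(r"\b" + re.escape(s) + r"\b", line):
                return True
        elif s in line:
            return True
    return False
```

Note that "a.c" as a fixed string does not select a line containing "abc", whereas the same text used as a regular expression would.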
165         -f filename, --file=filename         -f filename, --file=filename
166                   Read a number of patterns from the file, one  per  line,  and                   Read  a  number  of patterns from the file, one per line, and
167                   match  them against each line of input. A data line is output                   match them against each line of input. A data line is  output
168                   if any of the patterns match it. The filename can be given as                   if any of the patterns match it. The filename can be given as
169                   "-" to refer to the standard input. When -f is used, patterns                   "-" to refer to the standard input. When -f is used, patterns
170                   specified on the command line using -e may also  be  present;                   specified  on  the command line using -e may also be present;
171                   they are tested before the file's patterns. However, no other                   they are tested before the file's patterns. However, no other
172                   pattern is taken from the command  line;  all  arguments  are                   pattern  is  taken  from  the command line; all arguments are
173                   treated  as  file  names.  There is an overall maximum of 100                   treated as file names. There is an  overall  maximum  of  100
174                   patterns. Trailing white space is removed from each line, and                   patterns. Trailing white space is removed from each line, and
175                   blank  lines  are ignored. An empty file contains no patterns                   blank lines are ignored. An empty file contains  no  patterns
176                   and therefore matches nothing.                   and therefore matches nothing.
177    
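The rules for reading a pattern file (one pattern per line, trailing white space removed, blank lines ignored, at most 100 patterns overall) might be sketched like this in Python. The function name, the already_defined parameter (standing for patterns given with -e) and the choice of exception are illustrative assumptions.

```python
MAX_PATTERNS = 100  # overall maximum stated in the text above


def read_patterns(text, already_defined=0):
    """Extract patterns from the contents of a pattern file."""
    patterns = []
    for raw in text.split("\n"):
        pattern = raw.rstrip()  # trailing white space is removed
        if not pattern:
            continue  # blank lines are ignored
        if already_defined + len(patterns) >= MAX_PATTERNS:
            raise ValueError("too many patterns (maximum %d)" % MAX_PATTERNS)
        patterns.append(pattern)
    return patterns
```

An empty file yields an empty list, which matches nothing.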
178         -H, --with-filename         -H, --with-filename
179                   Force the inclusion of the filename at the  start  of  output                   Force  the  inclusion  of the filename at the start of output
180                   lines  when searching a single file. By default, the filename                   lines when searching a single file. By default, the  filename
181                   is not shown in this case. For matching lines,  the  filename                   is  not  shown in this case. For matching lines, the filename
182                   is  followed  by  a  colon  and a space; for context lines, a                   is followed by a colon and a  space;  for  context  lines,  a
183                   hyphen separator is used. If a line number is also being out-                   hyphen separator is used. If a line number is also being out-
184                   put, it follows the file name without a space.                   put, it follows the file name without a space.
185    
186         -h, --no-filename         -h, --no-filename
187                   Suppress  the output filenames when searching multiple files.                   Suppress the output filenames when searching multiple  files.
188                   By default, filenames  are  shown  when  multiple  files  are                   By  default,  filenames  are  shown  when  multiple files are
189                   searched.  For  matching lines, the filename is followed by a                   searched. For matching lines, the filename is followed  by  a
190                   colon and a space; for context lines, a hyphen  separator  is                   colon  and  a space; for context lines, a hyphen separator is
191                   used.  If  a line number is also being output, it follows the                   used. If a line number is also being output, it  follows  the
192                   file name without a space.                   file name without a space.
193    
194         --help    Output a brief help message and exit.         --help    Output a brief help message and exit.
# Line 197  OPTIONS Line 198  OPTIONS
198    
199         --include=pattern         --include=pattern
200                   When pcregrep is searching the files in a directory as a con-                   When pcregrep is searching the files in a directory as a con-
201                   sequence  of  the  -r  (recursive  search) option, only those                   sequence of the -r  (recursive  search)  option,  only  those
202                   files whose names match the pattern are included. The pattern                   files whose names match the pattern are included. The pattern
203                   is  a  PCRE  regular  expression. If a file name matches both                   is a PCRE regular expression. If a  file  name  matches  both
204                   --include and --exclude, it is excluded. There  is  no  short                   --include  and  --exclude,  it is excluded. There is no short
205                   form for this option.                   form for this option.
206    
207         -L, --files-without-match         -L, --files-without-match
208                   Instead  of  outputting lines from the files, just output the                   Instead of outputting lines from the files, just  output  the
209                   names of the files that do not contain any lines  that  would                   names  of  the files that do not contain any lines that would
210                   have  been  output. Each file name is output once, on a sepa-                   have been output. Each file name is output once, on  a  sepa-
211                   rate line.                   rate line.
212    
213         -l, --files-with-matches         -l, --files-with-matches
214                   Instead of outputting lines from the files, just  output  the                   Instead  of  outputting lines from the files, just output the
215                   names of the files containing lines that would have been out-                   names of the files containing lines that would have been out-
216                   put. Each file name is  output  once,  on  a  separate  line.                   put.  Each  file  name  is  output  once, on a separate line.
217                   Searching  stops  as  soon  as  a matching line is found in a                   Searching stops as soon as a matching  line  is  found  in  a
218                   file.                   file.
219    
220         --label=name         --label=name
# Line 222  OPTIONS Line 223  OPTIONS
223                   input)" is used. There is no short form for this option.                   input)" is used. There is no short form for this option.
224    
225         --locale=locale-name         --locale=locale-name
226                   This option specifies a locale to be used for pattern  match-                   This  option specifies a locale to be used for pattern match-
227                   ing.  It  overrides the value in the LC_ALL or LC_CTYPE envi-                   ing. It overrides the value in the LC_ALL or  LC_CTYPE  envi-
228                   ronment variables.  If  no  locale  is  specified,  the  PCRE                   ronment  variables.  If  no  locale  is  specified,  the PCRE
229                   library's  default (usually the "C" locale) is used. There is                   library's default (usually the "C" locale) is used. There  is
230                   no short form for this option.                   no short form for this option.
231    
232         -M, --multiline         -M, --multiline
233                   Allow patterns to match more than one line. When this  option                   Allow  patterns to match more than one line. When this option
234                   is given, patterns may usefully contain literal newline char-                   is given, patterns may usefully contain literal newline char-
235                   acters and internal occurrences of ^ and  $  characters.  The                   acters  and  internal  occurrences of ^ and $ characters. The
236                   output  for  any one match may consist of more than one line.                   output for any one match may consist of more than  one  line.
237                   When this option is set, the PCRE library is called in  "mul-                   When  this option is set, the PCRE library is called in "mul-
238                   tiline"  mode.   There is a limit to the number of lines that                   tiline" mode.  There is a limit to the number of  lines  that
239                   can be matched, imposed by the way that pcregrep buffers  the                   can  be matched, imposed by the way that pcregrep buffers the
240                   input  file as it scans it. However, pcregrep ensures that at                   input file as it scans it. However, pcregrep ensures that  at
241                   least 8K characters or the rest of the document (whichever is                   least 8K characters or the rest of the document (whichever is
242                   the  shorter)  are  available for forward matching, and simi-                   the shorter) are available for forward  matching,  and  simi-
243                   larly the previous 8K characters (or all the previous charac-                   larly the previous 8K characters (or all the previous charac-
244                   ters,  if  fewer  than 8K) are guaranteed to be available for                   ters, if fewer than 8K) are guaranteed to  be  available  for
245                   lookbehind assertions.                   lookbehind assertions.
246    
247           -N newline-type, --newline=newline-type
248                     The  PCRE  library  supports  four  different conventions for
249                     indicating the ends of lines. They are  the  single-character
250                     sequences  CR  (carriage  return) and LF (linefeed), the two-
251                     character sequence CRLF, and an "any"  convention,  in  which
252                     any  Unicode  line  ending sequence is assumed to end a line.
253                     The Unicode sequences are the three just mentioned,  plus  VT
254                     (vertical  tab,  U+000B),  FF  (formfeed,  U+000C), NEL (next
255                     line, U+0085), LS (line separator, U+2028), and PS (paragraph
256                     separator, U+2029).
257    
258                     When  the  PCRE  library  is  built,  a  default  line-ending
259                     sequence  is  specified.   This  is  normally  the   standard
260                     sequence for the operating system. Unless otherwise specified
261                     by this option, pcregrep uses  the  library's  default.   The
262                     possible  values  for  this  option are CR, LF, CRLF, or ANY.
263                     This makes it possible to use pcregrep  on  files  that  have
264                     come  from  other environments without having to modify their
265                     line endings. If the data that  is  being  scanned  does  not
266                     agree  with  the  convention set by this option, pcregrep may
267                     behave in strange ways.
268    
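To make the four conventions concrete, the Python sketch below splits a string into lines under each of them. It only illustrates what counts as a line ending; it is not how pcregrep or the PCRE library scans its buffer, and the names are hypothetical. In the "ANY" case CRLF is tried first so that it counts as a single line ending.

```python
import re

# Regular expressions for each value of the newline option.
NEWLINE_PATTERNS = {
    "CR": r"\r",
    "LF": r"\n",
    "CRLF": r"\r\n",
    # CR, LF, CRLF, VT, FF, NEL (U+0085), LS (U+2028), PS (U+2029)
    "ANY": r"\r\n|[\n\x0b\x0c\r\x85\u2028\u2029]",
}


def split_lines(text, newline_type="LF"):
    """Split text into lines according to the chosen newline convention."""
    return re.split(NEWLINE_PATTERNS[newline_type], text)
```

Splitting CRLF data with the LF convention leaves a stray CR at the end of every line, which is one way the mismatch mentioned above can show up.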
269         -n, --line-number         -n, --line-number
270                   Precede each output line by its line number in the file, fol-                   Precede each output line by its line number in the file, fol-
271                   lowed  by  a colon and a space for matching lines or a hyphen                   lowed  by  a colon and a space for matching lines or a hyphen
# Line 305  ENVIRONMENT VARIABLES Line 328  ENVIRONMENT VARIABLES
328         library's default (usually the "C" locale) is used.         library's default (usually the "C" locale) is used.
329    
330    
331    NEWLINES
332    
333           The -N (--newline) option allows pcregrep to scan files with  different
334           newline  conventions  from  the  default.  However, the setting of this
335           option does not affect the way in which pcregrep writes information  to
336           the  standard  error  and  output streams. It uses the string "\n" in C
337           printf() calls to indicate newlines, relying on the C  I/O  library  to
338           convert  this  to  an  appropriate  sequence if the output is sent to a
339           file.
340    
341    
342  OPTIONS COMPATIBILITY  OPTIONS COMPATIBILITY
343    
344         The majority of short and long forms of pcregrep's options are the same         The majority of short and long forms of pcregrep's options are the same
# Line 362  DIAGNOSTICS Line 396  DIAGNOSTICS
396         not affect the return code.         not affect the return code.
397    
398    
399    SEE ALSO
400    
401           pcrepattern(3), pcretest(1).
402    
403    
404  AUTHOR  AUTHOR
405    
406         Philip Hazel         Philip Hazel
407         University Computing Service         University Computing Service
408         Cambridge CB2 3QG, England.         Cambridge CB2 3QH, England.
409    
410  Last updated: 23 January 2006  Last updated: 29 November 2006
411  Copyright (c) 1997-2006 University of Cambridge.  Copyright (c) 1997-2006 University of Cambridge.
