Diff of /code/trunk/doc/html/pcregrep.html

revision 285 by ph10, Tue Apr 17 08:22:40 2007 UTC revision 286 by ph10, Mon Dec 17 14:46:11 2007 UTC
# Line 15  man page, in case the conversion went wr Line 15  man page, in case the conversion went wr
15  <ul>  <ul>
16  <li><a name="TOC1" href="#SEC1">SYNOPSIS</a>  <li><a name="TOC1" href="#SEC1">SYNOPSIS</a>
17  <li><a name="TOC2" href="#SEC2">DESCRIPTION</a>  <li><a name="TOC2" href="#SEC2">DESCRIPTION</a>
18  <li><a name="TOC3" href="#SEC3">OPTIONS</a>  <li><a name="TOC3" href="#SEC3">SUPPORT FOR COMPRESSED FILES</a>
19  <li><a name="TOC4" href="#SEC4">ENVIRONMENT VARIABLES</a>  <li><a name="TOC4" href="#SEC4">OPTIONS</a>
20  <li><a name="TOC5" href="#SEC5">NEWLINES</a>  <li><a name="TOC5" href="#SEC5">ENVIRONMENT VARIABLES</a>
21  <li><a name="TOC6" href="#SEC6">OPTIONS COMPATIBILITY</a>  <li><a name="TOC6" href="#SEC6">NEWLINES</a>
22  <li><a name="TOC7" href="#SEC7">OPTIONS WITH DATA</a>  <li><a name="TOC7" href="#SEC7">OPTIONS COMPATIBILITY</a>
23  <li><a name="TOC8" href="#SEC8">MATCHING ERRORS</a>  <li><a name="TOC8" href="#SEC8">OPTIONS WITH DATA</a>
24  <li><a name="TOC9" href="#SEC9">DIAGNOSTICS</a>  <li><a name="TOC9" href="#SEC9">MATCHING ERRORS</a>
25  <li><a name="TOC10" href="#SEC10">SEE ALSO</a>  <li><a name="TOC10" href="#SEC10">DIAGNOSTICS</a>
26  <li><a name="TOC11" href="#SEC11">AUTHOR</a>  <li><a name="TOC11" href="#SEC11">SEE ALSO</a>
27  <li><a name="TOC12" href="#SEC12">REVISION</a>  <li><a name="TOC12" href="#SEC12">AUTHOR</a>
28    <li><a name="TOC13" href="#SEC13">REVISION</a>
29  </ul>  </ul>
30  <br><a name="SEC1" href="#TOC1">SYNOPSIS</a><br>  <br><a name="SEC1" href="#TOC1">SYNOPSIS</a><br>
31  <P>  <P>
# Line 47  without delimiters. For example: Line 48  without delimiters. For example:
48  </pre>  </pre>
49  If you attempt to use delimiters (for example, by surrounding a pattern with  If you attempt to use delimiters (for example, by surrounding a pattern with
50  slashes, as is common in Perl scripts), they are interpreted as part of the  slashes, as is common in Perl scripts), they are interpreted as part of the
51  pattern. Quotes can of course be used on the command line because they are  pattern. Quotes can of course be used to delimit patterns on the command line
52  interpreted by the shell, and indeed they are required if a pattern contains  because they are interpreted by the shell, and indeed they are required if a
53  white space or shell metacharacters.  pattern contains white space or shell metacharacters.
54  </P>  </P>
55  <P>  <P>
56  The first argument that follows any option settings is treated as the single  The first argument that follows any option settings is treated as the single
# Line 65  For example: Line 66  For example:
66  <pre>  <pre>
67    pcregrep some-pattern /file1 - /file3    pcregrep some-pattern /file1 - /file3
68  </pre>  </pre>
69  By default, each line that matches the pattern is copied to the standard  By default, each line that matches a pattern is copied to the standard
70  output, and if there is more than one file, the file name is output at the  output, and if there is more than one file, the file name is output at the
71  start of each line. However, there are options that can change how  start of each line, followed by a colon. However, there are options that can
72  <b>pcregrep</b> behaves. In particular, the <b>-M</b> option makes it possible to  change how <b>pcregrep</b> behaves. In particular, the <b>-M</b> option makes it
73  search for patterns that span line boundaries. What defines a line boundary is  possible to search for patterns that span line boundaries. What defines a line
74  controlled by the <b>-N</b> (<b>--newline</b>) option.  boundary is controlled by the <b>-N</b> (<b>--newline</b>) option.
75  </P>  </P>
76  <P>  <P>
77  Patterns are limited to 8K or BUFSIZ characters, whichever is the greater.  Patterns are limited to 8K or BUFSIZ characters, whichever is the greater.
78  BUFSIZ is defined in <b>&#60;stdio.h&#62;</b>.  BUFSIZ is defined in <b>&#60;stdio.h&#62;</b>. When there is more than one pattern
79    (specified by the use of <b>-e</b> and/or <b>-f</b>), each pattern is applied to
80    each line in the order in which they are defined, except that all the <b>-e</b>
81    patterns are tried before the <b>-f</b> patterns. As soon as one pattern matches
82    (or fails to match when <b>-v</b> is used), no further patterns are considered.
83    </P>
84    <P>
85    When <b>--only-matching</b>, <b>--file-offsets</b>, or <b>--line-offsets</b>
86    is used, the output is the part of the line that matched (either shown
87    literally, or as an offset). In this case, scanning resumes immediately
88    following the match, so that further matches on the same line can be found.
89    If there are multiple patterns, they are all tried on the remainder of the
90    line. However, patterns that follow the one that matched are not tried on the
91    earlier part of the line.
92  </P>  </P>
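
Reviewer note on the two paragraphs added above: the scanning rule for --only-matching with several patterns can be modelled roughly as follows. This is a minimal sketch in Python using the standard `re` module, not pcregrep's C implementation. The function name `only_matching` is hypothetical, and stepping one character past an empty match is an assumption of the sketch.

```python
import re

def only_matching(patterns, line):
    """Collect matched substrings the way the added text describes.

    Patterns are tried in order from the current position; the first one
    that matches wins, and scanning resumes right after that match.
    """
    compiled = [re.compile(p) for p in patterns]
    found = []
    pos = 0
    while pos <= len(line):
        match = None
        for rx in compiled:
            match = rx.search(line, pos)
            if match:
                break  # later patterns are not tried on this stretch
        if match is None:
            break
        found.append(match.group(0))
        # Resume after the match; step forward on an empty match
        # (assumption of this sketch) so the loop always terminates.
        pos = match.end() if match.end() > match.start() else match.end() + 1
    return found

print(only_matching(["b", "a"], "a b a b"))
print(only_matching(["a", "b"], "a b a b"))
```

With the patterns in the order `b`, `a`, the `a` characters are never reported, because `b` keeps matching first and the text before each match is not rescanned by the later pattern.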
93  <P>  <P>
94  If the <b>LC_ALL</b> or <b>LC_CTYPE</b> environment variable is set,  If the <b>LC_ALL</b> or <b>LC_CTYPE</b> environment variable is set,
95  <b>pcregrep</b> uses the value to set a locale when calling the PCRE library.  <b>pcregrep</b> uses the value to set a locale when calling the PCRE library.
96  The <b>--locale</b> option can be used to override this.  The <b>--locale</b> option can be used to override this.
97  </P>  </P>
98  <br><a name="SEC3" href="#TOC1">OPTIONS</a><br>  <br><a name="SEC3" href="#TOC1">SUPPORT FOR COMPRESSED FILES</a><br>
99    <P>
100    It is possible to compile <b>pcregrep</b> so that it uses <b>libz</b> or
101    <b>libbz2</b> to read files whose names end in <b>.gz</b> or <b>.bz2</b>,
102    respectively. You can find out whether your binary has support for one or both
103    of these file types by running it with the <b>--help</b> option. If the
104    appropriate support is not present, files are treated as plain text. The
105    standard input is always so treated.
106    </P>
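
Reviewer note: the suffix rule in the new section amounts to a dispatch on the file name, with a fallback to plain text when support was not compiled in. A rough Python sketch follows. The names `reader_kind` and `open_for_search` and the `have_gz` / `have_bz2` flags are hypothetical stand-ins for the build-time choice; pcregrep itself links against libz and libbz2 in C.

```python
import bz2
import gzip

def reader_kind(path, have_gz=True, have_bz2=True):
    """Decide how a file would be read, based on its name only."""
    if path.endswith(".gz") and have_gz:
        return "gzip"
    if path.endswith(".bz2") and have_bz2:
        return "bzip2"
    # No matching suffix, or support not present: treat as plain text.
    return "plain"

def open_for_search(path, have_gz=True, have_bz2=True):
    """Open a file for line-by-line scanning using the chosen reader."""
    kind = reader_kind(path, have_gz, have_bz2)
    if kind == "gzip":
        return gzip.open(path, "rt", encoding="latin-1")
    if kind == "bzip2":
        return bz2.open(path, "rt", encoding="latin-1")
    return open(path, "r", encoding="latin-1")
```

The standard input has no file name to inspect, which is consistent with the statement that it is always treated as plain text.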
107    <br><a name="SEC4" href="#TOC1">OPTIONS</a><br>
108  <P>  <P>
109  <b>--</b>  <b>--</b>
110  This terminate the list of options. It is useful if the next item on the  This terminate the list of options. It is useful if the next item on the
# Line 152  are read as if they were ordinary files. Line 175  are read as if they were ordinary files.
175  of reading a directory like this is an immediate end-of-file.  of reading a directory like this is an immediate end-of-file.
176  </P>  </P>
177  <P>  <P>
178  <b>-e</b> <i>pattern</i>, <b>--regex=</b><i>pattern</i>,  <b>-e</b> <i>pattern</i>, <b>--regex=</b><i>pattern</i>, <b>--regexp=</b><i>pattern</i>
179  <b>--regexp=</b><i>pattern</i> Specify a pattern to be matched. This option can  Specify a pattern to be matched. This option can be used multiple times in
180  be used multiple times in order to specify several patterns. It can also be  order to specify several patterns. It can also be used as a way of specifying a
181  used as a way of specifying a single pattern that starts with a hyphen. When  single pattern that starts with a hyphen. When <b>-e</b> is used, no argument
182  <b>-e</b> is used, no argument pattern is taken from the command line; all  pattern is taken from the command line; all arguments are treated as file
183  arguments are treated as file names. There is an overall maximum of 100  names. There is an overall maximum of 100 patterns. They are applied to each
184  patterns. They are applied to each line in the order in which they are defined  line in the order in which they are defined until one matches (or fails to
185  until one matches (or fails to match if <b>-v</b> is used). If <b>-f</b> is used  match if <b>-v</b> is used). If <b>-f</b> is used with <b>-e</b>, the command line
186  with <b>-e</b>, the command line patterns are matched first, followed by the  patterns are matched first, followed by the patterns from the file, independent
187  patterns from the file, independent of the order in which these options are  of the order in which these options are specified. Note that multiple use of
188  specified. Note that multiple use of <b>-e</b> is not the same as a single  <b>-e</b> is not the same as a single pattern with alternatives. For example,
189  pattern with alternatives. For example, X|Y finds the first character in a line  X|Y finds the first character in a line that is X or Y, whereas if the two
190  that is X or Y, whereas if the two patterns are given separately,  patterns are given separately, <b>pcregrep</b> finds X if it is present, even if
191  <b>pcregrep</b> finds X if it is present, even if it follows Y in the line. It  it follows Y in the line. It finds Y only if there is no X in the line. This
192  finds Y only if there is no X in the line. This really matters only if you are  really matters only if you are using <b>-o</b> to show the part(s) of the line
193  using <b>-o</b> to show the portion of the line that matched.  that matched.
194  </P>  </P>
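
Reviewer note: the X|Y example in the -e description can be checked with a small model. This is a sketch in Python's `re`, assuming its alternation behaves like PCRE's for these single-character patterns; the function names are hypothetical.

```python
import re

def first_match_alternation(line):
    """Single pattern X|Y: the leftmost X or Y in the line wins."""
    m = re.search("X|Y", line)
    return m.group(0) if m else None

def first_match_separate(patterns, line):
    """Separate patterns: each is tried on the whole line, in order."""
    for p in patterns:
        m = re.search(p, line)
        if m:
            return m.group(0)
    return None

line = "aYbX"
print(first_match_alternation(line))           # Y comes first in the line
print(first_match_separate(["X", "Y"], line))  # X is found even though it follows Y
```

Both forms select the same lines; they differ only in which part of the line is reported, which is why the text says it matters only with -o.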
195  <P>  <P>
196  <b>--exclude</b>=<i>pattern</i>  <b>--exclude</b>=<i>pattern</i>
# Line 195  present; they are tested before the file Line 218  present; they are tested before the file
218  is taken from the command line; all arguments are treated as file names. There  is taken from the command line; all arguments are treated as file names. There
219  is an overall maximum of 100 patterns. Trailing white space is removed from  is an overall maximum of 100 patterns. Trailing white space is removed from
220  each line, and blank lines are ignored. An empty file contains no patterns and  each line, and blank lines are ignored. An empty file contains no patterns and
221  therefore matches nothing.  therefore matches nothing. See also the comments about multiple patterns versus
222    a single pattern with alternatives in the description of <b>-e</b> above.
223    </P>
224    <P>
225    <b>--file-offsets</b>
226    Instead of showing lines or parts of lines that match, show each match as an
227    offset from the start of the file and a length, separated by a comma. In this
228    mode, no context is shown. That is, the <b>-A</b>, <b>-B</b>, and <b>-C</b>
229    options are ignored. If there is more than one match in a line, each of them is
230    shown separately. This option is mutually exclusive with <b>--line-offsets</b>
231    and <b>--only-matching</b>.
232  </P>  </P>
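
Reviewer note: the --file-offsets output format described above (offset from the start of the file, a comma, then the length) could be produced along these lines. This is a hedged sketch: the function name `file_offsets` is hypothetical, zero-based offsets are an assumption not stated in the text, lines are assumed to end in a single "\n", and offsets are counted in characters rather than bytes.

```python
import re

def file_offsets(pattern, text):
    """Return 'offset,length' strings for every match, scanning line by line."""
    rx = re.compile(pattern)
    out = []
    base = 0  # offset of the current line's first character within the file
    for line in text.split("\n"):
        for m in rx.finditer(line):
            out.append(f"{base + m.start()},{m.end() - m.start()}")
        base += len(line) + 1  # +1 for the newline removed by split()
    return out

print(file_offsets("ab", "xab\nabab\n"))
```

Each match on a line yields its own entry, matching the statement that multiple matches in a line are shown separately.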
233  <P>  <P>
234  <b>-H</b>, <b>--with-filename</b>  <b>-H</b>, <b>--with-filename</b>
# Line 215  name without a space. Line 248  name without a space.
248  </P>  </P>
249  <P>  <P>
250  <b>--help</b>  <b>--help</b>
251  Output a brief help message and exit.  Output a help message, giving brief details of the command options and file
252    type support, and then exit.
253  </P>  </P>
254  <P>  <P>
255  <b>-i</b>, <b>--ignore-case</b>  <b>-i</b>, <b>--ignore-case</b>
# Line 249  are being output. If not supplied, "(sta Line 283  are being output. If not supplied, "(sta
283  short form for this option.  short form for this option.
284  </P>  </P>
285  <P>  <P>
286    <b>--line-offsets</b>
287    Instead of showing lines or parts of lines that match, show each match as a
288    line number, the offset from the start of the line, and a length. The line
289    number is terminated by a colon (as usual; see the <b>-n</b> option), and the
290    offset and length are separated by a comma. In this mode, no context is shown.
291    That is, the <b>-A</b>, <b>-B</b>, and <b>-C</b> options are ignored. If there is
292    more than one match in a line, each of them is shown separately. This option is
293    mutually exclusive with <b>--file-offsets</b> and <b>--only-matching</b>.
294    </P>
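
Reviewer note: for --line-offsets the offset is relative to the line rather than the file, and the line number comes first. A small Python sketch of that format follows. The name `line_offsets` is hypothetical; the zero-based offset and the absence of a space after the colon are assumptions of the sketch.

```python
import re

def line_offsets(pattern, text):
    """Return 'line:offset,length' strings, one per match."""
    rx = re.compile(pattern)
    out = []
    for number, line in enumerate(text.split("\n"), start=1):
        for m in rx.finditer(line):
            # Line numbers start at 1; the offset is counted within the line.
            out.append(f"{number}:{m.start()},{m.end() - m.start()}")
    return out

print(line_offsets("ab", "xab\nabab\n"))
```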
295    <P>
296  <b>--locale</b>=<i>locale-name</i>  <b>--locale</b>=<i>locale-name</i>
297  This option specifies a locale to be used for pattern matching. It overrides  This option specifies a locale to be used for pattern matching. It overrides
298  the value in the <b>LC_ALL</b> or <b>LC_CTYPE</b> environment variables. If no  the value in the <b>LC_ALL</b> or <b>LC_CTYPE</b> environment variables. If no
# Line 293  being scanned does not agree with the co Line 337  being scanned does not agree with the co
337  <b>-n</b>, <b>--line-number</b>  <b>-n</b>, <b>--line-number</b>
338  Precede each output line by its line number in the file, followed by a colon  Precede each output line by its line number in the file, followed by a colon
339  and a space for matching lines or a hyphen and a space for context lines. If  and a space for matching lines or a hyphen and a space for context lines. If
340  the filename is also being output, it precedes the line number.  the filename is also being output, it precedes the line number. This option is
341    forced if <b>--line-offsets</b> is used.
342  </P>  </P>
343  <P>  <P>
344  <b>-o</b>, <b>--only-matching</b>  <b>-o</b>, <b>--only-matching</b>
345  Show only the part of the line that matched a pattern. In this mode, no  Show only the part of the line that matched a pattern. In this mode, no
346  context is shown. That is, the <b>-A</b>, <b>-B</b>, and <b>-C</b> options are  context is shown. That is, the <b>-A</b>, <b>-B</b>, and <b>-C</b> options are
347  ignored.  ignored. If there is more than one match in a line, each of them is shown
348    separately. If <b>-o</b> is combined with <b>-v</b> (invert the sense of the
349    match to find non-matching lines), no output is generated, but the return code
350    is set appropriately. This option is mutually exclusive with
351    <b>--file-offsets</b> and <b>--line-offsets</b>.
352  </P>  </P>
353  <P>  <P>
354  <b>-q</b>, <b>--quiet</b>  <b>-q</b>, <b>--quiet</b>
# Line 348  a line) and in addition, require them to Line 397  a line) and in addition, require them to
397  equivalent to having ^ and $ characters at the start and end of each  equivalent to having ^ and $ characters at the start and end of each
398  alternative branch in every pattern.  alternative branch in every pattern.
399  </P>  </P>
400  <br><a name="SEC4" href="#TOC1">ENVIRONMENT VARIABLES</a><br>  <br><a name="SEC5" href="#TOC1">ENVIRONMENT VARIABLES</a><br>
401  <P>  <P>
402  The environment variables <b>LC_ALL</b> and <b>LC_CTYPE</b> are examined, in that  The environment variables <b>LC_ALL</b> and <b>LC_CTYPE</b> are examined, in that
403  order, for a locale. The first one that is set is used. This can be overridden  order, for a locale. The first one that is set is used. This can be overridden
404  by the <b>--locale</b> option. If no locale is set, the PCRE library's default  by the <b>--locale</b> option. If no locale is set, the PCRE library's default
405  (usually the "C" locale) is used.  (usually the "C" locale) is used.
406  </P>  </P>
407  <br><a name="SEC5" href="#TOC1">NEWLINES</a><br>  <br><a name="SEC6" href="#TOC1">NEWLINES</a><br>
408  <P>  <P>
409  The <b>-N</b> (<b>--newline</b>) option allows <b>pcregrep</b> to scan files with  The <b>-N</b> (<b>--newline</b>) option allows <b>pcregrep</b> to scan files with
410  different newline conventions from the default. However, the setting of this  different newline conventions from the default. However, the setting of this
# Line 364  the standard error and output streams. I Line 413  the standard error and output streams. I
413  <b>printf()</b> calls to indicate newlines, relying on the C I/O library to  <b>printf()</b> calls to indicate newlines, relying on the C I/O library to
414  convert this to an appropriate sequence if the output is sent to a file.  convert this to an appropriate sequence if the output is sent to a file.
415  </P>  </P>
416  <br><a name="SEC6" href="#TOC1">OPTIONS COMPATIBILITY</a><br>  <br><a name="SEC7" href="#TOC1">OPTIONS COMPATIBILITY</a><br>
417  <P>  <P>
418  The majority of short and long forms of <b>pcregrep</b>'s options are the same  The majority of short and long forms of <b>pcregrep</b>'s options are the same
419  as in the GNU <b>grep</b> program. Any long option of the form  as in the GNU <b>grep</b> program. Any long option of the form
# Line 372  as in the GNU <b>grep</b> program. Any l Line 421  as in the GNU <b>grep</b> program. Any l
421  (PCRE terminology). However, the <b>--locale</b>, <b>-M</b>, <b>--multiline</b>,  (PCRE terminology). However, the <b>--locale</b>, <b>-M</b>, <b>--multiline</b>,
422  <b>-u</b>, and <b>--utf-8</b> options are specific to <b>pcregrep</b>.  <b>-u</b>, and <b>--utf-8</b> options are specific to <b>pcregrep</b>.
423  </P>  </P>
424  <br><a name="SEC7" href="#TOC1">OPTIONS WITH DATA</a><br>  <br><a name="SEC8" href="#TOC1">OPTIONS WITH DATA</a><br>
425  <P>  <P>
426  There are four different ways in which an option with data can be specified.  There are four different ways in which an option with data can be specified.
427  If a short form option is used, the data may follow immediately, or in the next  If a short form option is used, the data may follow immediately, or in the next
# Line 399  for which the data is optional. If this Line 448  for which the data is optional. If this
448  in the first form, using an equals character. Otherwise it will be assumed that  in the first form, using an equals character. Otherwise it will be assumed that
449  it has no data.  it has no data.
450  </P>  </P>
451  <br><a name="SEC8" href="#TOC1">MATCHING ERRORS</a><br>  <br><a name="SEC9" href="#TOC1">MATCHING ERRORS</a><br>
452  <P>  <P>
453  It is possible to supply a regular expression that takes a very long time to  It is possible to supply a regular expression that takes a very long time to
454  fail to match certain lines. Such patterns normally involve nested indefinite  fail to match certain lines. Such patterns normally involve nested indefinite
# Line 409  in these circumstances. If this happens, Line 458  in these circumstances. If this happens,
458  message and the line that caused the problem to the standard error stream. If  message and the line that caused the problem to the standard error stream. If
459  there are more than 20 such errors, <b>pcregrep</b> gives up.  there are more than 20 such errors, <b>pcregrep</b> gives up.
460  </P>  </P>
461  <br><a name="SEC9" href="#TOC1">DIAGNOSTICS</a><br>  <br><a name="SEC10" href="#TOC1">DIAGNOSTICS</a><br>
462  <P>  <P>
463  Exit status is 0 if any matches were found, 1 if no matches were found, and 2  Exit status is 0 if any matches were found, 1 if no matches were found, and 2
464  for syntax errors and non-existent or inacessible files (even if matches were  for syntax errors and non-existent or inacessible files (even if matches were
# Line 417  found in other files) or too many matchi Line 466  found in other files) or too many matchi
466  suppress error messages about inaccessble files does not affect the return  suppress error messages about inaccessble files does not affect the return
467  code.  code.
468  </P>  </P>
469  <br><a name="SEC10" href="#TOC1">SEE ALSO</a><br>  <br><a name="SEC11" href="#TOC1">SEE ALSO</a><br>
470  <P>  <P>
471  <b>pcrepattern</b>(3), <b>pcretest</b>(1).  <b>pcrepattern</b>(3), <b>pcretest</b>(1).
472  </P>  </P>
473  <br><a name="SEC11" href="#TOC1">AUTHOR</a><br>  <br><a name="SEC12" href="#TOC1">AUTHOR</a><br>
474  <P>  <P>
475  Philip Hazel  Philip Hazel
476  <br>  <br>
# Line 430  University Computing Service Line 479  University Computing Service
479  Cambridge CB2 3QH, England.  Cambridge CB2 3QH, England.
480  <br>  <br>
481  </P>  </P>
482  <br><a name="SEC12" href="#TOC1">REVISION</a><br>  <br><a name="SEC13" href="#TOC1">REVISION</a><br>
483  <P>  <P>
484  Last updated: 16 April 2007  Last updated: 17 December 2007
485  <br>  <br>
486  Copyright &copy; 1997-2007 University of Cambridge.  Copyright &copy; 1997-2007 University of Cambridge.
487  <br>  <br>
