How to Write grep Queries

Commands, Syntax, and Examples

The grep command can be viewed as a simplified or specialized database query, where the database consists of plain text files and each line represents a record. The grep command is used to retrieve those lines (records) from a file that match the regular expression specified as part of the command.

Let's say you have a list of people, one per line, given by their first, middle, and last names, and you want to find everyone who has the first name "Elvis", the last name "Travolta", and any middle name. For this task you could use the following regular expression as the search string:

Elvis .* Travolta

The period matches any single character, and the '*' (star) means: match the preceding item (in this case the '.') zero or more times, as many as are needed to make the regular expression match the line. If the star follows an expression enclosed in parentheses, that whole expression is repeated in the same way. In grep's basic syntax such a group is written with escaped parentheses, \( and \), while egrep accepts plain parentheses.
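To make the difference between a star after a single character and a star after a group concrete, here is a minimal sketch. The test strings and variable names are invented for this illustration, the anchors ^ and $ are added so that each pattern has to cover the whole line, and the -E flag (the extended syntax that egrep uses) is assumed so that plain parentheses form a group.

```shell
# Star after a single character: "b*" means zero or more "b".
# Of the four made-up lines, "ac", "abc" and "abbbc" match; "ababc" does not.
char_count=$(printf '%s\n' ac abc abbbc ababc | grep -c '^ab*c$')

# Star after a parenthesized group (extended syntax, -E):
# "(ab)*" means zero or more repetitions of "ab".
# "c", "abc" and "ababc" match; "abbc" does not.
group_count=$(printf '%s\n' c abc ababc abbc | grep -E -c '^(ab)*c$')

echo "single-character star: $char_count matching lines"
echo "group star:            $group_count matching lines"
```

Both counts come out as 3 here, but for different sets of lines, which is the point of the comparison.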

To illustrate the use of this regular expression as part of a grep command, let's assume the name of the file containing the list is guests.txt. Then the grep command would look like this:

grep 'Elvis .* Travolta' guests.txt
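To try the command without touching real data, the sketch below writes a small guest list to a temporary directory and runs the query against it. The names in the list and the variable names are invented for this example.

```shell
# Hypothetical sample data; the names are made up for illustration.
workdir=$(mktemp -d)
cat > "$workdir/guests.txt" <<'EOF'
Elvis Aaron Travolta
Elvis Presley
John Joseph Travolta
Elvis Lee Roy Travolta
Elvis Travolta
EOF

# Select lines containing "Elvis ", then anything, then " Travolta".
matches=$(grep 'Elvis .* Travolta' "$workdir/guests.txt")
printf '%s\n' "$matches"

# Remove the temporary files again.
rm -r "$workdir"
```

This prints "Elvis Aaron Travolta" and "Elvis Lee Roy Travolta". The line "Elvis Travolta" is not selected, because the pattern asks for two separate spaces and that line contains only one.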

The general syntax of the grep command is

grep flags regular-expression file-name

If you are only interested in the number of lines that match the specified regular expression, you can use the -c flag. For example,

grep -c 'Elvis .* Travolta' guests.txt 

This would tell you how many lines match, that is, how many Elvis Travoltas are on the guest list. With the -v flag you effectively query for the complement, that is, all lines that do not match the specified regular expression. For example,

grep -v 'Elvis .* Travolta' guests.txt 

You can combine flags (also called "options") by listing all the option letters after the dash as in this example:

grep -vc 'Elvis .* Travolta' guests.txt 
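The three flag variants can be compared side by side on one small file. This is only a sketch: the guest names and the variable names are assumptions made for the example.

```shell
# Hypothetical five-line guest list in a temporary directory.
workdir=$(mktemp -d)
cat > "$workdir/guests.txt" <<'EOF'
Elvis Aaron Travolta
Elvis Presley
John Joseph Travolta
Elvis Lee Roy Travolta
Mary Ann Brown
EOF

# -c: count the lines that match.
match_count=$(grep -c 'Elvis .* Travolta' "$workdir/guests.txt")

# -v: print the lines that do NOT match.
others=$(grep -v 'Elvis .* Travolta' "$workdir/guests.txt")

# -vc: count the lines that do NOT match.
other_count=$(grep -vc 'Elvis .* Travolta' "$workdir/guests.txt")

echo "matching lines:     $match_count"
echo "non-matching lines: $other_count"
printf '%s\n' "$others"

rm -r "$workdir"
```

With this input the two counts are 2 and 3, which add up to the five lines in the file.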

Frequently you don't know which of the letters of the words you are searching for are in upper case. With the "-i" flag you can make your query case insensitive, as in this example:

grep -i 'july.*2003' meetings.txt 
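The effect of "-i" is easiest to see by counting matches with and without it. The meeting entries and variable names below are invented for this sketch.

```shell
# Hypothetical meeting list with mixed capitalization.
workdir=$(mktemp -d)
cat > "$workdir/meetings.txt" <<'EOF'
july 9, 2003: offsite
JULY 14, 2003: budget review
July 2, 2003: kickoff
July 30, 2004: retrospective
June 3, 2003: planning
EOF

# Case-sensitive: only the all-lowercase "july" line matches.
sensitive=$(grep -c 'july.*2003' "$workdir/meetings.txt")

# Case-insensitive: "july", "JULY" and "July" all match.
insensitive=$(grep -ic 'july.*2003' "$workdir/meetings.txt")

echo "without -i: $sensitive"
echo "with -i:    $insensitive"

rm -r "$workdir"
```

Here the case-sensitive query finds 1 line and the case-insensitive query finds 3; the 2004 and June entries are rejected either way.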

You can apply your search to multiple files using wildcard characters in the file name specification. For example, the query

grep -i 'july.*2003' class*.txt 

will find all lines that match the query string in all files in the current directory whose names start with "class" and end with ".txt". When more than one file is searched, each output line is prefixed with the name of the file it came from.
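The following sketch shows what the prefixed output looks like. The file contents are made up, and notes.txt is included only to show that it is skipped because its name does not fit the wildcard pattern.

```shell
# Hypothetical files in a temporary directory.
workdir=$(mktemp -d)
printf 'July 7, 2003: algebra\nAugust 1, 2003: review\n' > "$workdir/class1.txt"
printf 'JULY 9, 2003: chemistry\n' > "$workdir/class2.txt"
printf 'July 7, 2003: not a class file\n' > "$workdir/notes.txt"

# Run inside the directory so the shell expands class*.txt there.
output=$(cd "$workdir" && grep -i 'july.*2003' class*.txt)
printf '%s\n' "$output"

rm -r "$workdir"
```

The output consists of the two lines "class1.txt:July 7, 2003: algebra" and "class2.txt:JULY 9, 2003: chemistry", with the file name and a colon in front of each matching line.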

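To include all subdirectories in the search, add recursion with the "-r" flag. A wildcard such as class*.txt is expanded by the shell against the current directory only, so give grep a starting directory (here ".") and, with GNU grep, select the files by name with the "--include" option:

grep -ri --include='class*.txt' 'july.*2003' .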

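It is also possible to explicitly exclude groups of files from the search. The following example applies the search to all files in the current directory and its subdirectories, except for files with the extension "doc". The argument of "--exclude" is a shell-style file name pattern, not a regular expression, and it is quoted so that the shell passes it to grep unchanged:

grep -ri 'july.*2003' --exclude='*.doc' *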
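The sketch below exercises recursion together with file name filters on a small directory tree. It assumes a grep that supports "--include" and "--exclude" (GNU grep does; these options are not part of POSIX) and treats their arguments as shell-style patterns such as '*.doc'. The directory layout, file contents, and variable names are invented for the example.

```shell
# Hypothetical directory tree with one subdirectory.
workdir=$(mktemp -d)
mkdir "$workdir/archive"
printf 'July 7, 2003: algebra\n'  > "$workdir/class1.txt"
printf 'July 8, 2003: reminder\n' > "$workdir/notes.txt"
printf 'july 21, 2003: history\n' > "$workdir/archive/class9.txt"
printf 'July 7, 2003: draft\n'    > "$workdir/archive/notes.doc"

# Search everything below the current directory except *.doc files.
# sort only makes the order of the output predictable.
skip_doc=$(cd "$workdir" && grep -ri --exclude='*.doc' 'july.*2003' . | sort)

# Search only files named class*.txt, at any depth.
only_class=$(cd "$workdir" && grep -ri --include='class*.txt' 'july.*2003' . | sort)

printf 'excluding *.doc:\n%s\n' "$skip_doc"
printf 'including class*.txt only:\n%s\n' "$only_class"

rm -r "$workdir"
```

The first query reports three files (everything except archive/notes.doc), the second reports only the two class files, including the one in the subdirectory.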

The "-l" option (lower case "L") lists only the names of the files that contain a line matching the query. The "-L" flag lists only the names of the files that do not contain such a line.
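A short sketch of the two listing flags, using three invented files of which two contain a match. The "-L" flag is assumed to be available, as it is in GNU grep.

```shell
# Hypothetical files in a temporary directory.
workdir=$(mktemp -d)
printf 'July 7, 2003: algebra\n'   > "$workdir/class1.txt"
printf 'August 1, 2003: review\n'  > "$workdir/class2.txt"
printf 'JULY 9, 2003: chemistry\n' > "$workdir/class3.txt"

# -l: names of files that contain at least one matching line.
with_match=$(cd "$workdir" && grep -il 'july.*2003' class*.txt)

# -L: names of files that contain no matching line.
without_match=$(cd "$workdir" && grep -iL 'july.*2003' class*.txt)

printf 'files with a match:\n%s\n' "$with_match"
printf 'files without a match:\n%s\n' "$without_match"

rm -r "$workdir"
```

The first list contains class1.txt and class3.txt, the second contains only class2.txt.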

Some versions of grep, such as egrep (equivalent to running grep with the "-E" flag), can process regular expressions with disjunctions (the logical 'or'). For example,

egrep 'Br(ow|au)n' guests.txt

will retrieve any lines that contain either "Brown" or "Braun". The vertical bar '|' and parentheses are used to list the alternative substrings.

The command sed is frequently used in combination with grep, as it allows the modification of the selected lines.
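As a sketch of both points, the example below selects lines with an alternation and then pipes them through sed to rewrite one spelling. It uses "grep -E", which accepts the same extended syntax as egrep; the guest names, the variable names, and the particular sed substitution are assumptions chosen for illustration.

```shell
# Hypothetical guest names, one per line.
guests=$(printf '%s\n' 'Anna Brown' 'Karl Braun' 'Lisa Bron' 'Tom Browning')

# Alternation: "Br", then either "ow" or "au", then "n".
# "Tom Browning" is selected too, because "Brown" occurs inside "Browning".
found=$(printf '%s\n' "$guests" | grep -E 'Br(ow|au)n')

# grep selects the lines, sed then modifies them:
# here every "Braun" is rewritten as "Brown".
normalized=$(printf '%s\n' "$guests" | grep -E 'Br(ow|au)n' | sed 's/Braun/Brown/')

printf 'selected:\n%s\n' "$found"
printf 'after sed:\n%s\n' "$normalized"
```

"Lisa Bron" is dropped by grep, and in the sed output "Karl Braun" appears as "Karl Brown" while the other selected lines are unchanged.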

©2014 About.com. All rights reserved.