The so-called wildcard characters allow you to perform a rudimentary type of regular expression matching. For example, at the Bash shell prompt you can type "ls *.txt", which will list all files whose names end with ".txt", because the "*" matches any string of characters.
Regular expressions expand on this idea and enable you to specify almost any imaginable constraint on a sequence of characters.
Regular expressions are used in various commands (such as awk and sed), software utilities, and programming languages. The syntax and functionality may vary, but the basic concepts are the same. The programming language Perl provides one of the most powerful implementations of regular expressions.
Let's start by reviewing regular expressions in the GNU/Linux utility grep, which retrieves those lines from a file that match the specified regular expression.
Let's say you have a list of persons specified by their first, middle, and last names, and want to find all individuals with the first name "John", the last name "Travolta", and any middle name. For this task you could use the following regular expression as the search string:
John .* Travolta
If the file name is guests.txt, the corresponding grep command would look like this:
grep 'John .* Travolta' guests.txt
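Here the period matches any single character, and the star allows it to be repeated any number of times. If the list contains only a one-character middle initial, a single period is enough:
grep 'John . Travolta' guests.txt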
If the middle initial is followed by an actual period, you need to put a backslash in front of that period so that it is matched literally instead of being interpreted with its special regular expression meaning:
grep 'John .\. Travolta' guests.txt
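To compare the three patterns side by side, the following sketch runs them against a small made-up guest list. The names and the temporary file are assumptions for illustration only, since the actual contents of guests.txt are not given here.

```shell
# Hypothetical sample data; the real guests.txt is not shown in the text.
guests=$(mktemp)
printf '%s\n' \
  'John Joseph Travolta' \
  'John J Travolta' \
  'John J. Travolta' \
  'John Travolta' \
  'Jane Q. Public' > "$guests"

# '.*' matches any run of characters, so all three lines with a middle part match.
any_middle=$(grep 'John .* Travolta' "$guests")

# '.' matches exactly one character, so only the bare initial "J" fits.
one_char=$(grep 'John . Travolta' "$guests")

# '.\.' matches one arbitrary character followed by a literal period.
initial_dot=$(grep 'John .\. Travolta' "$guests")

printf '%s\n' "$any_middle" '---' "$one_char" '---' "$initial_dot"
rm -f "$guests"
```

Note that "John Travolta" without any middle part is not found by any of the patterns, because each of them requires two separate spaces.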
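The characters '^' (caret) and '$' (dollar) anchor the regular expression to the start or the end of the line, respectively. For example, the first command below finds the lines that start with "John", and the second finds the lines that end with "Smith":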
grep '^John' guests.txt
grep 'Smith$' guests.txt
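The effect of the two anchors can be checked with a short sketch like the one below. The sample lines are invented for illustration, and the unanchored count is included only for comparison.

```shell
# Hypothetical sample data for illustration.
guests=$(mktemp)
printf '%s\n' \
  'John Smith' \
  'Mary Johnson' \
  'Smith John' \
  'Will Smithers' > "$guests"

# '^John' matches only where "John" is at the start of the line.
starts_with_john=$(grep '^John' "$guests")

# 'Smith$' matches only where "Smith" is at the end of the line.
ends_with_smith=$(grep 'Smith$' "$guests")

# Without an anchor, "John" is found anywhere in the line (-c counts matching lines).
john_anywhere=$(grep -c 'John' "$guests")

echo "$starts_with_john"
echo "$ends_with_smith"
echo "$john_anywhere"
rm -f "$guests"
```

With this sample data, each anchored pattern selects only "John Smith", while the unanchored pattern also finds "Mary Johnson" and "Smith John".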
As mentioned above, you can use the backslash '\' to prevent any such special character from being interpreted as such. For example, if you want to match an actual '$' in the input file, you would write it as '\$' in the regular expression.
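As a small sketch of this escaping rule, the following commands search a made-up price list (the file contents are an assumption for illustration). The single quotes matter as well, because they keep the shell itself from interpreting the '$'.

```shell
# Hypothetical price list for illustration.
prices=$(mktemp)
printf '%s\n' \
  'coffee $3' \
  'tea 3 dollars' \
  'cake $12' > "$prices"

# '\$' matches a literal dollar sign anywhere in the line (-c counts matching lines).
with_dollar=$(grep -c '\$' "$prices")

# An unescaped '$' at the end of the pattern is the end-of-line anchor instead.
ends_in_3=$(grep '3$' "$prices")

echo "$with_dollar"
echo "$ends_in_3"
rm -f "$prices"
```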
Another useful syntactic element is the pair of square brackets '[' and ']', which list a specific set of characters such that any one character in the set may match a single character in the line being searched. For example, if you are searching for the name "Berkeley", but you are not sure whether it is spelled with a "k" or a "c", you can use the regular expression:
Ber[kc]eley
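A quick way to try this pattern is to pipe a few candidate spellings into grep, as in the sketch below. The list of spellings is made up for illustration.

```shell
# Hypothetical spellings for illustration.
spellings=$(printf '%s\n' 'Berkeley' 'Berceley' 'Berkley' 'Berzeley')

# '[kc]' accepts exactly one character, which must be either "k" or "c".
bracket_matches=$(printf '%s\n' "$spellings" | grep 'Ber[kc]eley')

echo "$bracket_matches"
```

Only "Berkeley" and "Berceley" are selected: "Berkley" lacks the "e" after the bracketed character, and "z" is not in the set.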
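To match one of several alternative strings rather than single characters, such as "Brown" or "Braun", you can group the alternatives in parentheses, separate them with '|', and use egrep (extended grep):
egrep 'Br(ow|au)n' guests.txt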
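The alternation used in the egrep command above can be tried on a few made-up names, as in the following sketch. It uses grep -E, which accepts the same extended syntax as egrep; the sample names are assumptions for illustration.

```shell
# Hypothetical names for illustration.
names=$(printf '%s\n' 'Brown' 'Braun' 'Brawn' 'Bruin')

# '(ow|au)' accepts either the two characters "ow" or the two characters "au".
alt_matches=$(printf '%s\n' "$names" | grep -E 'Br(ow|au)n')

echo "$alt_matches"
```

Only "Brown" and "Braun" are selected, since neither "aw" nor "ui" is one of the listed alternatives.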
Finally, instead of the star '*' you can use the notation {x,y} to specify how many times an expression may be repeated. For example, {3,7} means that the preceding character may occur 3 to 7 times in a row. With plain grep, the curly brackets '{' and '}' need to be escaped with the backslash '\', whereas egrep accepts them unescaped. For example, the following command matches a "B" followed by 3 to 7 occurrences of "r":
grep 'Br\{3,7\}' file1
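The sketch below tries this repetition count on a few made-up strings, using both the escaped form for plain grep and the unescaped form for grep -E. It also adds an anchored variant, which is an addition for illustration: without '^' and '$', a line with more than seven "r" characters still matches, because it contains a shorter run that fits.

```shell
# Hypothetical sample strings; the last one has more than seven r's.
samples=$(printf '%s\n' 'Br' 'Brr' 'Brrr' 'Brrrrr' 'Brrrrrrrrr')

# Basic syntax (plain grep): the braces are escaped.
basic=$(printf '%s\n' "$samples" | grep 'Br\{3,7\}')

# Extended syntax (grep -E): the braces are written unescaped.
extended=$(printf '%s\n' "$samples" | grep -E 'Br{3,7}')

# Anchoring both ends enforces the upper limit for the whole line.
anchored=$(printf '%s\n' "$samples" | grep '^Br\{3,7\}$')

printf '%s\n' "$basic" '---' "$anchored"
```

Both unanchored forms select the last three sample lines, while the anchored form selects only "Brrr" and "Brrrrr".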

