GNU/Linux Command-Line Tools Summary

20.4.2. Regular Expressions

Regular expressions are patterns used to describe and match text. They are similar in spirit to standard wildcards (globbing patterns) but more powerful, and they are used to search and manipulate text by many command-line tools and programming languages. For more information on regular expressions refer to the manual page or try an online tutorial, for example the IBM developerWorks article on using regular expressions. For the manual page type:

    man 7 regex


Regular expressions are used by grep, can be used by find, and are supported by many other programs.
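As a rough sketch of the two programs named above, the following assumes GNU grep and GNU find. The -regex test of find is matched against the whole path rather than just the file name, which is why the pattern starts with .* here. The directory and file names are made up for illustration.

```shell
# Work in a throwaway directory so nothing real is touched (names are hypothetical).
demo_dir=$(mktemp -d)
printf 'station\nsun\nmoon\n' > "$demo_dir/notes.txt"
touch "$demo_dir/report1.log" "$demo_dir/report2.log" "$demo_dir/summary.txt"

# grep prints the lines of a file that match the regular expression.
grep_hits=$(grep '^s' "$demo_dir/notes.txt")

# find -regex has to match the entire path, hence the leading .*
find_hits=$(find "$demo_dir" -regex '.*/report[0-9]\.log' | sort)

echo "$grep_hits"
echo "$find_hits"

# Clean up the throwaway directory.
rm -r "$demo_dir"
```

Both patterns are enclosed in single quotation marks so that the shell passes them to the program unchanged.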


Tip

If your regular expressions don't seem to be working, you probably need to enclose the whole pattern in single quotation marks, so that the shell does not interpret it, and put a backslash before each special character that you want matched literally.
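The effect of quoting and escaping can be seen in a small experiment. This is only a sketch: the sample lines are invented, and grep is used in its default (basic regular expression) mode.

```shell
# Sample input held in a variable (contents are made up for illustration).
sample=$(printf 'a.b\naxb\n3*4\n')

# Unescaped, the dot matches any single character, so two lines match.
any_char=$(printf '%s\n' "$sample" | grep 'a.b')

# Escaped with a backslash inside single quotes, the dot is literal.
literal_dot=$(printf '%s\n' "$sample" | grep 'a\.b')

# The same applies to the asterisk: \* matches a literal star.
literal_star=$(printf '%s\n' "$sample" | grep '3\*4')

echo "$any_char"
echo "$literal_dot"
echo "$literal_star"
```

Without the single quotation marks the shell would process the backslashes and the asterisk itself before grep ever saw the pattern.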

  • . (dot)

    will match any single character, equivalent to ? (question mark) in standard wildcard expressions. Thus, "m.a" matches "mpa" and "mea" but not "ma" or "mppa".


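  • \ (backslash)

    is used as an "escape" character, i.e. to protect a subsequent special character so that it is matched literally. Thus, "\\" searches for a backslash. Note that the shell also treats the backslash specially, so you may need to enclose the pattern in single quotation marks or add further backslashes.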


  • .* (dot and asterisk)

    is used to match any string, including an empty one, equivalent to * in standard wildcards.


  • * (asterisk)

    the preceding item is to be matched zero or more times, i.e. n* will match an empty string, n, nn, nnnn, nnnnnnn, but it will not match the "a" in "na" or any other character.


  • ^ (caret)

    means "the beginning of the line". So "^a" means find a line starting with an "a".


  • $ (dollar sign)

    means "the end of the line". So "a$" means find a line ending with an "a".

    For example, this command searches the file myfile for lines starting with an "s" and ending with an "n", and prints them to the standard output (screen):

    grep '^s.*n$' myfile


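  • [ ] (square brackets)

    specifies a set or range of characters, exactly one of which must match. For example, m[aou]m matches mam, mom and mum. The characters are listed without separators, because a comma inside the brackets would itself be treated as one of the characters to match. With a range, m[a-d]m matches anything that starts and ends with m and has a single character from a to d in between: mam, mbm, mcm and mdm. This kind of pattern specifies an "or" relationship (only one of the listed characters needs to match).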


  • | (vertical bar)

    This makes a logical OR relationship between patterns. This way you can search for something or something else (possibly using two different regular expressions). Enclose the pattern in single quotation marks, because otherwise the shell will interpret the | as a pipe. With grep you may also need to write it as '\|' (backslash and vertical bar), or use grep -E, for it to be treated as OR.


  • [^]

    This is the equivalent of [!] in standard wildcards. This performs a logical "not": it matches any single character that is not listed within the square brackets. For example, rm myfile[^9] (bash also accepts this form in file name wildcards) will remove myfile1, myfile2 and so on, but won't remove myfile9. The brackets stand for exactly one character, so only a 9 in that position is excluded, not a 9 elsewhere in the name.
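The special characters described above can be tried out against a small word list. The following is a minimal sketch, assuming a POSIX shell with grep and paste available. The word list and the match helper are invented for this illustration and are not standard commands. The OR example uses grep -E so that a plain | works.

```shell
# Hypothetical word list used only for this demonstration.
words=$(printf '%s\n' ma mpa mea mppa mam mbm mem mom mum sun station stop)

# Helper (not a standard command): run grep with the given options and
# pattern over the word list, then join the matching words with spaces.
match() { printf '%s\n' "$words" | grep "$@" | paste -s -d ' ' -; }

dot=$(match '^m.a$')          # exactly one character between m and a
star=$(match '^mp*a$')        # zero or more p between m and a
anchors=$(match '^s.*n$')     # starts with s, ends with n
set_of=$(match '^m[aou]m$')   # one of a, o or u in the middle
range=$(match '^m[a-d]m$')    # one character from a to d in the middle
negated=$(match '^m[^aou]m$') # one character that is not a, o or u
either=$(match -E 'sun|stop') # logical OR (extended syntax via -E)

echo "dot:     $dot"
echo "star:    $star"
echo "anchors: $anchors"
echo "set:     $set_of"
echo "range:   $range"
echo "negated: $negated"
echo "either:  $either"
```

The ^ and $ anchors are added to most patterns so that each one has to cover the whole word. Without them grep would report any line that merely contains a match.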



©2014 About.com. All rights reserved.