Introduction to Linux

From Machtelt Garrels

timestamp on the initial backup file is shown. Then a new file is created, after which we take a new backup containing only this new file:


   

 
jimmy:~> tar cvpf /var/tmp/javaproggies.tar java/*.java
 java/btw.java
 java/error.java
 java/hello.java
 java/income2.java
 java/income.java
 java/inputdevice.java
 java/input.java
 java/master.java
 java/method1.java
 java/mood.java
 java/moodywaitress.java
 java/test3.java
 java/TestOne.java
 java/TestTwo.java
 java/Vehicle.java
 
 jimmy:~> ls -l /var/tmp/javaproggies.tar
 -rw-rw-r-- 1 jimmy jimmy 10240 Jan 21 11:58 /var/tmp/javaproggies.tar
 
 jimmy:~> touch java/newprog.java
 
 jimmy:~> tar -N /var/tmp/javaproggies.tar \
 -cvpf /var/tmp/incremental1-javaproggies.tar java/*.java 2> /dev/null
 java/newprog.java
 
 jimmy:~> cd /var/tmp/
 
 jimmy:~> tar xvf incremental1-javaproggies.tar
 java/newprog.java
 

Standard error is redirected to /dev/null. If you don't do this, tar will print a message for each unchanged file, telling you it won't be dumped.

This way of working has the disadvantage that it relies on file timestamps. Say that you download an archive into the directory containing your backups, and the archive contains files that were created two years ago. When the timestamps of those files are checked against the timestamp on the initial archive, the new files will actually seem old to tar, and will not be included in an incremental backup made using the -N option.
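The timestamp problem can be reproduced in a scratch directory. The following is a minimal sketch, assuming GNU tar; the directory and file names are made up for the demonstration. It uses the --newer-mtime option, which compares modification times only, because GNU tar's -N also looks at the status-change time and may therefore behave differently on freshly unpacked files.

```shell
#!/bin/sh
# Sketch of the timestamp pitfall; assumes GNU tar. All names are hypothetical.
set -eu
demo=$(mktemp -d)
cd "$demo"
mkdir docs
echo "original" > docs/report.txt
sleep 1                                  # keep timestamps clearly apart
tar -cpf full.tar docs                   # the initial backup

# A file that is new to this directory but carries an old modification
# time, as files unpacked from a downloaded archive usually do.
echo "downloaded" > docs/old-download.txt
touch -t 200101010000 docs/old-download.txt

# A file written after the initial backup, for comparison.
echo "fresh" > docs/fresh.txt

# Archive only files modified after full.tar was written. The leading ./
# makes tar read the reference date from that file's timestamp.
tar --newer-mtime=./full.tar -cpf incr.tar docs/*.txt
incr_list=$(tar -tf incr.tar)
echo "$incr_list"
```

The listing should contain docs/fresh.txt but not docs/old-download.txt, even though both files were added after the initial backup.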

A better choice would be the -g option, which will create a list of files to back up. When making incremental backups, files are checked against this list. This is how it works:


   

 
jimmy:~> tar cvpf work-20030121.tar -g snapshot-20030121 work/
 work/
 work/file1
 work/file2
 work/file3
 
 jimmy:~> file snapshot-20030121
 snapshot-20030121: ASCII text
 

The next day, user jimmy works on file3 a bit more, and creates file4. At the end of the day, he makes a new backup:


   

 
jimmy:~> tar cvpf work-20030122.tar -g snapshot-20030121 work/
 work/
 work/file3
 work/file4
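To get the files back, the archives are unpacked in the order in which they were made. The following is a minimal sketch, assuming GNU tar, that repeats the two backups in a scratch directory and then restores them; the archive and directory names are chosen for the demonstration only. When extracting, GNU tar expects the -g option to be present, but does not use the snapshot file, so /dev/null is given.

```shell
#!/bin/sh
# Sketch: two backups with one snapshot file, then a restore.
# Assumes GNU tar; all names are hypothetical.
set -eu
demo=$(mktemp -d)
cd "$demo"
mkdir work
echo "keep" > work/file1
echo "v1" > work/file3

# Day 1: full backup. tar creates the snapshot file.
tar -cpf work-day1.tar -g snapshot work/

sleep 1                       # make sure later changes get a later timestamp
echo "v2" > work/file3        # changed file
echo "new" > work/file4       # new file

# Day 2: same snapshot file, so only the changes are archived.
tar -cpf work-day2.tar -g snapshot work/

# Restore into an empty directory, oldest archive first.
mkdir restore
tar -xpf work-day1.tar -g /dev/null -C restore
tar -xpf work-day2.tar -g /dev/null -C restore

restored_file3=$(cat restore/work/file3)
restored_list=$(ls restore/work)
echo "$restored_list"
```

After both archives are unpacked, restore/work should hold file1 from the first backup together with the updated file3 and the new file4 from the second.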
 

These are very simple examples, but you could also use this kind of command in a cron job (see Section 4.4.4) that specifies, for instance, one snapshot file for the weekly backup and one for the daily backup. In that case, snapshot files should be replaced when taking full backups.
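A script called from cron could look roughly like the sketch below. It is only an illustration, assuming GNU tar: the function name run_backup, the archive names and the rule "full backup on Sunday, incremental on other days" are assumptions, not something prescribed by tar. The destination directory should be given as an absolute path.

```shell
#!/bin/sh
# Hypothetical backup helper meant to be called from cron; assumes GNU tar.
set -eu

run_backup() {
    src=$1      # directory to back up
    dest=$2     # absolute path of the directory holding archives and snapshot
    dow=$3      # day of the week, 0 = Sunday (as printed by `date +%w`)
    stamp=$4    # date stamp used in the archive name

    snap="$dest/snapshot"
    if [ "$dow" = "0" ] || [ ! -f "$snap" ]; then
        # Full backup: replace the old snapshot file so tar starts afresh.
        rm -f "$snap"
        archive="$dest/full-$stamp.tar"
    else
        archive="$dest/incr-$stamp.tar"
    fi
    tar -cpf "$archive" -g "$snap" -C "$(dirname "$src")" "$(basename "$src")"
    echo "$archive"
}

# Demonstration in a scratch directory instead of a real cron run.
demo=$(mktemp -d)
mkdir "$demo/work" "$demo/backups"
echo one > "$demo/work/file1"
first=$(run_backup "$demo/work" "$demo/backups" 0 20030119)    # a Sunday
sleep 1                     # make sure the next change has a later timestamp
echo two > "$demo/work/file2"
second=$(run_backup "$demo/work" "$demo/backups" 1 20030120)   # a Monday
second_list=$(tar -tf "$second")
echo "$second_list"
```

In a real cron job the last two arguments would come from the date command, for instance "$(date +%w)" and "$(date +%Y%m%d)".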

More information can be found in the tar documentation.


The real stuff

As you have probably noticed, tar is fine when we are talking about a simple directory, a set of files that belong together. There are tools that are easier to manage, however, when you want to archive entire partitions, disks or larger projects. We explain tar here only because it is a very popular tool for distributing archives. You will quite often need to install software that comes in a so-called "compressed tarball". See Section 9.3 for an easier way to perform regular backups.

9.1.1.3. Compressing and unpacking with gzip or bzip2

Data, including tarballs, can be compressed using zip tools. The gzip command will add the suffix .gz to the file name and remove the original file.


   

 
jimmy:~> ls -la | grep tar
 -rw-rw-r-- 1 jimmy jimmy 61440 Jun 6 14:08 images-without-dir.tar
 
 jimmy:~> gzip images-without-dir.tar 
 
 jimmy:~> ls -la images-without-dir.tar.gz 
 -rw-rw-r-- 1 jimmy jimmy 50562 Jun 6 14:08 images-without-dir.tar.gz
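The round trip can be tried safely in a scratch directory. The sketch below uses made-up file names; it checks that gzip replaces the tar file with a .gz file and that gunzip reverses this.

```shell
#!/bin/sh
# Sketch: compress a tar file with gzip and unpack it again.
# All names are hypothetical.
set -eu
demo=$(mktemp -d)
cd "$demo"
mkdir images
printf 'sample data\n' > images/a.txt
tar -cf images.tar images

gzip images.tar                  # leaves images.tar.gz, removes images.tar
if [ -e images.tar ]; then
    after_gzip="original kept"
else
    after_gzip="original removed"
fi
gzip -t images.tar.gz            # test the integrity of the compressed file

gunzip images.tar.gz             # brings back images.tar, removes the .gz file
if [ -e images.tar ] && [ ! -e images.tar.gz ]; then
    after_gunzip="tar file restored"
else
    after_gunzip="unexpected state"
fi
echo "$after_gzip, $after_gunzip"
```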
