File archives and file compression
It will often be necessary to take a copy of a complete
directory, either for the purpose of storing it in a safe place (a
'backup') in case the computer system 'crashes', or to send it to a
different computer system. There are two particular problems that
utilities such as cp are unable to address. First,
different machines and operating systems store data in different
formats, and it may be necessary to convert the format in which the
files in the directory are stored. Second, cp does not
handle links.
Historically, two commands have been used for this purpose: tar
('tape archive') and cpio ('copy in out'). Both work by copying
all the files in the directory, together with data describing the
structure of the directory, into a single file known as an archive.
Unfortunately, tar and cpio work differently and produce archives
in different formats. Although tar was used much more extensively
than cpio, it was felt necessary to create a completely new command
that would perform the functions of both, rather than update tar
so that it would also do everything cpio could do.
Neither tar nor cpio became part of
POSIX, but a new command pax ('portable archive
exchange') has been written. We give a couple of examples
illustrating both pax and tar, but note
that pax is not found on all Linux systems.
To create a new archive, give pax the argument
-w ('write') or tar the argument
-c ('create'). The archive file will be sent to
standard output (some versions of tar write to a default tape
device instead, unless standard output is named explicitly with
-f -). So to archive the contents of the current
directory to the tape drive /dev/rst8, either of the
following will work:
$ tar -c . >/dev/rst8
$ pax -w . >/dev/rst8
Alternatively, you can redirect the output to a file. To extract
the contents of an archive, redirect the standard input of pax
or tar from the archive; pax requires the argument -r ('read')
and tar the argument -x ('extract'). Naturally,
when unpacking an archive, you don't want to overwrite any files or
directories that you have already created. It is therefore a good
idea to check the contents of an archive first. With tar, use the
-t option; with pax, give neither -r nor -w. In both cases the
names of the files in the archive are simply listed.
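The create, list and extract steps described above can be sketched as follows. This is only an illustration: it uses tar because pax may not be installed, it assumes a tar that accepts -f - for standard input and output (as GNU tar and bsdtar do), and the directory and file names are invented for the example.

```shell
# Scratch area with a small sample directory (all names are hypothetical).
work=$(mktemp -d)
mkdir -p "$work/mydir/sub" "$work/unpacked"
echo "first file" >"$work/mydir/a.txt"
echo "second file" >"$work/mydir/sub/b.txt"

# Create: archive the contents of mydir and redirect the result to a file.
# "-f -" names standard output explicitly rather than relying on tar's default.
(cd "$work/mydir" && tar -cf - .) >"$work/mydir.tar"

# List: check the names in the archive before unpacking anything.
tar -tf - <"$work/mydir.tar"

# Extract: unpack into a separate, empty directory so nothing is overwritten.
(cd "$work/unpacked" && tar -xf - <"$work/mydir.tar")
```

Keeping the archive file outside the directory being archived avoids tar trying to include the archive in itself.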
Having multiple copies of directories - whether 'real' or
archived - is bound to take up space. If you have created an
archive - mydir.pax, say - you can
compress the file to reduce its size, by means of
the command compress (not a POSIX command). This
creates a file mydir.pax.Z (note the .Z
suffix) and deletes mydir.pax; the file
mydir.pax.Z will normally be smaller than
mydir.pax. The actual reduction in file size depends
on what the file to be compressed contains, but the compressed
file is typically between 0.2 and 0.5 times the size of the
original. For example:
$ ls
mydir.pax
$ wc -c mydir.pax
206336 mydir.pax
$ compress mydir.pax
$ ls
mydir.pax.Z
$ wc -c mydir.pax.Z
89473 mydir.pax.Z
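The two sizes reported by wc in this example can be turned into a ratio with shell arithmetic. The sketch below simply reuses those two numbers; the variable names are chosen for illustration only.

```shell
orig=206336     # size in bytes of mydir.pax, from the example
packed=89473    # size in bytes of mydir.pax.Z, from the example

# Shell arithmetic is integer-only, so scale by 100 before dividing.
percent=$((packed * 100 / orig))
echo "compressed file is about ${percent}% of the original size"
```

This gives 43, that is, a factor of roughly 0.43, which lies inside the typical range.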
To reverse the compression, use the command
uncompress. If you have stored any large files that
you do not use on a regular basis, you may wish to compress
them.
Worked example
Copy the contents of your current directory to
/tmp/backup preserving all links.
Solution: Use pax -w to create a new archive and
store it in a temporary file; then create
/tmp/backup, change directory to
/tmp/backup, and read the archive with pax -r.
$ pax -w . >/tmp/backup.pax
$ mkdir /tmp/backup
$ cd /tmp/backup
$ pax -r </tmp/backup.pax
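Since pax is not found on all Linux systems, the same idea can be sketched with tar, here without the temporary file by piping the archive from one tar to another. This is an illustration under stated assumptions, not part of the original solution: it assumes a tar that accepts -f - (as GNU tar and bsdtar do), and it uses scratch directories with invented file names in place of the current directory and /tmp/backup.

```shell
# Scratch directories standing in for the current directory and /tmp/backup
# (hypothetical names, so the sketch does not touch real files).
src=$(mktemp -d)
dest=$(mktemp -d)

echo "data" >"$src/original"
ln "$src/original" "$src/hardlink"   # hard link: a second name for the same file
ln -s original "$src/symlink"        # symbolic link, relative to its directory

# Write the archive to standard output and read it straight back in
# from inside the destination directory.
(cd "$src" && tar -cf - .) | (cd "$dest" && tar -xf -)

# Show inode numbers and link targets in the copy.
ls -li "$dest"
```

In the listing, original and hardlink should show the same inode number, and symlink should still appear as a link to original rather than as a separate copy of the data.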