Files
Overview
The UNIX directory hierarchy
Filesystems
Manipulating files
Creating directories
Creating files
Links
'Dot' files
Protecting files
Groups
File access control
Changing privileges
File contents
Text files
Comparing files
Filtering files
Non-text files
Printing files
File archives and file compression
Other relevant commands
Summary
Exercises
The UNIX directory hierarchy
A typical UNIX system has many users and usernames. The machine
stores large numbers of programs and datasets that are either
'system' files (required for the running of UNIX) or files for the
benefit of the system's users (such as the UNIX commands we discuss
in this book). In addition, each user has their own collection of
files. On a large UNIX system it would not be unreasonable to
expect to find millions of files occupying thousands of
gigabytes of space (where a gigabyte is a unit of
storage equal to 1024 megabytes).
If I choose to create a file called myfile (say),
it is unlikely that I will be the only user on the machine to have
chosen that particular name for a file. It would be unreasonable to
expect me to choose, instead of myfile, a filename that
is different from the names of all files created by all the other users.
Therefore, UNIX must impose a structure on the filespace that will
make it easy to manage a large number of files. The solution
adopted is simple yet very powerful.
We can think of the available file storage for our machine as
partitioned into separate directories. At any
given time you can access files in one particular directory, which
we can think of as the current directory. You can
also 'move' between different directories and so change which is
current. A directory need not be a contiguous section of
disk, and might be fragmented. That is, the various files contained
within this storage area that we call a directory may in fact be
physically located on different parts of a disk, or even on
completely different disks or storage devices. This does not matter
to the user - the logical structure of the machine's file storage is
what matters, not how it is physically implemented. In order for the
machine to know how to find the data in these directories, each has
a file, called dot and referred to by the 'dot'
symbol (.) that stores information about that
directory (such as which files are stored within it, how big they
are, and precisely where on disk they are stored). The word
directory is also used to describe a file such as
dot, which contains the vital statistics for a directory
storage area. Since the physical layout of a directory is not
important to us, this dual meaning for the word presents us with no
ambiguity.
Within a directory are files, some of which may themselves be
directories. Directories are organised in a
tree-like structure. At the base of the tree is a
directory whose UNIX name is '/' ('root'). So, we
might have the following situation:
In each directory, in addition to the file dot, is a
file called dotdot, referred to by the symbol
'..', which is a synonym for the
parent of that directory in the tree. Since a file
dot and a file dotdot exist in each and every
directory, we do not usually mention them when describing a UNIX
directory hierarchy. There are two means by which we may refer to
the name of a file. Either we can name it relative
to our current directory, in which case we need only use
its simple name, such as myfile. Alternatively, we can
use its absolute filename relative to the
root. In this latter case, its name commences with a
/ ('slash' or 'solidus'), followed by the intervening
directories between the root and the file separated by
/ characters, and finally with the filename. Thus in the above
tree, file myfile has absolute name
/home/ugrad/chris/myfile. If a filename commences with
the character / then it is an absolute name, otherwise
it is relative. Each file thus has a unique
absolute filename. Moreover, since these filenames can be as long
as required (within reason - each system has a limit) and the
depth of the tree can be as great as needed, we can cope
with a UNIX system containing as many files as desired. Since the
current directory has several names, there will be several names
for an individual file; if the current directory is
/home/ugrad/chris then the following names all refer
to the same file, since .. names the parent of a directory and you
can insert /. after any intermediate directory name without
affecting the meaning:
/home/ugrad/chris/myfile
myfile
./myfile
../chris/myfile
././././././myfile
../../../home/ugrad/./chris/myfile
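The equivalence of these names can be checked with a short script.
The following is only a sketch: it builds a throwaway copy of the
example tree inside a temporary directory, so the variable top
(a name chosen for this illustration) stands in for the root
directory, and the file contents are made up.

```shell
# Assumption: "$top" plays the part of / in this sketch.
top=$(mktemp -d)
mkdir -p "$top/home/ugrad/chris"
echo "hello" > "$top/home/ugrad/chris/myfile"
cd "$top/home/ugrad/chris"

# Try each name in turn and count how many reach the same contents.
matches=0
for name in \
    "$top/home/ugrad/chris/myfile" \
    myfile \
    ./myfile \
    ../chris/myfile \
    ././././././myfile \
    ../../../home/ugrad/./chris/myfile
do
    if [ "$(cat "$name")" = "hello" ]; then
        matches=$((matches + 1))
    fi
done
echo "$matches of 6 names reached the file"
```

Only the first name is absolute; the other five are interpreted
relative to the current directory, which is why the script changes
directory before trying them.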
When logged in to the machine, you are always in a current
directory somewhere. When you initially log in, you start in your
home directory in which you can create your own
files. This directory has a synonym, ~ (tilde), which
you can use whenever you need to refer to your home directory. To
find your current location within the file system use the command
pwd ('print working directory'). For example,
$ pwd
/home/ugrad/chris
$
It is not always convenient to have your home directory as the
current directory, since this might involve much typing of absolute
filenames if you wish to access a file elsewhere. By means of the
command cd ('change directory') you can move around
the filesystem. By typing cd followed by the name of a
directory, you can make the directory become the current directory
(if it exists - if not, an error message will be output and your
current directory will not change). For instance, to move to user
sam's home directory, and then to a non-existent
directory called /squiggle:
$ cd /home/ugrad/sam
$ cd /squiggle
/squiggle: No such file or directory
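That a failed cd leaves the current directory unchanged can be
confirmed by recording the output of pwd before and after. This is a
minimal sketch which assumes, as in the example above, that
/squiggle does not exist on your machine; the variable names are
chosen for illustration only.

```shell
start=$(mktemp -d)      # a scratch directory to start from
cd "$start"
before=$(pwd)

# Assumption: /squiggle does not exist, so this cd should fail.
if cd /squiggle 2>/dev/null; then
    echo "unexpected: /squiggle exists"
else
    echo "cd failed"
fi
after=$(pwd)

echo "before: $before"
echo "after:  $after"
```

The error message is discarded here (2>/dev/null) only to keep the
output short; the exact wording of the message varies between shells.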
You may also want to know what files exist on the machine. The
command ls ('list') which we have already met will
accomplish this. By default, ls lists the files in the
current directory; if, however, you give ls one
argument that is the name of a directory (either relative or
absolute) the files in that directory will be listed. For
instance:
$ ls /
bin etc tmp usr lib home
$ cd /bin
$ pwd
/bin
$ ls
date ls man
$
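The three ways of telling ls which directory to list can be compared
directly. The sketch below does not touch the real /bin; it assumes
a scratch directory (held in the illustrative variable top)
containing a bin directory with three empty files named after the
commands shown above.

```shell
top=$(mktemp -d)
mkdir "$top/bin"
# Empty placeholder files, not real commands.
touch "$top/bin/date" "$top/bin/ls" "$top/bin/man"

cd "$top"
by_relative=$(ls bin)          # relative directory name as argument
by_absolute=$(ls "$top/bin")   # absolute directory name as argument
cd bin
by_default=$(ls)               # no argument: the current directory

echo "$by_default"
```

All three variables should hold the same list of names, since each
invocation lists the same directory.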
Try this on your own machine. The output will not look exactly
the same, and there will be many more files that are listed. If you
give ls an argument that is an ordinary file, not a
directory, just that filename will be displayed. Do not be afraid
of 'getting lost' by changing to different directories - you can
always return to your home directory by typing cd with
no arguments (alternatively cd ~). Since ~ always
refers to your home directory, you can always refer to files
relative to that directory, so if ~ is
/home/ugrad/chris, then
/home/ugrad/chris/myfile could equally well be
referred to as ~/myfile. If you follow ~
by the name of a user, it refers to that user's home directory -
so if you are chris then ~ is equivalent
to ~chris, and sam has home directory
~sam.
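The behaviour of cd with no arguments, of ~, and of ls given an
ordinary file can be tried out safely as follows. This sketch makes
one loud assumption: it points HOME at a scratch directory, so that
nothing in a real home directory is touched. The variable names are
illustrative.

```shell
# Assumption: use a scratch directory as a pretend home directory.
HOME=$(mktemp -d)
export HOME
touch ~/myfile                 # the same file as "$HOME/myfile"

cd /                           # move somewhere else
cd                             # no arguments: back to the home directory
back_home=$(pwd)

listed=$(ls ~/myfile)          # ordinary file argument: just that name

echo "$back_home"
echo "$listed"
```

Note that ls echoes the name in the form it was given, so the second
line of output is the full expansion of ~/myfile rather than the
simple name myfile.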
Worked example
What files does sam have in their home
directory?
Solution: Use ls followed by the name
of sam's home directory:
$ ls ~sam
When a file is created, space to store it is found on the
machine. That space is given a unique number, called an
inode (pronounced 'eye-node'), which remains with
that file until it is eventually deleted. At creation, the file is
also given a name. The file is created in a
directory, and at creation the directory is updated so that it
contains the name of the file and the inode where that file is
stored.
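The number associated with a file can be displayed with the -i
option to ls. The following sketch, run in a scratch directory with
made-up filenames, suggests that the number belongs to the file
rather than to its name: renaming the file with mv within the same
directory changes the name recorded in the directory but not the
inode.

```shell
dir=$(mktemp -d)
cd "$dir"
echo "some data" > myfile

# ls -i prints the inode number before the filename; keep the number.
inode_before=$(ls -i myfile | awk '{print $1}')

# Rename within the same directory: the name changes, the inode does not.
mv myfile renamed
inode_after=$(ls -i renamed | awk '{print $1}')

echo "$inode_before $inode_after"
```

The actual numbers will differ from machine to machine; what matters
is that the two values printed are the same.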