Non-text files
We must now consider what a file contains if it is not a text
file. Clearly we cannot use the text utilities described above: not
only will the file not be neatly split up into lines, but the
characters contained within it will in general not be printable.
The file command gives us a rough indication of what sort of data a
file contains (binary or text, for example), but no more. If we
need to know exactly what characters
are contained in a file that is not printable because it is in, for
example, binary format, od will give us precisely that
information. Thinking of a file as a sequence of bytes,
od lists each byte in a representation that can be
printed. The name stands for octal dump, and by
default it lists the contents by their octal (base 8) codes
word-by-word (a word here being typically 2 bytes).
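To see the byte-by-byte view in practice, the following minimal
sketch creates a six-byte file and dumps it one byte at a time using
the POSIX option -t o1. The filename sample.txt is an assumption made
purely for this illustration, and the exact spacing of the output may
differ between systems.

```shell
# Create a small file holding the word Hello followed by a newline.
# (sample.txt is a hypothetical name used only for this sketch.)
printf 'Hello\n' > sample.txt

# Dump it in octal, one byte per unit: -t o1 means "octal, 1 byte".
od -t o1 sample.txt
# Typical output:
# 0000000 110 145 154 154 157 012
# 0000006
```

The left-hand column is the offset of the first byte on that line,
itself written in octal; the final line gives the total length of the
file.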
Since computers use binary code internally, when in the past it
was necessary to examine data, it was often not possible to display
that data in any way other than as a representation of binary
numbers. One of the simplest ways of doing this was to group the
bits (binary digits) together in sequences of 3, consider each
3-bit sequence as representing a digit in base 8, and print out the
data as a string of octal digits. Hence we get the phrase octal
dump.
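The grouping of bits in threes can be checked with a short worked
example. The sketch below takes the decimal value 129 (chosen only
for illustration) and converts it in both directions using the
standard printf utility.

```shell
# 129 in decimal is 10000001 in binary. Padding on the left to a
# multiple of three bits and grouping gives 010 000 001, which reads
# as the octal digits 2, 0 and 1.
printf '%o\n' 129     # decimal to octal: prints 201

# A leading 0 marks a numeric argument as an octal constant.
printf '%d\n' 0201    # octal to decimal: prints 129
```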
A more useful way to generate output is with the option -t c,
whereby each byte is printed as the character itself (if it is
printable), as a backslash followed by a character (if a standard
escape sequence exists, such as \n for the newline character), or
otherwise as a 3-digit octal number that is the code for that
character. For instance,
$ od -t c bintest
0000000 201 003 \n 013 \0 001 200 \0
0000010 \0 \0 @ \0 \0 \0 251 230
0000020 \0 \0 \0 \0 \0 \0
0000030 \0 \0 \0 \0 \0 \0 \0 \0
0000040 274 020 \0 320 003 240 @
0000050 222 003 240 D 225 * 002
0000060 224 002 240 004 224 002 @ \n
0000070 027 \0 \0 h 324 " 343 240
0000100 003 \0 \0 \b 302 \0 b \b
...
We see that the first byte in the file has code 201 in octal
(which is 129 in decimal). The third byte is a Newline
character. Just for comparison, a file called
hellotest, containing one line that is simply the word
Hello, would be displayed thus:
$ od -t c hellotest
0000000 H e l l o \n
0000006
The od command has several other options, which we do not list
here.
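Without attempting a full list, the following sketch shows a few of
the options defined by POSIX that are often convenient: -A to choose
how offsets are displayed, -t x1 for hexadecimal bytes, and -j and -N
to select part of a file. The filename sample.txt is again an
assumption for this illustration only.

```shell
# sample.txt is a hypothetical file created just for this sketch.
printf 'Hello\n' > sample.txt

# -A d prints offsets in decimal; -t x1 prints each byte in hexadecimal.
od -A d -t x1 sample.txt

# -A n suppresses offsets; -j 1 skips the first byte; -N 4 reads only
# four bytes. Together they display the characters e, l, l and o.
od -A n -j 1 -N 4 -t c sample.txt
```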
If you just want to examine a binary file quickly, and see what
printable strings exist within it, use the command
strings. This can be useful if you have
compiled a program, such as one written in C, and that
program contains strings whose values are of interest to you
(filenames, for instance). Going through the binary code with
od would be tedious.
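As a rough illustration, the sketch below builds a small binary file
by hand rather than compiling a program, and then runs strings on it.
The filename demo.bin and the two embedded filenames are invented for
this example. It assumes the common default whereby strings reports
runs of at least four printable characters.

```shell
# Build a file in which unprintable bytes surround two text fragments.
# (demo.bin and its contents are hypothetical.)
printf '\001\002\003/etc/passwd\000\004\005config.dat\000\377' > demo.bin

# Print only the runs of printable characters found in the file.
strings demo.bin
# Typical output:
# /etc/passwd
# config.dat
```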
A useful command we introduce at this point is
touch. This command has the effect of changing the
date and time that its file arguments were last modified: they are
altered to the current date and time. If the files that are its
arguments do not currently exist, they are created with size 0.
Thus touch is useful for creating a file if you haven't yet
decided what to put in it, but want it to be there. This might
happen during the development phase of a program. It is also useful
to indicate that the file is in some sense 'up-to-date'.
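Both behaviours of touch can be seen in the short sketch below. The
filename newfile.txt is an assumption made for the example, and any
existing file of that name is removed first.

```shell
# newfile.txt is a hypothetical name; start from a clean state.
rm -f newfile.txt

# The file does not exist, so touch creates it with size 0.
touch newfile.txt
wc -c < newfile.txt      # reports 0 bytes

# Put something in the file, then touch it again: the contents are
# kept, and only the modification time is set to the current time.
printf 'draft\n' > newfile.txt
touch newfile.txt
ls -l newfile.txt
```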