Introducing UNIX and Linux
Awk
Field and record separators

The fields in a record are normally separated by whitespace. This is not always convenient. Suppose a file contains the following names and ages:

    John 13
    Sue 12
    James Smith 15
    James Jones 14

The number of fields on each line varies. This is a potential problem. Let us suppose we wish to write a simple Awk script to display

    John is 13 years old
    Sue is 12 years old
    James Smith is 15 years old
    James Jones is 14 years old

There are several possible solutions. One that you will already be able to find checks the number of fields and performs a separate action each time:

    NF == 2 { printf "%s is %d years old\n", $1, $2 }
    NF == 3 { printf "%s %s is %d years old\n", $1, $2, $3 }

This solution is fine if you know how many names a person is
likely to have - but it is not elegant since there is a lot of
duplication in the Awk script. If you were to allow persons with
many forenames to appear in the list the Awk script would become
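unmanageable. Loops could be used to cope with a varying number of names, but there is a simpler approach. If the data were instead laid out as

    John:13
    Sue:12
    James Smith:15
    James Jones:14

so that a colon (say) was used to separate the names from the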
numbers, then each line would have precisely two fields, and the
spaces in the names would not matter. We can instruct Awk to use a
different field separator to the usual whitespace
by resetting the value of the variable FS:

    BEGIN { FS=":" }
          { printf "%s is %d years old\n", $1, $2 }

The field separator can be any ERE, and can also be changed by giving awk the option -F followed by the ERE, for example:

    awk -F "[ ,:]+"

On your UNIX system there should be a file called /etc/passwd, containing lines such as:

    chris:hi64MH4uhJiq2:1623:103:Chris Cringle:/home/ugrad/chris:
    sam:a8PyPVSiPVXT6:1628:103:Sam Smith:/home/ugrad/sam:/bin/sh
    jo:9gqrX4IOig7qs:1631:103:Jo Jones:/home/ugrad/jo:/bin/sh
    geo:58esMw4xFsZ9I:1422:97:George Green:/home/staff/geo:/bin/sh
    ...

Each line contains seven colon-separated fields; these represent the following:

    1. usercode
    2. encrypted password
    3. numeric user-id
    4. numeric group-id
    5. the user's real name
    6. home directory
    7. login shell
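As a rough illustration, the sketch below splits one of the sample lines above with -F: and prints a few of its fields. It then tries the ERE separator [ ,:]+ on a second line. The labels in the output and the second input line are invented for this example; nothing is read from the real password file.

```shell
# One sample line from the listing above (not read from the real file).
line='sam:a8PyPVSiPVXT6:1628:103:Sam Smith:/home/ugrad/sam:/bin/sh'

# -F: sets FS to a colon before any input is read.
passwd_fields=$(printf '%s\n' "$line" |
    awk -F: '{ printf "%d fields: usercode=%s name=%s home=%s shell=%s", NF, $1, $5, $6, $7 }')
printf '%s\n' "$passwd_fields"

# An ERE separator: any run of spaces, commas and colons counts as one separator.
# The input line here is invented purely for illustration.
ere_fields=$(printf '%s\n' 'James Smith, 15:Anytown' |
    awk -F '[ ,:]+' '{ printf "%d fields, last is %s", NF, $NF }')
printf '%s\n' "$ere_fields"
```

With a single colon as separator the space in "Sam Smith" stays inside field 5, whereas the ERE version treats spaces as separators too and so splits "James Smith" into two fields.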
Some systems which 'hide' the encrypted passwords will also have
another mechanism for storing the data normally in /etc/passwd.
On some such systems you can type

    ypcat passwd

and the data will be sent to standard output.

Worked example

Using Awk and /etc/passwd, write a shell script findname which takes a usercode as its single argument and displays the name of that user.

    # As usual, make sure the script has one argument ...
    if [ $# -ne 1 ]
    then echo "findname requires one argument"
         exit 1
    fi
    awk '
    # Set field separator to :
    BEGIN { FS=":" }
          { # Is the first field the usercode?
            if ($1 == usercode)
              # If yes, print out field 5, the name of the user
              # (no apostrophes in these comments: one would end the quoted script)
              printf "%s\n", $5
          }
    ' usercode="$1" < /etc/passwd

Just as we can specify what should separate fields within a
record, so we can specify what should separate records. Unless
otherwise specified, a record is a line of input, so the
record separator is the Newline
character. The special variable used to change this is RS.

Worked example

Write an Awk script to read standard input containing a list of
company names and phone numbers, together with other information.
All companies in the input with the keyword Anytown should be displayed. Records in the input are separated by a % character, for example:

    Toytown Telecom
    Birmingham
    0121 123 4567
    Sells phones and answering machines
    %
    Sue, Grabbit and Runne
    Solicitors
    London
    020 7999 9999
    %
    Chopham, Sliceham and Son
    Anytown 234
    family butchers

So with this data, the output would be:

    Chopham, Sliceham and Son
    Anytown 234
    family butchers

Solution: Set the record separator to a % character:

    awk '
    BEGIN { RS="%" }           # Set RS
    /Anytown/ { print $0 }'    # Print records matching "Anytown"

You must be very careful if you reset the record separator. If the Newline character is no longer the record separator, any Newlines will be a part of the record. Unless the field separator is an ERE which allows a Newline, it will also be part of one of the fields. You will seldom need to reset the record separator.

Although the function printf gives complete control over the layout of output, print can be adjusted too: it places the output field separator OFS between its comma-separated arguments, and the output record separator ORS after each record.

Worked example

Write an Awk script to read in the password file and display users' names and home directories, in the following format:

    Chris Cringle has home directory /home/ugrad/chris.
    Sam Smith has home directory /home/ugrad/sam.
    ...

Solution: Use print, setting the field separator to a colon, the output field separator to " has home directory " and the output record separator to a full stop followed by Newline.

    awk '
    BEGIN { FS=":"
            OFS=" has home directory "
            ORS=".\n" }
          { print $5,$6 }' < /etc/passwd
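The effect of OFS and ORS on print can be seen in a small sketch. It uses two lines copied from the sample password listing earlier rather than the real file, and the separator strings " -> " and ";" followed by Newline are arbitrary choices made for illustration.

```shell
# Two sample lines in password-file format, copied from the earlier listing.
sample='chris:hi64MH4uhJiq2:1623:103:Chris Cringle:/home/ugrad/chris:
sam:a8PyPVSiPVXT6:1628:103:Sam Smith:/home/ugrad/sam:/bin/sh'

# With a comma between the arguments, print writes OFS between them
# and ORS after each record.
result=$(printf '%s\n' "$sample" | awk '
BEGIN { FS = ":"; OFS = " -> "; ORS = ";\n" }
{ print $1, $6 }
')
printf '%s\n' "$result"
# prints:
#   chris -> /home/ugrad/chris;
#   sam -> /home/ugrad/sam;

# Without the comma the arguments are simply concatenated; OFS is not used.
joined=$(printf '%s\n' "$sample" | awk -F: 'NR == 1 { print $1 $5 }')
printf '%s\n' "$joined"
# prints: chrisChris Cringle
```

The comma in print $1, $6 is what brings OFS into play, which is why the worked example above writes print $5,$6 rather than print $5 $6.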
Copyright © 2002 Mike Joy, Stephen Jarvis and Michael Luck