Introducing UNIX and Linux

Extended regular expressions

Basic regular expressions are sufficient for most purposes, but a more sophisticated form, known as an extended regular expression or ERE, is also available. EREs offer several features that BREs lack. The symbol +, following a bracket expression (or a single character or dot), indicates one or more consecutive occurrences of the expression, in the same way that * indicates zero or more. The symbol ?, in the same context, indicates zero or one occurrence of that expression, so

[[:alpha:]]+[[:digit:]]?

matches any string consisting of one or more letters, optionally followed by a single digit. Note that this ? is not to be confused with the ? used in shell pattern matching, which stands for any single character. If two EREs are separated by a | (vertical bar), the result matches either of those two EREs. Parentheses may be used to group subexpressions together:

(xyz|ab)\.c

will match either xyz.c or ab.c, and no other string. If you need a parenthesis to be a matched character in an ERE you must escape it with a backslash, as in \( or \).
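The features described above can be tried out with grep. The following is a minimal sketch, assuming a POSIX grep in which -E selects EREs and -x requires the whole line to match; the sample lines and variable names are invented for illustration.

```shell
# Sample lines to filter (hypothetical test data).
samples='abc
abc7
abc77
7abc
xyz.c
ab.c
xyz-c
(ab)'

# + and ?: one or more letters, then at most one digit.
# Selects abc and abc7 only.
letters_digit=$(printf '%s\n' "$samples" | grep -E -x '[[:alpha:]]+[[:digit:]]?')

# | and grouping: either xyz or ab, followed by a literal dot and c.
# Selects xyz.c and ab.c; xyz-c is rejected because the dot is escaped.
grouped=$(printf '%s\n' "$samples" | grep -E -x '(xyz|ab)\.c')

# Escaped parentheses match literal ( and ) characters.
# Selects (ab) only.
literal_parens=$(printf '%s\n' "$samples" | grep -E -x '\(ab\)')

printf '%s\n' "$letters_digit" "$grouped" "$literal_parens"
```

The -x option is used here so that each ERE has to account for the entire line; without it, grep would select any line that merely contains a match.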

Worked example

Write an ERE to match any string consisting either of only upper-case letters or only lower-case letters.
Solution: As in the previous worked example, the expression will commence with ^ and end with $. By taking advantage of the symbol +, a match for upper-case letters would be [[:upper:]]+ and for lower-case letters [[:lower:]]+. A sequence of letters of the same case will be matched by ([[:upper:]]+|[[:lower:]]+). The parentheses are required: | has the lowest precedence of the ERE operators, so without them ^ would apply only to the first alternative and $ only to the second. A solution is therefore:

^([[:upper:]]+|[[:lower:]]+)$
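An anchored alternation of this kind is easy to check with grep -E. The sketch below compares a grouped form with an ungrouped one on a few invented sample lines; the variable names are hypothetical and chosen only for this illustration.

```shell
# Hypothetical test lines: only the first two are entirely one case.
samples='HELLO
hello
Hello
helloWORLD
hello1'

# Grouped: ^ and $ apply to the whole alternation.
grouped='^([[:upper:]]+|[[:lower:]]+)$'

# Ungrouped: ^ binds only to the lower-case alternative and
# $ only to the upper-case alternative.
ungrouped='^[[:lower:]]+|[[:upper:]]+$'

# Selects HELLO and hello.
with_group=$(printf '%s\n' "$samples" | grep -E "$grouped")

# Also selects helloWORLD and hello1, since each begins with a
# lower-case letter.
without_group=$(printf '%s\n' "$samples" | grep -E "$ungrouped")

printf 'grouped:\n%s\nungrouped:\n%s\n' "$with_group" "$without_group"
```

The ungrouped form accepts any line that begins with a lower-case letter or ends with an upper-case letter, which is why the parentheses matter when the anchors are meant to cover both alternatives.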


Copyright © 2002 Mike Joy, Stephen Jarvis and Michael Luck