TFM:Shell Scripts

From ProgSoc Wiki


Erotic Fantasy: /bin/sh Shell Scripts

Chris Keane, Murray Grant and Nicholas FitzRoy-Dale


/bin/sh. The Bourne Shell. The Bell Shell. The completely retarded pedantic bastard shell from hell[1]. Call it what you will, no matter what UNIX system you're using, it's bound to be there. So here's a brief section on how you can program it in its own built-in scripting language[2].

Believe it or not, sh scripts are really really useful things. The basic idea is that a shell script is a plain ASCII file, containing a sequence of commands you'd ordinarily type in on the command line. However, if the list of commands needed to perform some function is quite long or complex, it makes sense to type them once in a script and run the script each time after that (which is also a hell of a lot faster than typing 1000 lines every time[3]).

In addition to commands that you would type in at the command prompt, there are other features, such as flow control statements, usually only found in high level languages. This facility makes it easy to perform New And Exciting And Powerful tasks.

Shell Programming and Debugging

Make sure you're running bash

Type "help" and press enter. If you're running bash you should get a long screen that begins by telling you you're running bash. If you get something else, start up bash by typing

$ bash

You can also make bash your login shell, if it isn't already; see elsewhere in this Manual.

Input a Program

A shell program is a straight ASCII file containing a series of shell commands, and is input by any means that one would input text into the UNIX system --- vi, cat, ed for normal people and emacs or nano for masochists.

Theoretically, a shell script should start with the magic incantation #!/bin/sh on the first line and hard up against the left margin. In practice, this is generally not required as most modern systems assume a program is a shell script if it is ASCII text and executable[4]. One thing to note is that you shouldn't have a hash (#) by itself in the first position of the first line, as this is an incantation to invoke the C-shell (csh) rather than the Hell Shell (sh).[5] Programming csh is not really discussed in this manual, for reasons given in "Getting to Know Your Shell".

Why /bin/sh instead of /bin/bash? Well, on UNIX, /bin/sh is guaranteed to exist, and for Linux systems it's basically bash[6]. However, on other systems, such as FreeBSD, /bin/sh actually is a completely separate and different shell. This will confuse the hell out of you and waste your time. Consider yourself warned.

To demonstrate a shell script, we'll input a program that displays the date and time. (Use vi, the ProgSoc editor of choice, to enter the code below. Then use the UNIX command cat to check everything was input correctly.)
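
A minimal sketch of what datescript might contain, assuming it does nothing more than call the date command (the comment line is an addition for readability):

```shell
#!/bin/sh
# datescript: print the current date and time
date
```

Save this under the name datescript and the rest of the section applies to it directly.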


Making the program executable

When a text editor creates a file in UNIX, its permissions are determined by your umask. With the usual umask, a new file is readable and writable, but not executable. To make your shell script executable, add execute permission to your script file using chmod.

foo$ ls -l datescript
-rw-r--r--    1    chris    15    Jan  4  15:20 datescript
foo$ chmod +x datescript
foo$ ls -l datescript
-rwxr-xr-x    1    chris    15    Jan  4  15:20 datescript

Running the program

Once the script has been successfully entered and made executable, it can be run. There are several ways to execute a shell script. The first and easiest is to type in its name. (If the current directory isn't in your PATH, type ./datescript instead.)

foo$ datescript
Mon Jan  4 15:21:56 EST 2011

An alternative is to run the script through /bin/sh as a direct interpreter, either by naming the file as an argument or by feeding it to sh on standard input.

foo$ sh datescript
Mon Jan  4 15:21:56 EST 2011


foo$ sh < datescript
Mon Jan  4 15:21:56 EST 2011

As a special case, you can run the script as part of this shell, rather than starting up a separate one. This is useful if your script sets environment variables, and you want them to be available to the shell you're using. Type:

foo$ . datescript
Mon Jan  4 15:21:56 EST 2011

Using sh has the same result, but datescript is treated as a simple ASCII file that sh reads, rather than a program in itself. This knowledge, combined with knowing that # is the "comment" symbol in bash (well I'm telling you now, OK?) might explain the #!/bin/sh thing - when it's reading in the file, sh just ignores the line, since it starts with a comment.

Debugging shell programs

/bin/sh provides facilities to trace the execution of shell scripts. These are the -x and -v flags to the /bin/sh program. Using the earlier example of datescript, we can apply the -x option to examine the steps passed through by the executing program.

foo$ cat datescript
#!/bin/sh
date
foo$ sh -x datescript
+ date
Mon Jan  4 15:21:56 EST 2011

Before each command in the script is executed, it is printed with a preceding + symbol. The program code below (a program called datescript2) is executed as an example:

foo$ cat datescript2
date
echo foobie doobie
who | grep chris
echo `hostname`
foo$ datescript2
Mon Jan  4 15:22:30 EST 2011
foobie doobie
chris    console Jan  4 15:19
chris    ttyp0   Jan  4 15:20
chris    ttyp1   Jan  4 15:20
chris    ttyp2   Jan  4 15:20
foo
foo$ sh -x datescript2
+ date
Mon Jan  4 15:22:30 EST 2011
+ echo foobie doobie
foobie doobie
+ who | grep chris
chris    console Jan  4 15:19
chris    ttyp0   Jan  4 15:20
chris    ttyp1   Jan  4 15:20
chris    ttyp2   Jan  4 15:20
+ echo foo
foo

The last echo command comes out as "+ echo foo" rather than "+ echo `hostname`" because `hostname` is enclosed in backquotes, which have a special meaning to the shell and are explained a little further on.

If the -v option was used instead of the -x option, the debugging output from the execution would not be rewritten. Each line is printed exactly as it was read from the script, without the leading + symbol, so in the above example "echo `hostname`" would have appeared instead of "+ echo foo".
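
To see the difference for yourself, here is a small sketch that runs the same throwaway script under both flags. The file name trace_me and its contents are made up for illustration; the exact trace layout varies a little between shells.

```shell
# Build a two-line throwaway script (the file name is arbitrary).
dir=$(mktemp -d)
printf 'WHO=`echo chris`\necho hello $WHO\n' > "$dir/trace_me"

# -x: each command is shown after rewriting, prefixed with +.
xtrace=$(sh -x "$dir/trace_me" 2>&1 >/dev/null)

# -v: each line is shown as it was read from the file.
vtrace=$(sh -v "$dir/trace_me" 2>&1 >/dev/null)

echo "--- sh -x ---"; echo "$xtrace"
echo "--- sh -v ---"; echo "$vtrace"
rm -r "$dir"
```

Both traces go to stderr, which is why the sketch swaps stderr and stdout around before capturing them.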

Standard I/O And Redirection

Streams of Characters

Under UNIX all files, whether ASCII or binary, are treated as simple streams of bytes. There is no structure imposed on the stream other than that imposed by application programs reading data. As a result, all files may generally be treated the same for file redirection, etc.

File redirection

Because files are simple streams of characters, it becomes a trivial matter to read their contents into programs or for programs to write information to files. Overall, there are three facilities for information interchange. These are called "standard in" (stdin), "standard out" (stdout) and "standard error" (stderr). In addition, other files may be opened and closed by the shell script.


stdin

Under normal circumstances, stdin is the keyboard. This is generally how a shell script receives data input from the user and/or a file. Stdin can be redirected in two main ways:

< (arrowhead in) usage:

cmd < file

The arrowhead-in (less than) symbol used in the above fashion causes the contents of file to be input to the command cmd as if they had been typed on the keyboard. Normal keyboard activity is ignored.

<< (arrowhead up-'til) usage:

cmd << STRING

The arrowhead up-'til (double less than) symbol diverts stdin in a similar fashion to arrowhead-in, except that the input is taken from the lines that follow the command. Input finishes when a line containing only STRING is reached; this can be any arbitrary string of characters with meaning to you. e.g.

foo$ mail chris@foo << EOF
Hi there chris!
Foobie Doobie!
EOF
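
One detail worth trying out: the shell rewrites variables inside the diverted text unless the terminating string is quoted. A minimal sketch, using cat as a stand-in for mail and a made-up variable WHO:

```shell
WHO=chris

# Unquoted terminator: $WHO is rewritten inside the text.
expanded=$(cat << EOF
Hi there $WHO!
EOF
)

# Quoted terminator: the text is passed through untouched.
literal=$(cat << 'EOF'
Hi there $WHO!
EOF
)

echo "$expanded"
echo "$literal"
```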


stdout

Usually, stdout is the screen to which output is printed. Similar to stdin, however, there are a number of ways to redirect this output to files etc.

> (arrowhead out) usage:

cmd > file

All stdout output from the execution of cmd is written to a new file called file. If file exists, it is overwritten with the output of cmd.

>> (arrowhead append) usage:

cmd >> file

Similar to "arrowhead out", arrowhead append causes all output from the execution of cmd to be written to a file called file. However, if this file already exists, the output is appended to the file rather than overwriting it.


stderr

Programs with internal errors will generally attempt to write error messages to stderr. Like stdout, this usually appears on the screen, but it is separate from stdout so error messages can be seen even if stdout is redirected. Alternatively, stderr can be redirected to the same file as stdout, or another file again.

2> (file arrowhead out) usage:

cmd 2> errfile

2>&1 (arrowhead ampersand) usage:

cmd > file 2>&1

To understand this, you need to know that internally /bin/sh has special file descriptors for stdin, stdout and stderr. These are 0, 1 and 2 respectively. The "file arrowhead out" symbol places any output from the given descriptor---in this case stderr, descriptor 2---into the named file. Note that

cmd 2>> errfile

is also legal if you wish to append to a file.

The "arrowhead ampersand" symbol assigns to file descriptor 2 (stderr) the same output file that file descriptor 1 (stdout) has at that moment, which is why it must come after the "> file" on the command line.
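
A small sketch of both forms in action. The function noisy is a made-up stand-in for any command that writes to both streams, and the file names are arbitrary.

```shell
# A hypothetical command that writes one line to each stream.
noisy() {
    echo "to stdout"
    echo "to stderr" 1>&2
}

dir=$(mktemp -d)

# Separate files for stdout and stderr.
noisy > "$dir/out" 2> "$dir/err"

# Both streams into the same file.
noisy > "$dir/both" 2>&1

out=$(cat "$dir/out")
err=$(cat "$dir/err")
both=$(cat "$dir/both")
rm -r "$dir"

echo "out:  $out"
echo "err:  $err"
echo "both: $both"
```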

other files

As mentioned above, stdin, stdout and stderr are assigned file descriptors 0, 1 and 2 respectively. On a typical UNIX system, individual processes can hold 20 files open simultaneously[7]. This leaves descriptors 3-19 available for use on an ad hoc basis. Files may be opened with the exec command. Usage:

exec 3> file3
exec 4< file4

and then used as described in the `stderr' section above. e.g:

cmd 1>&3            To output to file3
cmd 0<&4            To input from file4
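
Put together, a session with extra descriptors might look like the following sketch. The file names follow the usage lines above but live in a scratch directory; closing the descriptors with >&- and <&- at the end is an addition, not something the text above covers.

```shell
dir=$(mktemp -d)
printf 'first line\nsecond line\n' > "$dir/file4"

exec 3> "$dir/file3"        # open descriptor 3 for writing
exec 4< "$dir/file4"        # open descriptor 4 for reading

echo "written via descriptor 3" 1>&3
read LINE1 0<&4             # each read picks up where the last one stopped
read LINE2 0<&4

exec 3>&-                   # close both descriptors again
exec 4<&-

copy=$(cat "$dir/file3")
rm -r "$dir"
echo "$LINE1 / $LINE2 / $copy"
```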


pipes

The concept of pipes is similar to file redirection. But instead of the output of one command being placed in a file, it is connected to the input of another process (which may in turn have its output connected to another process). E.g. ls | sort takes output from ls and feeds it to sort. Result: a sorted directory list. It's almost like using a temporary file - ls >listing; sort <listing; rm listing - except that it doesn't use any disk space, since the intermediate data is never stored anywhere permanent.

Pipes are discussed in the UNIX section of this manual.
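
The equivalence between a pipe and a temporary file can be checked with a sketch like this one, which uses a small made-up word list instead of ls so that the result is predictable:

```shell
dir=$(mktemp -d)
printf 'pear\napple\nfig\n' > "$dir/unsorted"

# The long way round, via a temporary file.
cat "$dir/unsorted" > "$dir/listing"
via_file=$(sort < "$dir/listing")
rm "$dir/listing"

# The same job with a pipe and no intermediate file.
via_pipe=$(cat "$dir/unsorted" | sort)

rm -r "$dir"
echo "$via_pipe"
```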

Devices are files

One concept that may seem strange is that devices under UNIX are also treated as ordinary files at the shell level. As a result, you can use standard redirection techniques such as those described above to read from and write to devices. Since quite a lot of this type of behaviour is likely considered quite naughty, and will probably get you a good solid spanking, this topic won't be further expounded on. So don't blame me when you get caught sending banners to other logged on users. Whoops.

Useful UNIX Commands

When writing shell scripts, it's useful to know what's in your toolkit. This section gives a brief description of the utilities that get the biggest workout in shell scripts. Each of the commands mentioned here has a manual page entry---read it. (man command-name.)


awk

Awk is big and powerful. It does lots of things. No-one really understands it, including (it is rumoured) the authors. However, for those initiated into its arcane ways, awk is a batch spreadsheet, a programming language, a calculator, a string match and replacer and other things that are more rarely used. There are whole manuals devoted to awk. Read one.


sed

A Stream EDitor. Sed has many features, but perhaps the most useful is the Vi-like "s/something/something else/" command that does text replacement based on regular expressions. Sed works on a line-by-line basis, and can do some of the things that you might otherwise use Awk for if you can't be bothered learning Awk.
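
A quick sketch of the substitute command, using a made-up line of text. Without the trailing g only the first match on each line is replaced.

```shell
TEXT="old text, old habits"

first_only=`echo "$TEXT" | sed 's/old/new/'`     # first match on the line
every_one=`echo "$TEXT" | sed 's/old/new/g'`     # g = every match on the line

echo "$first_only"
echo "$every_one"
```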


bc

Bench calculator. bc is a really fabbo program that provides maths support in the shell. There's another maths program called expr, but it doesn't support floating point maths. bc can be used in interactive mode or as part of a script. For interactive,

foo$ bc
2 + 3
5

for batch processing,

foo$ echo '2 + 3' | bc
5

The result of the expression is written to stdout, and so can therefore be redirected, assigned to a variable or piped to another application.


cat

Concatenate files. cat takes command line arguments of one or several files and prints them onto stdout one after another (thus concatenating them). If there are no files specified on the command line, cat reads stdin and writes it to stdout.

foo$ cat file1
Hi! I'm file1!
foo% cat file2
Hi! I'm file2!
foo% cat file1 file2
Hi! I'm file1!
Hi! I'm file2!
foo% cat < file1
Hi! I'm file1!


date

Prints the current date and time, taken from the system. There are quite a lot of ways to format the date. The default format is along the lines of "Mon Jan 4 18:32:46 EST 2011". However, this may be modified with command line arguments (e.g. date +%y%m%d results in "110104").


diff

Show every line that differs between two files. No output from diff tells you that the two files specified on the command line are identical. A companion program, patch, can apply the output of "diff" to a file. This is particularly useful for updating versions of programs, because the differences between two versions are usually a lot smaller than the complete new version, and so can be transmitted more quickly. diff supports a number of different formats; the default is the traditional "normal" format, which I find difficult to read. I prefer unified diff, which can be obtained using diff -u.


echo

Echo arguments to stdout. echo simply prints whatever is on its command line to stdout. There are a number of options - the most used is the -n option that doesn't put a new-line at the end of the output, so the next output goes on the same line. However, if leading or trailing white-space is required, the required output must be enclosed in quotes or the white-space will be discarded.

foo$ echo hello
hello
foo$ cat > /tmp/qwerty
echo -n "hello "
echo there
foo$ chmod +x /tmp/qwerty
foo$ /tmp/qwerty
hello there


grep

No-one knows what grep stands for. Some theorise global search and replace.[8] grep searches for instances of strings in one or more files (if files are specified on the command line) or from stdin (if no files are specified). e.g.

foo$ grep tree /usr/dict/words

Protecting and Processing

Quoting and Rewriting

In /bin/sh, there is a small subset of characters that are "reserved", that is, they have special meaning to the shell. For example, >, <, | etc are reserved for file redirection and pipes, and *, ?, [, -, ] for pattern matching. This makes it difficult to use these characters literally on the command line. For example:

foo$ echo old -> new
foo$ ls -l new
-rw-r--r--    1    chris    6  Jan 28 16:08 new
foo$ cat new
old -

The use of echo in the above fashion creates a new file and writes the string "old -" into it---the ">" character opens the file as stdout.

There are a number of ways to protect reserved characters so you get what you really want.

\ Bash (Back slash). Bash protects the character immediately following it from being executed by the shell.

foo$ echo old -\> new
old -> new

So what if you want to print out a bash? Putting it on the command line results in it protecting the following character, rather than being printed itself. The answer is to protect the bash with another bash.

foo$ echo old =\= new
old == new
foo$ echo old =\\= new
old =\= new

" Quotes (double quotes). Double quotes are used around strings containing any reserved character except back-quote (see below) and the variable indicator "$".

foo$ echo "old -> new"
old -> new
foo$ ls "*"
* not found

' Quote (forward quote). The single quote provides the greatest protection against rewriting and reserved characters. All reserved characters, including double quotes, are protected by single quotes.

` Back Quote (backwards quote). The backquote is a rather special symbol. In /bin/sh, unless the backquote is protected by one of the valid means above, the contents of the string contained between the backquotes will be executed and the output from that execution written in place of the backquoted string.

foo$ echo "You are currently logged into host `hostname`"
You are currently logged into host foo
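
The three kinds of protection can be compared side by side. A minimal sketch, with a made-up variable NAME and echo Monday standing in for any backquoted command:

```shell
NAME=Arthur

# Double quotes: variables and backquotes are still rewritten.
double="Hello $NAME, today is `echo Monday`"

# Single quotes: nothing at all is rewritten.
single='Hello $NAME, today is `echo Monday`'

# A backslash protects just the one character that follows it.
escaped="Hello \$NAME"

echo "$double"
echo "$single"
echo "$escaped"
```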

Shell Variables

Positional Parameters

Most UNIX commands take command line arguments to modify their behaviour. For example, supplying the -l argument to the ls command results in a "long" listing of files in the current directory.

In /bin/sh, command line arguments are passed to shell script as shell variables. Shell variables are similar to variables in other higher-level languages, but do not need to be declared before use.

From inside the shell script, the command line arguments may be referenced by a number corresponding to their position on the command line. For example, the first command line argument would be $1, the second $2 etc.

foo$ cat >greet
echo hello $1
foo$ . greet Chris
hello Chris

The Shift Command

With positional parameters described above, it is possible to "shift" them down so that parameters already recognised and processed may be discarded and the next parameter takes its place, for example, $1 is discarded, $2 becomes $1, $3 becomes $2 etc.

foo$ cat greet
echo hello $1
shift
echo hello $1
foo$ greet Chris Kylie
hello Chris
hello Kylie

The shift command is particularly useful for looping commands to process the entire command line painlessly.
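
As a sketch of that looping idea: the function greet_all below is a made-up name, and it borrows the while loop and functions that are only explained later in this chapter. $# is the number of arguments still left.

```shell
# Greet every argument in turn, shifting each one away once it is used.
greet_all() {
    while [ $# -gt 0 ]
    do
        echo "hello $1"
        shift
    done
}

greet_all Chris Kylie
```

Each pass through the loop deals with $1 and then discards it, so the loop ends when no arguments remain.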

Using shell variables

Shell variables are labels for string storage in /bin/sh. There is no distinction between variables that contain a single letter, multiple letters, or a numeric value. In /bin/sh, all variables are treated as containing zero or more characters.

Before a value is assigned to a variable, it exists but contains zero characters. Variables may be introduced and used at any stage of the shell script and there is no need (there is no facility) to declare them before use. Traditionally, shell variable names are in all upper case, but there is no technical reason except readability that this is the case.

Wherever they appear in a shell script, an instance of a variable name is replaced by that variable's contents. If a variable contains an executable command, that command will be run (if the variable name is in a position that makes this possible).

foo$ cat >greet
HELLO="Hello $1"
echo $HELLO
HELLO="$HELLO, $2"
echo $HELLO
foo$ . greet Chris Kylie
Hello Chris
Hello Chris, Kylie

The Export Command

Out there in the great ether of things hanging around your account is the "environment". You can look at what's in your environment with the env command. You can add things to your environment using the export command from within your script.

Your environment is passed as a series of shell variables to all programs you run, that is, to all the child processes of the current process. It is impossible, however for a child process to change the environment of its parent process. If you export a variable from within your shell script, it is passed to all programs that you run from that shell script, but is not passed back to the process that originally ran the script. This is why you need to use the "dot-trick" (. greet instead of sh greet) discussed above to run your scripts as part of the current shell if you wish to change your environment.
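
A minimal sketch of these rules, using two made-up variable names and sh -c to play the part of a child process:

```shell
LOCAL_ONLY=one
EXPORTED=two
export EXPORTED

# A child shell sees only the exported variable.
child_view=$(sh -c 'echo "[$LOCAL_ONLY] [$EXPORTED]"')

# Changes made in a child never come back to the parent.
sh -c 'EXPORTED=changed'

echo "$child_view"
echo "$EXPORTED"
```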

Variables Set Automatically

These shell variables are automagically set by /bin/sh when a shell script is executed.

$* All command line arguments

foo$ cat >argv
echo $*
foo$ argv 1 2 3 4
1 2 3 4

$# The number of arguments on the command line.

foo$ cat >argc
echo $#
foo$ argc 1 2 3 4
4

$? Return status. When a process finishes execution, it normally returns a value indicating the success or failure of its function. Zero is normally used for success, any other number for failure.

$$ Process ID. The UNIX process ID for the current process. This is ideal for generating a temporary file, since no two processes have the same process ID at the same time. If you're working with temp files that might outlive the process that created them, you might want to have a look at the mktemp command instead.

foo$ cat mktmpf
echo $TMP
foo$ mktmpf

$- Options. Carries a list of shell and set options that are current.

$! Background ID. The variable contains the process ID of the last background process run by the current shell.

More About Variables

Because variables aren't declared, it's often difficult for /bin/sh to understand what you're really talking about. In addition, it is possible to provide alternate actions depending on whether a variable is set or not.

${VAR} This is an alternate way of referencing variables that clearly defines where the name of the variable begins and ends. Whereas

foo$ echo $HOMEiswheretheheartis

is horribly confusing for both humans and /bin/sh (which would look for a variable called HOMEiswheretheheartis),

foo$ echo ${HOME}iswheretheheartis

is much more intelligible.[9]

${VAR:-string} If ${VAR} has not been set (or is empty), return the value of string, but do not set ${VAR}.

${VAR:=string} If ${VAR} has not been set (or is empty), return the value of string, and set ${VAR} to contain the value of string.

${VAR:?string} If ${VAR} has not been set (or is empty), print the value of string as an error message and stop.

foo$ cat >thingy
echo ${HOME:?"No HOME directory set"}
foo$ . thingy
sh: HOME: No HOME directory set
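
The difference between :- and := shows up when the same variable is used twice. A small sketch, with a made-up variable COLOUR:

```shell
unset COLOUR

first=${COLOUR:-red}            # red, and COLOUR stays unset
check=${COLOUR:-still unset}    # so the default is used again here
second=${COLOUR:=blue}          # blue, and COLOUR is now set to blue
third=${COLOUR:-red}            # blue, because COLOUR has a value now

echo "$first / $check / $second / $third"
```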

Variable Substitution Via Backquotes

As discussed previously, backquotes run the command they surround and rewrite the surrounded command with its output. This may be assigned to variables, as well.

foo$ cat >host
HOSTNAME="You are currently logged into `hostname`"
echo $HOSTNAME
foo$ . host
You are currently logged into foo

Variable Substitution from Keyboard

On occasion, it is necessary to read input from the keyboard or stdin into a shell variable. This can be done with the use of the read command, which is built into /bin/sh.

foo$ cat >prompt
echo -n "What is your name? "
read NAME
echo Hello, $NAME
foo$ . prompt
What is your name? Arthur, King of the Britons
Hello, Arthur, King of the Britons

The read command will assign the string that is read from stdin to the variable given on its command line. If this works, read returns zero (success) status. If some fault occurs---the user types CTRL-D, or the end of file is reached if stdin is being redirected from a file---then read returns non-zero status.
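
Both outcomes can be seen without touching the keyboard by redirecting stdin. In this sketch the variable names are made up, and /dev/null stands in for a file that has already run out of input:

```shell
# A successful read: one line is available on stdin.
read NAME << EOF
Arthur, King of the Britons
EOF
got_line=$?

# A failed read: /dev/null is at end of file straight away.
got_eof=0
read NOTHING < /dev/null || got_eof=$?

echo "NAME=$NAME status=$got_line"
echo "status at end of file=$got_eof"
```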

Control Flow

/bin/sh has some flow control constructs built in, similar to high level languages, but slightly different to take account of the string nature of /bin/sh variables and the return status of programs. There are four available constructs: if, while, for and case.

if list1 then list2 elif list3 else list4 fi

The if construct executes the single command in list1. If the command returns a zero value (i.e. success), the commands in list2 are executed in sequence. If list1 returned a non-zero value, the first element in list3 (if it exists) is executed; if this element returns a zero value, the remaining list3 commands are executed. If neither list1 nor the first element of list3 returns a zero status, list4 is executed if it exists. In a real script each test is followed by the keyword then, either on a new line or after a semicolon.

A very useful program to use as part of list1 is the test or [ program. Test performs a large number of functions such as string matching and file existence testing. man test gives you the full run-down.

foo$ cat >query
echo -n "What is your name? "
read NAME
if [ "$NAME" = "Arthur" ]
then
    echo "Hello!"
else
    echo "Who the hell are you?"
fi
foo$ . query
What is your name? Arthur
Hello!
foo$ query
What is your name? Sir Galahad
Who the hell are you?
foo$ cat >s
echo -n "Hostname to ssh to? "
read HOST
if [ "`hostname`" = "$HOST" ]
then
    echo "You are already logged into $HOST!"
    echo "Swine!"
else
    echo "ssh to $HOST"
    ssh $HOST
fi
foo$ . s
Hostname to ssh to? foo
You are already logged into foo!
Swine!
foo$ . s
Hostname to ssh to? biggles
ssh to biggles
wzdd@biggles's password:

In the above example, the [ is actually a program that lives in /usr/bin (most shells also have it built in). It has the same functionality as the test program, but looks a lot nicer. Because it is a program, it requires the same spacing as would normally be expected.

The arguments to the [ program are "$NAME", =, "Arthur" and ]. $NAME is surrounded by double quotes in case the string that $NAME represents contains a space character, which would severely stuff up the command line arguments.
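
A short sketch of a few other tests, wrapped in a made-up function called classify. -z is true for an empty string, -d for a directory and -f for an ordinary file; man test lists the rest.

```shell
classify() {
    if [ -z "$1" ]
    then
        echo "empty"
    elif [ -d "$1" ]
    then
        echo "directory"
    elif [ -f "$1" ]
    then
        echo "file"
    else
        echo "something else"
    fi
}

classify ""
classify /
classify "no such thing"
```

The last call shows why the quotes around $1 matter: without them, the three words would reach [ as three separate arguments.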

while list1 do list2 done

while will continue to execute both lists of commands until list1 fails. This is useful with the read command. read returns a zero value and an initialized shell variable upon success, and a non-zero value upon CTRL-D or end of file.

foo$ cat >foobie
while read FOOBIE
do
   echo $FOOBIE
done
foo$ . foobie
Repeat after me
Repeat after me
Stop it!
Stop it!

There is a rarely used construct similar to while --- the until construct. The difference is that both list1 and list2 are executed until list1 succeeds.
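
For the curious, a minimal sketch of until. The counter COUNT is made up, and expr does the adding:

```shell
COUNT=0
until [ "$COUNT" -ge 3 ]
do
    COUNT=`expr $COUNT + 1`
    echo "pass number $COUNT"
done
```

The loop body runs for as long as the test fails, so this stops as soon as COUNT reaches 3.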

for VAR in word-list do command-list done

Unlike most high level languages, the for loop in /bin/sh does not increment a counter or whatever, but loops around placing each element of word-list into the shell variable VAR. Word-list may include normal shell wildcard patterns (such as *.out), which are expanded to the matching file names.

foo$ cat >thingy
for THINGY in thingy foobie wookly dwang
do
    echo "for $THINGY"
done
foo$ . thingy
for thingy
for foobie
for wookly
for dwang
foo$ cat >doobie
# Changes all files in the current directory
# ending with .out to .foo
TALLY=0
# Start the for loop, substitute .out on the end of files
# with nothing when it gets stuffed into $FILE
for FILE in `ls *.out | sed 's/\.out$//'`
do
    mv ${FILE}.out ${FILE}.foo
    echo moved ${FILE}.out to ${FILE}.foo
    TALLY="`echo $TALLY + 1 | bc`"
done
echo $TALLY files renamed.
foo$ . doobie
moved goo.out to goo.foo
moved swine.out to swine.foo
moved thing.out to thing.foo
3 files renamed.

case word in pattern1) list1;; pattern2) list2;; ... esac

The case construct matches the supplied word against the list of possible patterns. It proceeds in order through the list of patterns supplied and executes the list associated with the first matching pattern. The patterns/lists are separated by a double semi-colon. The pattern may be a literal string which must match the supplied word exactly, or it may be a wildcard that will match a range of words.

foo$ cat >args
while [ -n "$1" ]
do
    case $1 in
    -a)     OPTA=true ;;
    -b)     shift; BFILE=$1 ;;
    -c|-d)  OPTCD=true ;;
    *)      echo Option $1 unknown 1>&2
            exit 1 ;;
    esac
    shift
done

args is a program that accepts the command line arguments -a, -b, -c and -d. If argument -b is supplied, an additional argument ($BFILE) is expected.
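
Wildcard patterns in case work the same way as file name patterns. A small sketch, wrapped in a made-up function kind_of so it can be tried on several words:

```shell
kind_of() {
    case $1 in
    *.c|*.h)  echo "C source" ;;            # either pattern will do
    [0-9]*)   echo "starts with a digit" ;;
    README)   echo "exactly README" ;;      # literal match only
    *)        echo "unknown" ;;             # catches everything else
    esac
}

kind_of main.c
kind_of 9lives
kind_of README
kind_of readme
```

Only the first matching pattern is used, which is why the catch-all * goes last.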

Built-in Commands

/bin/sh has a number of built-in commands, some of which have been discussed previously. Other interesting ones may be looked up in the manual page for /bin/sh. (man sh.)

break, continue, cd, eval, exit, export, read, readonly, 
set, shift, test, trap, wait


Functions

Functions are an extremely useful part of shell scripts. They are similar to functions in any other language, in that they take arguments and can return a value. But just to make things that much more exciting, they do so with a complete set of sh subtleties and inconsistencies.

Declaration and Basic Use

A function is declared in an sh script like so:

function_name() {
  commands
  more commands
  yet more commands
}

And is called like this:

function_name
So far, it's nice and easy! A function becomes something like another command usable in an sh script. Remember that sh's idea of local and global variables is not the same as C's. An sh variable is accessible anywhere in the script. Even if it's declared outside a function, it can still be read and changed within the function.

Arguments and Return Values

Function arguments are not very obvious in their use from the man pages. Imagine each function is running within its own subshell (although they don't on my FreeBSD system). Function arguments are handled in the exact same way command line arguments are. $1 through to $9 all work, as does $*, but $0 still refers to the name of the shell script, not the name of the function. Return values are really simple, just return <number>. Note that you can only return numbers greater than or equal to zero (and no larger than 255). An example will demonstrate better:

# Example program for demonstrating functions

exit_func() {
  echo "Leaving $0 with return value $1"
  exit $1
}

print_cmd_line() {
  LINE="$*"
  echo "Command line is: $LINE"
}

while [ "$1" != "" ]; do
  print_cmd_line $*
  shift
done

exit_func 0

bash$ . fred joe smith
Command line is: fred joe smith
Command line is: joe smith
Command line is: smith
Leaving with return value 0
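
Because a return value is just an exit status, a function can be used directly as the test in an if. A small sketch with two made-up functions, is_number and describe:

```shell
# is_number: returns 0 (success) if its first argument is all digits.
is_number() {
    case $1 in
    ""|*[!0-9]*) return 1 ;;    # empty, or contains a non-digit
    *)           return 0 ;;
    esac
}

describe() {
    if is_number "$1"
    then
        echo "$1 is a number"
    else
        echo "$1 is not a number"
    fi
}

describe 42
describe fred
```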


Recursion

Recursion within sh functions is possible. The man page states there is no limit to the depth of recursion bar the resources of the host machine. The following test I did on a Pentium MMX 200 managed to allocate 80MB of memory (about 65 of it swap) in 15 seconds:

foo() {
  foo
}

foo


Some of the shell script example programs used in this chapter are based on similar examples in [MM86].

  1. Rather like houseplants, computer programs perform better if you talk earnestly to them.
  2. Why you would want to do this is left as an exercise, rather like why the author decided to call this chapter "Erotic Fantasy". Actually, perhaps you're better off not thinking about that.
  3. Let me know if you're writing 1000 line shell scripts - you need, aahh, re-education.
  4. That is, has the execute permission set.
  5. UNIX is full of intuitive things like this.
  6. Actually, bash runs in a special "sh-compatible" mode when started this way, which I would love to discuss but can't because my brain is hurting already.
  7. This is, of course, configurable, but it's a safe number to work with.
  8. Obviously people who can't spell --- Ed.
  9. Trust us.