

Subshells in Bash Scripts

Parallel Processing


The basic interface for entering commands on a Linux system is a shell. It is a process that enables you to enter a command directly, or to specify a file ("script") that contains a sequence of commands to be executed. Shells are organized hierarchically: any shell can create a new shell, and the new shell is considered a child process of the creating (parent) shell.

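A child process inherits its environment from its parent, including the standard output and standard error streams, so by default the output of the child appears in the same place as the output of the parent. Note that a child running in the background is not necessarily terminated when its parent exits: a background subshell started by a script can keep running after the script has finished, unless the script explicitly waits for it.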

In a Bash shell script you create a subshell by enclosing commands in parentheses:

#!/bin/bash
echo "Before starting subshell"
(
   count=1
   while [ $count -le 99 ]
   do
       echo "$count"
       sleep 1
       (( count++ ))
   done
)
echo "Finished"

In the above example the while loop is enclosed in parentheses, which causes it to be executed in a subshell, that is, in a child process of the shell that executes the script file.

Unless you specify that the subshell is to be executed in the background, the parent shell waits for the subshell to finish before continuing with the rest of the script (the commands following the subshell expression).

This means that if you want to run subshells in parallel, you have to run them in the background, which is accomplished as usual with the "&" (ampersand) character following the subshell expression:

#!/bin/bash
echo "Before starting subshell"
(
   count=1
   while [ $count -le 99 ]
   do
       echo "$count"
       sleep 1
       (( count++ ))
   done
) &
echo "Finished"
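The difference between the two forms can be made visible by timing them. The following is a minimal sketch that relies on Bash's built-in SECONDS variable; the names fg_elapsed and bg_elapsed are chosen only for this illustration, and the final wait is there only so the script does not exit while the background subshell is still running.

```shell
#!/bin/bash
# Foreground subshell: the parent blocks until the subshell has finished.
SECONDS=0
( sleep 2 )
fg_elapsed=$SECONDS     # roughly 2, because the parent had to wait

# Background subshell: the parent continues immediately.
SECONDS=0
( sleep 2 ) &
bg_elapsed=$SECONDS     # roughly 0, because the parent did not wait

# Let the background subshell finish before the script ends.
wait

echo "foreground subshell: parent resumed after about $fg_elapsed seconds"
echo "background subshell: parent resumed after about $bg_elapsed seconds"
```

SECONDS counts whole seconds, so the reported values are approximate.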

So if you create multiple subshells as background processes, you can run tasks in parallel. The scheduler of the operating system decides where each process runs: when enough processors (cores) are free, the processes can run on different cores at the same time. When there are more runnable processes than cores, several processes share a core, and the core continuously switches between them until they are completed. For example:

#!/bin/bash
echo "Before starting subshell"
(
   count=1
   while [ $count -le 99 ]
   do
       echo "$count"
       sleep 1
       (( count++ ))
   done
) &
(
   count=1000
   while [ $count -le 1099 ]
   do
       echo "$count"
       sleep 1
       (( count++ ))
   done
) &
echo "Finished"

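Now we have two subprocesses. The first one counts from 1 to 99, and the second one from 1000 to 1099. Because both run in the background, their output is interleaved, and the parent prints "Finished" right away without waiting for either of them.

You can use the wait command (a shell builtin) to tell the parent process to wait for the subprocesses to finish before proceeding with the rest of the script. Called without arguments, it waits for all background child processes: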

#!/bin/bash
echo "Before starting subshell"
(
   count=1
   while [ $count -le 99 ]
   do
       echo "$count"
       sleep 1
       (( count++ ))
   done
) &
(
   count=1000
   while [ $count -le 1099 ]
   do
       echo "$count"
       sleep 1
       (( count++ ))
   done
) &
wait
echo "Finished"
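When it matters whether each task succeeded, wait can also be given a process ID, in which case it returns the exit status of that process. The sketch below assumes two placeholder tasks (a sleep followed by a fixed exit code); the variable names are illustrative only. The ID of the most recently started background process is available in $!.

```shell
#!/bin/bash
# Start two background subshells and remember their process IDs.
( sleep 1; exit 0 ) &
pid_ok=$!
( sleep 1; exit 3 ) &
pid_fail=$!

# "wait PID" returns the exit status of that process. The "||" form records
# a non-zero status without aborting a script that runs with "set -e".
status_ok=0
wait "$pid_ok" || status_ok=$?
status_fail=0
wait "$pid_fail" || status_fail=$?

echo "first subshell exited with status $status_ok"
echo "second subshell exited with status $status_fail"
```

Both subshells still run in parallel; only the collection of the results is sequential.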

Subshells are also useful when commands need to be executed in a particular environment (variable settings) or directory. If each command is executed in a different subshell, there is no risk that variable settings will be mixed up, and on completion neither the settings nor the current directory need to be restored, because the environment of the parent process is not affected by any of its subprocesses.
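A small sketch of this isolation, using a hypothetical variable named greeting and the root directory as an arbitrary target for cd:

```shell
#!/bin/bash
greeting="hello"
start_dir=$PWD

(
    # Both changes below affect only the subshell.
    cd /
    greeting="changed in subshell"
    echo "inside:  greeting=$greeting, directory=$PWD"
)

# The parent still has its original variable value and working directory.
echo "outside: greeting=$greeting, directory=$PWD"
```

No cleanup code is needed after the closing parenthesis, which is what makes this pattern convenient for short, self-contained steps.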

A subshell can also be placed in a function definition, so that it can be executed multiple times, with different arguments if needed.
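One way to do this, shown here as a sketch rather than the only approach, is to write the function body in parentheses instead of braces, so that every call runs in its own subshell. The function name run_in_dir and its arguments are made up for this example.

```shell
#!/bin/bash
# Hypothetical helper: run a command in a given directory.
# The body is enclosed in ( ) rather than { }, so each call runs in a subshell.
run_in_dir() (
    dir=$1
    shift
    cd "$dir" || exit 1    # exit leaves only the subshell, not the script
    "$@"
)

start_dir=$PWD
run_in_dir / pwd
run_in_dir /tmp pwd

# The working directory of the caller is unchanged, and the variable "dir"
# set inside the function does not exist here.
echo "still in $PWD"
```

Because the calls are independent of each other, they could also be started in the background with "&" and collected with wait.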


©2014 About.com. All rights reserved.