Array Jobs

One common use of computational clusters is for parametric sweeps. This involves running many instances of the same application but each with different input data. Manually creating and managing large numbers of such jobs would be quite tedious. However, Torque supports the concept of array jobs which greatly simplifies the process.

An array job is a single job script that spawns many almost identical sub-jobs. The only difference between the sub-jobs is the value of the environment variable PBS_ARRAYID, which uniquely identifies each sub-job. A regular job becomes an array job when it uses the -t flag to specify the required range of values for PBS_ARRAYID. For example, the following script will spawn 100 sub-jobs. Each sub-job will require one CPU core, 1GB of memory and 1 hour of run-time, and each will execute the same application. However, a different input file is passed to the application in each sub-job: the first sub-job reads its input data from a file called 1.dat, the second from a file called 2.dat, and so on.

#!/bin/bash
 
#PBS -l nodes=1:ppn=1,vmem=1gb
#PBS -l walltime=1:00:00
#PBS -j oe
#PBS -t 1-100
 
cd ${PBS_O_WORKDIR}
 
./myprogram ${PBS_ARRAYID}.dat
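To launch all of the sub-jobs, submit the script once with qsub in the usual way (here we assume it has been saved as array.pbs; the name is only illustrative). Under Torque, qstat -t expands the array so that each sub-job is listed individually:

qsub array.pbs
qstat -t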

Note: if you use an array job to start more than one copy of a program then, depending on the application, you may run into problems as multiple nearly identical sub-jobs start all at once. If this occurs you can stagger the start-up by adding a random wait immediately before the line where the application is launched:

sleep $((RANDOM % 240 + 1))

If you have 2 independent parameters that you want to cycle through then there are a number of different ways to do it. The simplest is to create an array job over one parameter and use the bash command line to submit one copy of it for each value of the other. For example, if you have data files red_1, ..., red_12, green_1, ..., green_12, blue_1, ..., blue_12 and yellow_1, ..., yellow_12:

for MY_VAR in red green blue yellow; do export MY_VAR; qsub array.pbs; done

where the following job script is saved as array.pbs. To make the exported variable MY_VAR visible inside each sub-job we have added the line

#PBS -v MY_VAR

to the job script below.

#!/bin/bash
 
# note: shell variables such as $MY_VAR are not expanded in #PBS
# directives, so they cannot be used in the job name
#PBS -N ARRAY4
#PBS -l nodes=1:ppn=1
#PBS -l vmem=1gb
#PBS -l walltime=1:00:00
#PBS -j oe
#PBS -v MY_VAR
 
#PBS -t 1-12
 
cd $HOME
 
./my_prog ${MY_VAR}_${PBS_ARRAYID}
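An alternative is to cover both parameters with a single array job and decode PBS_ARRAYID using division and modulo arithmetic, in the same way the grouped examples below split a range into blocks. The sketch below assumes the same my_prog and colour_index file naming as above; with 4 colours and 12 file numbers it needs one array job of 48 sub-jobs and no command-line loop.

#!/bin/bash

#PBS -N ARRAY2D
#PBS -l nodes=1:ppn=1
#PBS -l vmem=1gb
#PBS -l walltime=1:00:00
#PBS -j oe
#PBS -t 0-47

cd $PBS_O_WORKDIR

COLOURS=('red' 'green' 'blue' 'yellow')

# sub-jobs 0-11 are red, 12-23 green, 24-35 blue, 36-47 yellow
COLOUR=${COLOURS[$((PBS_ARRAYID / 12))]}
# file numbers within each colour run from 1 to 12
INDEX=$((PBS_ARRAYID % 12 + 1))

./my_prog ${COLOUR}_${INDEX}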

There are some more examples of array jobs, including how to group your computations within an array job, in the examples section below.

PBS Array Job Examples

If you want to run the same program with a collection of different input or data files then array jobs will make your life easier.

The first example will launch 100 jobs (each using one core, up to 1GB of memory and up to 1 hour of walltime) with input files 1.dat, 2.dat, ..., 100.dat. You will need to make sure that your program saves its output in a different location for each job.

#!/bin/bash
 
#PBS -N ARRAY
#PBS -l nodes=1:ppn=1
#PBS -l vmem=1gb
#PBS -l walltime=1:00:00
#PBS -j oe
 
#PBS -t 1-100
 
cd $PBS_O_WORKDIR
 
./my_prog ${PBS_ARRAYID}.dat

If, instead of using parameters, you modify your program each time you submit a job, the next example shows how you can use array jobs to run 100 different programs named my_prog_1, my_prog_2, ..., my_prog_100.

#!/bin/bash
 
#PBS -N ARRAY2
#PBS -l nodes=1:ppn=1
#PBS -l vmem=1gb
#PBS -l walltime=1:00:00
#PBS -j oe
 
#PBS -t 1-100
 
cd $PBS_O_WORKDIR
 
./my_prog_$PBS_ARRAYID

The next example shows how you can use a bash array to pass non-integer values through to your program.

#!/bin/bash
 
#PBS -N ARRAY3
#PBS -l nodes=1:ppn=1
#PBS -l vmem=1gb
#PBS -l walltime=1:00:00
#PBS -j oe
 
#PBS -t 0-3
 
cd $PBS_O_WORKDIR
 
PARAMS=('red' 'green' 'blue' 'yellow')
 
echo ${PARAMS[$PBS_ARRAYID]}
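In a real job you would pass the selected value to your program rather than just echoing it, for example by replacing the echo line with

./my_prog ${PARAMS[$PBS_ARRAYID]}

where my_prog is the same placeholder program used in the other examples.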

You can also use array jobs when you have multiple collections of programs and data files, each in its own directory, as in the following example using directories JOB_1, JOB_2, ..., JOB_12.

#!/bin/bash
 
#PBS -N ARRAY4
#PBS -l nodes=1:ppn=1
#PBS -l vmem=1gb
#PBS -l walltime=1:00:00
#PBS -j oe
 
#PBS -t 1-12
 
cd $PBS_O_WORKDIR
 
cd JOB_$PBS_ARRAYID
 
./my_prog

One problem that frequently occurs with array jobs is wanting to run, say, 250 computations where each one only takes 10 minutes. As we don't want to run jobs that take less than one hour, we can instead set up an array job that starts 25 separate sub-jobs, each of which completes 10 computations. With the job script below we still work through the numbers 1 to 250, but in groups of 10.

#!/bin/bash
 
#PBS -N GROUPED_ARRAY
#PBS -l nodes=1:ppn=1
#PBS -l vmem=4gb
#PBS -l walltime=12:00:00
#PBS -j oe
#PBS -t 1-25
 
cd $PBS_O_WORKDIR
 
 
sleep $((RANDOM % 240 + 1))
 
((START_NUMBER = ($PBS_ARRAYID - 1) * 10 + 1))
((END_NUMBER = $START_NUMBER + 9))

for key in $(seq $START_NUMBER $END_NUMBER); do ./my_prog $key; done
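If the total number of computations is not an exact multiple of the group size, the last sub-job will run past the end of the range. A minimal guard, assuming for illustration a total of 247 computations, is to cap END_NUMBER before the loop:

if (( END_NUMBER > 247 )); then END_NUMBER=247; fi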

If we have 1,000 compute tasks that we want to group in lots of 20, we can modify the script above to get the following job script.

#!/bin/bash
 
#PBS -N GROUPED_ARRAY
#PBS -l nodes=1:ppn=1
#PBS -l vmem=8gb
#PBS -l walltime=12:00:00
#PBS -j oe
#PBS -t 1-50
 
cd $PBS_O_WORKDIR
 
 
sleep $((RANDOM % 240 + 1))
 
((START_NUMBER = ($PBS_ARRAYID - 1) * 20 + 1))
((END_NUMBER = $START_NUMBER + 19))

for key in $(seq $START_NUMBER $END_NUMBER); do ./my_prog $key; done

You can also use the following method to achieve the same result. Here we have 8,000 different calculations that we want to run, so we split them into 100 sub-jobs, each of which does 80 calculations; with PBS_ARRAYID running from 0 to 99, INDEX takes every value from 0 to 7,999.

#!/bin/bash
 
#PBS -N GROUPED_ARRAY
#PBS -l nodes=1:ppn=1
#PBS -l vmem=8gb
#PBS -l walltime=12:00:00
#PBS -j oe
#PBS -t 0-99
 
cd $PBS_O_WORKDIR
 
MAX_TASKS=80
 
for ((TASK=0; TASK < MAX_TASKS; ++TASK)); do
       INDEX=$((PBS_ARRAYID*MAX_TASKS + TASK))
       ./hardsums.py data${INDEX}.csv
done


Matlab Array Job Examples

When you want to submit multiple jobs that are identical except for their input data, array jobs are the way to go.

The following example will run 20 copies of Matlab with the environment variable PBS_ARRAYID being set to the values 1 through 20.

#!/bin/bash
 
#PBS -N ARRAY4
#PBS -l nodes=1:ppn=1
#PBS -l vmem=8gb
#PBS -l walltime=10:00:00
#PBS -j oe
 
#PBS -t 1-20
 
module add matlab/2015b

cd $PBS_O_WORKDIR

matlab -nodisplay -nojvm -singleCompThread -r myprog

We can then put the following Matlab code in a file called myprog.m (which the -r myprog option above will run) and read the value of PBS_ARRAYID from inside Matlab.

% read this sub-job's index from the environment
x = str2num(getenv('PBS_ARRAYID'))
% save x to a file whose name includes the index, e.g. x7.txt
fname = sprintf('x%d.txt', x)
save(fname, 'x')
quit()
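For example, the sub-job with PBS_ARRAYID set to 7 will save its result in a file called x7.txt.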