UNSW - Science - HPC

Generic Katana PBS Examples

Here are some generic examples of PBS scripts which you can modify to meet your requirements. You may also find a job script example for the specific application that you are using.

Before submitting your first batch PBS job you may want to start an interactive session to confirm that your code works, that the correct modules are loaded and that you are referring to the right locations for your files. The following command gives you an interactive session with 1 core for up to 1 hour.

qsub -I -l nodes=1:ppn=1,vmem=8gb,walltime=1:00:00

Save your job script in a file called myjob.pbs and then submit it using the command qsub myjob.pbs.

Single Core Jobs

The following PBS script requests 1 CPU core on 1 node to run a program called my_prog that uses less than 1Gb of memory and runs for under 1 hour.

#!/bin/bash
 
#PBS -N BASIC
#PBS -l nodes=1:ppn=1
#PBS -l vmem=1gb
#PBS -l walltime=1:00:00
#PBS -j oe
 
cd $PBS_O_WORKDIR
 
./my_prog

The next PBS script also requests 1 CPU core on 1 node to run a program called my_prog that uses less than 1Gb of memory and runs for under 1 hour. The only difference is that this job script sends an email when the job finishes or aborts with an error.

#!/bin/bash
 
#PBS -N BASIC
#PBS -l nodes=1:ppn=1
#PBS -l vmem=1gb
#PBS -l walltime=1:00:00
#PBS -j oe
#PBS -M fred.bloggs@unsw.edu.au
#PBS -m ae
 
cd $PBS_O_WORKDIR
 
./my_prog

This script is the same except that the output and error information is saved in a file called /home/z1234567/results/Output_Report.

#!/bin/bash
 
#PBS -N BASIC
#PBS -l nodes=1:ppn=1
#PBS -l vmem=1gb
#PBS -l walltime=1:00:00
#PBS -j oe
#PBS -o /home/z1234567/results/Output_Report
#PBS -M fred.bloggs@unsw.edu.au
#PBS -m ae
 
cd $PBS_O_WORKDIR
 
./my_prog

Rather than joining the output and error streams, we can instead save each to its own file in the directory /home/z1234567/results.

#!/bin/bash
 
#PBS -N BASIC
#PBS -l nodes=1:ppn=1
#PBS -l vmem=1gb
#PBS -l walltime=1:00:00
#PBS -o /home/z1234567/results/Output_Report
#PBS -e /home/z1234567/results/Error_Report
#PBS -M fred.bloggs@unsw.edu.au
#PBS -m ae
 
cd $PBS_O_WORKDIR
 
./my_prog

The examples above assume that you load any modules you require in your .bashrc file. You can also load the required modules in the PBS script itself after clearing any currently loaded modules. This is useful if you normally use one version of a piece of software but would like to use a different version for a particular job.

#!/bin/bash
 
#PBS -N ENVIRONMENT
#PBS -l nodes=1:ppn=1
#PBS -l vmem=1gb
#PBS -l walltime=1:00:00
#PBS -j oe
#PBS -M fred.bloggs@unsw.edu.au
#PBS -m ae
 
cd $PBS_O_WORKDIR
 
module purge
module add R/3.0.1
module add perl/5.18.0
 
./my_prog

Multiple Core Jobs

If your job contains calculations that can run in parallel then you may benefit from requesting multiple cores. These can either be on a single compute node (for threaded or SMP programs) or spread across multiple nodes (for MPI programs).

In the first example the only difference with the scripts above is that the request is for 12 cores on a single node using up to 48Gb of memory.

#!/bin/bash
 
#PBS -N SMP
#PBS -l nodes=1:ppn=12
#PBS -l vmem=48gb
#PBS -l walltime=1:00:00
#PBS -j oe
#PBS -M fred.bloggs@unsw.edu.au
#PBS -m ae
 
cd $PBS_O_WORKDIR
 
./my_multicore_prog
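If my_multicore_prog is an OpenMP program it may also need to be told how many threads to start. A minimal sketch, assuming the batch system sets PBS_NUM_PPN inside the job (worth verifying on Katana), with a hard-coded fallback matching the ppn=12 request above:

```shell
# Match the OpenMP thread count to the number of cores requested via ppn.
# PBS_NUM_PPN is assumed to be set by the batch system inside the job;
# the fallback of 12 matches the ppn=12 request in the script above.
export OMP_NUM_THREADS=${PBS_NUM_PPN:-12}
```

This line would go in the job script just before the call to ./my_multicore_prog.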

Our next script is an example of how to call an MPI program which can run across multiple nodes. This example requests 48Gb of memory across 2 compute nodes, using 12 cores on each node.

#!/bin/bash
 
#PBS -N MPI
#PBS -l nodes=2:ppn=12
#PBS -l vmem=48gb
#PBS -l walltime=1:00:00
#PBS -j oe
#PBS -M fred.bloggs@unsw.edu.au
#PBS -m ae
 
cd $PBS_O_WORKDIR
 
mpiexec ./my_mpi_prog

Using Local Scratch

The following script requests 50Gb of local scratch space with file=50gb, copies its input file to the node's local scratch directory ($TMPDIR), works in that directory and then copies the results back to the directory the job was submitted from.

#!/bin/bash
 
#PBS -N SCRATCH
#PBS -l nodes=1:ppn=1
#PBS -l vmem=1gb
#PBS -l walltime=1:00:00
#PBS -l file=50gb
#PBS -j oe
#PBS -M fred.bloggs@unsw.edu.au
#PBS -m ae
 
cd $PBS_O_WORKDIR
 
cp input $TMPDIR
 
cd $TMPDIR
 
split -d -b 512m input output
cp output* $PBS_O_WORKDIR
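The split command above is what actually uses the scratch space: it cuts the input file into 512Mb chunks named output00, output01, and so on (the -d flag gives numeric suffixes). A small local demonstration of the naming, assuming GNU split:

```shell
# Demonstrate split's chunk naming on a small scale: a 3 KB file cut
# into 1 KB pieces gives chunks output00, output01, output02
# (numeric suffixes come from the -d flag; GNU split assumed).
cd "$(mktemp -d)"
head -c 3072 /dev/zero > input
split -d -b 1k input output
ls output*
```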

You can also use prologue and epilogue scripts to copy files to and from local scratch. For more details look at the Local Scratch page. The following example shows how to call the prologue and epilogue scripts in your home drive from your job script.

#!/bin/bash
 
#PBS -N SCRATCH
#PBS -l nodes=1:ppn=1
#PBS -l vmem=1gb
#PBS -l walltime=1:00:00
#PBS -l file=50gb
#PBS -j oe
 
#PBS -l prologue=/srv/scratch/z1234567/prologue
#PBS -l epilogue=/srv/scratch/z1234567/epilogue
 
cd $PBS_O_WORKDIR
 
./my_prog

PBS Array Job Examples

If you want to run the same program with a collection of different input or data files then array jobs will make your life easier.

The first example launches 100 jobs (each using 1 core, up to 1Gb of memory and up to 1 hour of walltime) with input files 1.dat, 2.dat, ..., 100.dat. You will need to make sure that your program saves its output in a different location for each job.

#!/bin/bash
 
#PBS -N ARRAY
#PBS -l nodes=1:ppn=1
#PBS -l vmem=1gb
#PBS -l walltime=1:00:00
#PBS -j oe
 
#PBS -t 1-100
 
cd $PBS_O_WORKDIR
 
./my_prog ${PBS_ARRAYID}.dat
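To see what each sub-job of the array actually runs, we can fake a few PBS_ARRAYID values locally (PBS sets this variable inside each sub-job):

```shell
# Simulate the filename each sub-job of "#PBS -t 1-100" receives:
# sub-job N runs "./my_prog N.dat". PBS_ARRAYID is normally set by
# PBS; here we fake a few values locally.
FILES=""
for PBS_ARRAYID in 1 2 100; do
    FILES="$FILES ${PBS_ARRAYID}.dat"
done
echo $FILES   # 1.dat 2.dat 100.dat
```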

If, instead of passing parameters, you modify your program each time you submit a job, the next example shows how to use an array job to submit 100 different programs named my_prog_1, my_prog_2, ..., my_prog_100.

#!/bin/bash
 
#PBS -N ARRAY2
#PBS -l nodes=1:ppn=1
#PBS -l vmem=1gb
#PBS -l walltime=1:00:00
#PBS -j oe
 
#PBS -t 1-100
 
cd $PBS_O_WORKDIR
 
./my_prog_$PBS_ARRAYID

The next example shows how you can use arrays to pass non-integer values through to your program.

#!/bin/bash
 
#PBS -N ARRAY3
#PBS -l nodes=1:ppn=1
#PBS -l vmem=1gb
#PBS -l walltime=1:00:00
#PBS -j oe
 
#PBS -t 0-3
 
cd $PBS_O_WORKDIR
 
PARAMS=('red' 'green' 'blue' 'yellow')
 
echo ${PARAMS[$PBS_ARRAYID]}
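Note that the range -t 0-3 matches the array indices: bash arrays are zero-indexed, so PBS_ARRAYID 0 selects the first element. A quick local check of the mapping, with PBS_ARRAYID faked:

```shell
# Bash arrays are zero-indexed, which is why the job uses "#PBS -t 0-3":
# PBS_ARRAYID 0 selects 'red' and 3 selects 'yellow'. PBS_ARRAYID is
# normally set by PBS; we fake one value here.
PARAMS=('red' 'green' 'blue' 'yellow')
PBS_ARRAYID=2
echo ${PARAMS[$PBS_ARRAYID]}   # blue
```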

You can also use arrays when you have multiple collections of programs and data files, each in their own directory, as in the following example which uses directories JOB_1, JOB_2, ..., JOB_12.

#!/bin/bash
 
#PBS -N ARRAY4
#PBS -l nodes=1:ppn=1
#PBS -l vmem=1gb
#PBS -l walltime=1:00:00
#PBS -j oe
 
#PBS -t 1-12
 
cd $PBS_O_WORKDIR
 
cd JOB_$PBS_ARRAYID
 
./my_prog

One problem that frequently occurs with array jobs is wanting to run 250 jobs when each individual job takes only 10 minutes. Since we don't want to run jobs that take less than one hour, we can instead set up an array job that starts 25 separate jobs, each of which completes 10 computations. We still work through the numbers 1 to 250, but in groups of 10, using the job script below.

#!/bin/bash
 
#PBS -N GROUPED_ARRAY
#PBS -l nodes=1:ppn=1
#PBS -l vmem=4gb
#PBS -l walltime=12:00:00
#PBS -j oe
#PBS -t 1-25
 
cd $PBS_O_WORKDIR
 
 
# Pause for a random time (1-240 seconds) so the sub-jobs do not all
# start hitting the filesystem at the same moment
sleep $(( (RANDOM % 240) + 1 ))
 
((START_NUMBER = ($PBS_ARRAYID - 1 ) * 10 + 1))
((END_NUMBER = $START_NUMBER + 9))
 
for key in $(seq $START_NUMBER $END_NUMBER); do ./my_prog $key; done
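We can check the grouping arithmetic locally by running it over all 25 sub-job IDs and confirming that the ranges tile 1 to 250 exactly:

```shell
# Verify the grouping arithmetic: sub-jobs 1..25, each covering 10
# numbers, should cover 1..250 with no gaps or overlaps. PBS_ARRAYID
# is normally set by PBS; here we loop over all its values.
COUNT=0
for PBS_ARRAYID in $(seq 1 25); do
    ((START_NUMBER = (PBS_ARRAYID - 1) * 10 + 1))
    ((END_NUMBER = START_NUMBER + 9))
    ((COUNT += 10))
done
echo "covered $COUNT numbers, finishing at $END_NUMBER"   # covered 250 numbers, finishing at 250
```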

If we have 1,000 compute tasks that we want to group in lots of 20 then we can modify this job script as follows.

#!/bin/bash
 
#PBS -N GROUPED_ARRAY
#PBS -l nodes=1:ppn=1
#PBS -l vmem=8gb
#PBS -l walltime=12:00:00
#PBS -j oe
#PBS -t 1-50
 
cd $PBS_O_WORKDIR
 
 
# Pause for a random time (1-240 seconds) so the sub-jobs do not all
# start hitting the filesystem at the same moment
sleep $(( (RANDOM % 240) + 1 ))
 
((START_NUMBER = ($PBS_ARRAYID - 1 ) * 20 + 1))
((END_NUMBER = $START_NUMBER + 19))
 
for key in $(seq $START_NUMBER $END_NUMBER); do ./my_prog $key; done

You can also use the following method to achieve the same result. Here we have 8,000 different calculations that we want to run so we will split them up into 100 jobs each of which does 80 calculations.

#!/bin/bash
 
#PBS -N GROUPED_ARRAY
#PBS -l nodes=1:ppn=1
#PBS -l vmem=8gb
#PBS -l walltime=12:00:00
#PBS -j oe
#PBS -t 0-99
 
cd $PBS_O_WORKDIR
 
MAX_TASKS=80
 
for ((TASK=0; TASK < MAX_TASKS; ++TASK)); do
       INDEX=$((PBS_ARRAYID*MAX_TASKS + TASK))
       ./hardsums.py data${INDEX}.csv
done

If you have 2 independent parameters that you want to cycle through then there are a number of different ways to do it. The simplest is to create an array job and then use the BASH command line to submit it once per value of the second parameter. For example, if you have data files red_1, ..., red_12, green_1, ..., green_12, blue_1, ..., blue_12 and yellow_1, ..., yellow_12, run:

for MY_VAR in red green blue yellow; do qsub -v MY_VAR=$MY_VAR array.pbs; done

where the following file is called array.pbs.

#!/bin/bash
 
#PBS -N ARRAY5
#PBS -l nodes=1:ppn=1
#PBS -l vmem=1gb
#PBS -l walltime=1:00:00
#PBS -j oe
 
#PBS -t 1-12
 
cd $HOME
 
./my_prog ${MY_VAR}_${PBS_ARRAYID}
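Between them, the four submissions cover all 48 data files. We can enumerate them locally, faking MY_VAR (normally supplied by the qsub loop) and PBS_ARRAYID (normally set by PBS):

```shell
# Enumerate every data file the four array-job submissions cover:
# red_1 ... yellow_12, 4 x 12 = 48 files in total. MY_VAR and
# PBS_ARRAYID are faked here for illustration.
TOTAL=0
for MY_VAR in red green blue yellow; do
    for PBS_ARRAYID in $(seq 1 12); do
        FILE=${MY_VAR}_${PBS_ARRAYID}
        ((TOTAL += 1))
    done
done
echo "$TOTAL files, last: $FILE"   # 48 files, last: yellow_12
```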