UNSW - Science

UNSW - Science - HPC

Advanced Batch Jobs

Using Local Scratch

A unique local scratch directory is automatically created for each job by the resource manager. This directory is created on the compute node's local disk and its location is recorded in the environment variable $TMPDIR. The local scratch directory is created when the job starts and it is removed when the job finishes. Consequently, any files generated within $TMPDIR must be transferred to more permanent storage if they are to be retained beyond the lifetime of the job.

Note: The capacity of local scratch is limited, so a job that uses more than 100 GB of it may run into problems.

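#!/bin/bash
# Request one core on one node, 1 GB of virtual memory and one hour of walltime.
#PBS -l nodes=1:ppn=1,vmem=1gb
#PBS -l walltime=1:00:00
# Merge standard error into the standard output file.
#PBS -j oe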
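
The body of a job script that uses local scratch might look like the following sketch. The input file name input.dat is hypothetical, and sort stands in for the real computation. The branch that creates temporary directories is there only so the sketch can be tried outside a batch job; inside a job it relies on $PBS_O_WORKDIR and $TMPDIR being set by the resource manager.

```shell
#!/bin/bash
#PBS -l nodes=1:ppn=1,vmem=1gb
#PBS -l walltime=1:00:00
#PBS -j oe

if [ -n "${PBS_JOBID:-}" ]; then
    # Inside a batch job: use the submission directory and the per-job scratch.
    submit_dir="$PBS_O_WORKDIR"
    scratch_dir="$TMPDIR"
else
    # Outside a batch job (assumption for testing only): use throwaway directories.
    submit_dir="$(mktemp -d)"
    scratch_dir="$(mktemp -d)"
fi

# Hypothetical input file; create a small sample if none exists yet.
[ -f "$submit_dir/input.dat" ] || printf '3\n1\n2\n' > "$submit_dir/input.dat"

# Stage in: copy the input from permanent storage to local scratch.
cp "$submit_dir/input.dat" "$scratch_dir/"

# Do the work on the node's local disk.
cd "$scratch_dir" || exit 1
sort -n input.dat > output.dat

# Stage out: copy the result back before the job ends and scratch is removed.
cp output.dat "$submit_dir/"
```

Copying results back as the last step of the job script works only if the job finishes within its walltime.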

One way of copying data to and from local scratch is to use the prologue and epilogue scripts, which run immediately before and after your job. These scripts can run for a maximum of 5 minutes, and they run even if the job runs out of walltime. They are described in detail here.
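
As a rough illustration, the copy step of an epilogue script could be structured as below. The function name copy_results, its arguments and the 240-second default are hypothetical choices, not part of the site's documented examples. How the scratch location reaches the epilogue is not assumed here, so both directories are passed explicitly. The sketch relies on the timeout command from GNU coreutils to keep the copy inside the 5-minute limit.

```shell
#!/bin/bash
# copy_results SRC DEST [SECONDS]
# Copy everything in SRC (including hidden files) into DEST, giving up after
# SECONDS (default 240) so the copy stays inside the 5-minute limit.
copy_results() {
    local src="$1" dest="$2" limit="${3:-240}"
    mkdir -p "$dest" || return 1
    # timeout stops cp once the limit is reached and then returns status 124.
    timeout "$limit" cp -a "$src/." "$dest/"
}
```

A call such as copy_results "$TMPDIR" "$HOME/results" would then copy the contents of local scratch to a directory under your home directory; both paths are placeholders.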

Note that directory names containing spaces can cause problems for the resource manager. For example, if the name of the working directory contains spaces, the resource manager will be unable to deliver job output files to that directory. We therefore recommend that directory names and file names contain no spaces.
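
A quick check before submitting can catch this problem early. The following is a minimal sketch; the function name has_space is hypothetical.

```shell
#!/bin/bash
# has_space PATH: succeed (exit status 0) if PATH contains a space character.
has_space() {
    case "$1" in
        *" "*) return 0 ;;
        *)     return 1 ;;
    esac
}

# Warn if the current directory, from which a job would be submitted,
# has a space anywhere in its path.
if has_space "$PWD"; then
    echo "Warning: '$PWD' contains a space; job output may not be delivered." >&2
fi
```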

Running a Script Before or After Your Compute Job

As mentioned above, prologue and epilogue scripts can copy files to and from local scratch. They can also run generic Bash scripts for any other purpose, such as submitting new jobs or combining results. Have a look at the examples used for local scratch here and use them as a basis for your own scripts.
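
For instance, a script run after the job could combine partial results into a single file. This is only a sketch: the function name combine_results and the part_*.out naming pattern are hypothetical and should be adapted to the files your job actually writes.

```shell
#!/bin/bash
# combine_results DIR OUTFILE
# Concatenate DIR/part_*.out, in shell glob order, into OUTFILE.
combine_results() {
    local dir="$1" outfile="$2"
    local parts=("$dir"/part_*.out)
    # An unmatched glob stays literal, so check that the first entry exists.
    if [ ! -e "${parts[0]}" ]; then
        echo "No part_*.out files found in $dir" >&2
        return 1
    fi
    cat "${parts[@]}" > "$outfile"
}
```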

Job Dependencies

It is possible to chain jobs together so that a new one starts when the previous one finishes. There are instructions on how to do this on the chaining batch jobs page.
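
As a brief sketch of the idea, and not necessarily the method described on that page, PBS/Torque lets qsub hold a job until an earlier one has finished by means of the -W depend option. The function name submit_chain and the script names in the usage note are hypothetical.

```shell
#!/bin/bash
# submit_chain SCRIPT...
# Submit each job script in order. Every job after the first is held until
# the previous job has finished successfully (afterok).
submit_chain() {
    local prev="" script jobid
    for script in "$@"; do
        if [ -z "$prev" ]; then
            jobid=$(qsub "$script") || return 1
        else
            jobid=$(qsub -W depend=afterok:"$prev" "$script") || return 1
        fi
        echo "$script submitted as $jobid"
        prev="$jobid"
    done
}
```

Running submit_chain step1.pbs step2.pbs step3.pbs would submit three jobs that run one after another. With afterok, a later job starts only if the previous one exits without error; afterany starts it regardless of how the previous job ended.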