Frequently Asked Questions about Submitting Jobs

The basics of job submission can be found under HPC Basics in the HPC Jobs section.

I used the module command but it still can't find the application that I am trying to use.

If you want your job to access an application via the module command, you will need to modify your .bashrc (or add the module load command to your job script) so that the module is loaded automatically when the job starts. An easy way to check is to submit an interactive job (using qsub -I) and then run module list to see which modules are loaded for your job.
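For example, a minimal job script might load the module explicitly rather than relying on .bashrc (the module name, version and command below are illustrative; run module avail to see what is installed on your cluster):

#!/bin/bash
#PBS -l walltime=1:00:00

# Load the application inside the job so it is available on the compute node
module load matlab/2019a        # illustrative module name and version

cd $PBS_O_WORKDIR
matlab -nodisplay < myscript.m  # illustrative command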

I put my files in my home drive (H-drive) but I can't seem to get my job to run.

The likely answer is that your files need to be in your cluster home drive rather than your H-drive, as your H-drive is only available on the head node and not on the compute nodes. Have a look at the storage page for a discussion of the different storage locations, and the copying files page for information about copying files to your cluster home drive.

Can I change the job script after it has been submitted?

Yes, you can increase the resource values for queued jobs, but even then you are constrained by the limits of the particular queue that you are submitting to. Once the job has been assigned to a node, the intricacies of the scheduling policy mean that it becomes impossible for anyone, including the administrator, to make any further changes.
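For example, a queued job's walltime or memory request can be raised with qalter (the job ID and values below are illustrative, and resource names vary between queuing systems):

# Increase the walltime and memory of queued job 123456
qalter -l walltime=24:00:00 123456
qalter -l vmem=16gb 123456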

Where does Standard Output (STDOUT) go when a job is run?

By default Standard Output is redirected to storage on the node and then transferred when the job is completed. If you are generating data you should redirect STDOUT to a different location. The best location depends on the characteristics of your job but in general all STDOUT should be redirected to local scratch.
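As a sketch, a job script can send output to local scratch and copy back anything worth keeping at the end ($TMPDIR is commonly set to the job's local scratch directory, but check your cluster's storage page; the program name is illustrative):

cd $PBS_O_WORKDIR

# Write STDOUT (and STDERR) to local scratch rather than the default spool area on the node
./my_program > $TMPDIR/output.log 2>&1

# Copy the results back to the directory the job was submitted from
cp $TMPDIR/output.log $PBS_O_WORKDIR/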

How do I figure out what the resource requirements of my job are?

The best way to determine the resource requirements of your job is to run it for the first time with generous resource requests, and then refine the requests based on what the job actually used. If you put the following lines in your job script you will receive an email when the job finishes which includes a summary of the resources used.

#PBS -M email@unsw.edu.au
#PBS -m ae

How many cores should I request?

Ask Martin and he will tell you.

Can I cause problems to other users if I request too many resources or make a mistake with my job script?

No.

Will a job script from another cluster work on cluster X?

It depends. Some aspects are fairly common across different clusters (e.g. walltime), while others are not (e.g. select is used on Tensor but not on Katana). You should look at the cluster-specific information to see which queuing system is used on that cluster and which commands you will need to change.
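As an illustration of the kind of change involved, roughly the same request looks quite different under the two common PBS syntaxes (treat these as sketches and check the cluster-specific documentation for the exact form):

# "select" style (e.g. PBS Professional)
#PBS -l select=1:ncpus=4:mem=16gb

# "nodes" style (e.g. Torque/older PBS)
#PBS -l nodes=1:ppn=4,vmem=16gb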

How can I see exactly what resources (I/O, CPU, memory and scratch) my job is currently using?

If you run

qstat -nru $USER

then you can see a list of your running jobs and where they are running. You can then use ssh to log on to the individual nodes and run top or dtop to see the load on the node including memory usage for each of the processes on the node. For more detailed information on the resources that your job is using, visit the page on job profiling.
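A typical sequence looks like this (the node name is illustrative; use whatever appears in the qstat output):

qstat -nru $USER     # note which node(s) the job is running on
ssh cn042            # illustrative node name taken from the qstat output
top -u $USER         # watch CPU and memory usage of your processes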

What is the difference between virtual memory (VMEM or VSZ) and physical memory (MEM or RSZ)?

Physical memory is the memory storage that is located on the physical memory sticks in the server. Swap is the memory storage that is located on the disk. Virtual memory is the entire addressable memory space combining both physical and swap memory.

Why is VMEM so large?

With a recent update to glibc (which is used by almost every piece of software on the system) the way that virtual memory is allocated has changed. For performance reasons (to reduce the time spent waiting for memory allocation locks), virtual memory is now set aside for each thread. This means, for example, that a 400MB job with 16 threads may require 1024MB of virtual memory, equating to 64MB per thread.

Depending on which gives your specific job better performance, you may want to either increase your VMEM request or revert to something close to the previous behaviour using:

export MALLOC_ARENA_MAX=1
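In a job script this line would typically sit just before the program is launched, for example (the program name is illustrative):

# Limit glibc to one malloc arena to keep VMEM closer to the previous behaviour
export MALLOC_ARENA_MAX=1
./my_program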

How do I choose which version of software I use?

To select a specific version of a piece of software you can use the module command. This allows you to choose between the different installed versions of the software.
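For example (the package and version names are illustrative; run module avail to see what is actually installed on your cluster):

module avail python         # list the installed versions of a package
module load python/3.9.9    # load one specific version
module list                 # confirm which modules are currently loaded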

How do I request the installation or upgrade of a piece of software?

If you wish to have a new piece of software installed or software that is already installed upgraded please send an email to the UNSW IT Servicedesk (servicedesk@unsw.edu.au) from your UNSW email account with details of what software change you require and the cluster that you would like it changed on.

Why is my job stuck in the queue whilst other jobs run?

The queues are not set up to be first-in-first-out. In fact all of the queued jobs sit in one big pool of jobs that are ready to run. The scheduler assigns priorities to jobs in the pool and the job with the highest priority is the next one to run. The length of time spent waiting in the pool is just one of several factors that are used to determine priority.

For example, people who have used the cluster heavily over the last two weeks receive a negative contribution to their jobs' priority, whereas a light user will receive a positive contribution. You can see this in action with the diagnose -p and diagnose -f commands.

You mentioned waiting time as a factor, what else affects the job priority?

The following three factors combine to generate the job priority.

  1. How many resources (CPU and memory) have you and your group consumed in the last 14 days? Your personal consumption is weighted more heavily than your group's consumption. Heavy recent usage contributes a negative priority; light recent usage contributes a positive priority.
  2. How many resources does the job require? This is always a positive contribution to priority, and it increases linearly with the amount of CPU and memory requested, i.e. we like big jobs.
  3. How long has the job been waiting in the queue? This is always a positive contribution to priority, and it increases linearly with the amount of time your job has been waiting in the queue. Note that throttling policies will prevent some jobs from being considered for scheduling, in which case their clock does not start ticking until that throttling constraint is lifted.

What happens if my job uses more memory than I requested?

If your job uses more memory than you requested, the carefully balanced node assignments made by the job scheduler cease to be valid and this can cause issues on the node. For example, the extra memory you use becomes unavailable to another job that expects it to be there, so active memory gets swapped to the local disk and the node slows to a crawl. To avoid this, any job that uses more memory than requested will be terminated by the scheduler.

What happens if my job is still running when it reaches the end of the time that I have requested?

In order to ensure that everyone has equitable access to computational resources, and to aid the efficient scheduling of jobs, the cluster allocates your job exactly the amount of time that you requested. When that time is exhausted your job is automatically terminated.
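It is therefore worth requesting a walltime (and memory) that comfortably covers what the job actually needs, for example (the values are illustrative and the memory resource name varies between clusters):

#PBS -l walltime=12:00:00   # the job is terminated after 12 hours
#PBS -l vmem=8gb            # the job is terminated if it uses more than 8 GB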

200 hours is not long enough! What can I do?

If you find that your jobs take longer than the maximum walltime, then there are several options for restructuring your work so that it fits within that limit.

  • Can your job be split into several independent jobs?
  • Can you export the results to a file which can then be used as input the next time the job is run (see the checkpointing sketch below)?

You may also want to look at whether there is anything you can do to make your code run more efficiently, such as making better use of local scratch if your code is I/O intensive.
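One common pattern is checkpointing: the job writes its current state to a file before the walltime runs out, and the next job restarts from that file. A minimal sketch of a job script doing this (the file name, program and flags are illustrative):

#!/bin/bash
#PBS -l walltime=200:00:00

cd $PBS_O_WORKDIR

# Restart from the previous checkpoint if one exists, otherwise start from scratch
if [ -f checkpoint.dat ]; then
    ./my_program --restart checkpoint.dat
else
    ./my_program
fi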

Do sub-jobs within an array job run in parallel, or do they queue up serially?

Submitting an array job with 100 sub-jobs is equivalent to submitting 100 individual jobs. So if sufficient resources are available then all 100 sub-jobs could run in parallel. Otherwise some sub-jobs will run and other sub-jobs must wait in the queue for resources to become available.

The '%' option in the array request lets you self-impose a limit on the number of concurrently running sub-jobs. Also, if you need to impose an order on when the jobs run, the 'depend' attribute can help, as in the sketch below.
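For example (the array bounds, limit and job ID are illustrative, and the exact flags differ between queuing systems, so check the cluster-specific documentation):

# Run sub-jobs 1-100 but allow at most 10 to run at the same time
#PBS -J 1-100%10

# Make a follow-up job wait until job 123456 has finished successfully
qsub -W depend=afterok:123456 next_step.pbs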

In a PBS file, does the VMEM requested refer to each node or to the total memory across all of the nodes being used (if I am using more than one node)?

VMEM refers to the amount of memory per node.
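So, as a sketch, a multi-node request asks for the per-node amount rather than the total (the syntax and values are illustrative):

# Requests 16 GB of virtual memory on EACH of the two nodes, i.e. 32 GB across the whole job
#PBS -l nodes=2:ppn=4,vmem=16gb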