UNSW - Science - HPC

Figuring Out When Your Job Will Start

The queuing system is fluid: it recalculates priorities each time a job is added to the queue or finishes, so it is not always possible to determine when your job will start running on a compute node. However, if your job has progressed far enough in the queue that the scheduler has started reserving resources for it, the StartTime figure tells you when the job is currently expected to start, and Start tells you how far into the future that is. To see these figures, use

showres -n JOBID

or

showres -n | grep JOBID
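The second form can be wrapped in a small shell function that also reports when no reservation exists yet. This is only a sketch: job_reservation is a hypothetical name, not a cluster tool, and it assumes the job ID appears as plain text on each matching line of the showres -n output.

```shell
#!/bin/sh
# Hypothetical helper: print the reservation lines for one job.
# Assumption: `showres -n` prints the job ID on each reservation line.
job_reservation() {
    jobid="$1"
    # grep -F matches the job ID as a literal string, not a regex.
    if lines=$(showres -n | grep -F -- "$jobid"); then
        printf '%s\n' "$lines"
    else
        # No match: the job has not started reserving resources yet.
        echo "No reservation yet for job $jobid" >&2
        return 1
    fi
}
```

Calling job_reservation JOBID prints any matching reservation lines, and returns a non-zero status when there are none, which usually means the job has not yet reached the point of reserving resources.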

For a list of the idle jobs, type

showq -i

This lists all the jobs that are waiting for resources to become available, in order of their current priority. Jobs with an asterisk next to them already have resources reserved, and information about them is available from showres -n.
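To see only the idle jobs that already hold a reservation, the listing can be filtered for the marker character. This is a rough sketch: reserved_idle_jobs is a hypothetical name, and it assumes an asterisk appears only on lines for jobs with reservations, so any header or footer line that happens to contain one would also be printed.

```shell
#!/bin/sh
# Hypothetical helper: show only idle jobs marked with an asterisk.
# Assumption: the asterisk appears only on lines for jobs with reservations.
reserved_idle_jobs() {
    # grep -F treats '*' as a literal character rather than a regex operator.
    showq -i | grep -F '*'
}
```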

If you want to know why the jobs in the idle queue are in their current order, use the command

diagnose -p

which lists the components that make up the idle job priority.

Name      Explanation
PRIORITY  The final number used to determine job running order. The higher the number the sooner it will run.
FS User   A number representing how much the user has used the cluster in the last 2 weeks weighted by group buy in.
FS Group  A number representing how much the group that the user belongs to has used the cluster in the last 2 weeks weighted by group buy in.
Serv      A number representing the amount of time that the job has been in the queue (QTime).
Res       A number representing the resources that the job requires.

Note: A job at the top of the list will not necessarily be the next job to run. That also depends on which nodes the job can run on and on other resource constraints.

Note: If you want to specify the CPU, check the Katana node list to see which nodes you have access to. If you request more than 12 hours of WALLTIME, you can only use the nodes bought by your school or research group, or by the Faculty of Science. A long-running job that specifies a CPU you don't have access to will never start.

Note: Unless you are part of the School of Mathematics and Statistics, the UNSW Business School or the Climate Change Research Centre, a job requiring more than 128 GB of memory will only run if it has a WALLTIME of 12 hours or less. This restriction follows from node ownership.

Note: If possible, break up your jobs so that each needs less than 12 hours of run time. They will then likely start sooner, as they can run on any node.

Note: If you request a WALLTIME of greater than 200 hours then your job WILL NOT run unless you are a member of the Astrobiology group.
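
The WALLTIME limits in the notes above can be checked before submitting a job. The following is a minimal sketch, not a Katana tool: walltime_class and its three output labels are hypothetical names, and it assumes the request is written as HH:MM:SS. It covers only the 12-hour and 200-hour limits, not the memory or CPU restrictions.

```shell
#!/bin/sh
# Hypothetical helper: classify a WALLTIME request given as HH:MM:SS
# against the 12-hour and 200-hour limits described in the notes.
walltime_class() {
    # Convert HH:MM:SS to seconds; awk reads fields with leading zeros as decimal.
    total=$(echo "$1" | awk -F: '{ print $1 * 3600 + $2 * 60 + $3 }')
    short_limit=$((12 * 3600))    # 12 hours or less: can run on any node
    max_limit=$((200 * 3600))     # more than 200 hours: Astrobiology group only
    if [ "$total" -le "$short_limit" ]; then
        echo "any-node"
    elif [ "$total" -le "$max_limit" ]; then
        echo "owned-nodes-only"
    else
        echo "astrobiology-only"
    fi
}
```

For example, walltime_class 08:30:00 prints any-node, walltime_class 48:00:00 prints owned-nodes-only, and walltime_class 240:00:00 prints astrobiology-only.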