Qsub

From HPC

Revision as of 12:08, 29 April 2009 by Aitsswhi (Talk | contribs)


qsub is the part of the Sun Grid Engine that allows you to submit your work to the cluster job queue. It has many command options, e.g. for setting up the environment, naming the job, running array jobs, and sending email alerts.

The two important switches to use are -cwd (makes the job start from the current working directory) and -V (makes the job run with the current environment variables).

qsub options program

Example

qsub -cwd -V myprog

Options

-V (set environment variables to those in effect when the job is submitted)
-cwd (run the job from the current working directory)
-N name (sets the name of the job)
-M email@address (sets the email address for job notification)
-m options (when to send email notification)
   b (beginning of job)
   e (end of job)
   a (abortion or rescheduling of job)
   n (never email)
   s (suspension of job)
   example: -m ase
-t start-end:step Array job (e.g. 1-5); :step is optional and is the step increment (1-6:2 would be 1 3 5). The environment variable $SGE_TASK_ID holds the current position.

Embedding qsub options in scripts


NOTE: If you write your job script on a Windows machine and copy it to the cluster, you must convert it using the dos2unix command, e.g. dos2unix myjobscript
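If you are not sure whether a script still has Windows line endings, a quick check along the following lines may help. This is only a sketch: the function name has_crlf is made up for illustration, and it simply looks for a carriage return character anywhere in the file.

```shell
#!/bin/bash
# Hypothetical helper: succeed (exit status 0) if the file named in $1
# contains at least one carriage return, i.e. Windows (CRLF) line endings.
has_crlf() {
    cr=$(printf '\r')          # a literal carriage return character
    grep -q "$cr" "$1"
}

# Possible usage with the script name from the note above:
#   if has_crlf myjobscript; then dos2unix myjobscript; fi
```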


You can embed qsub options into the scripts you use to run your programs on the cluster. They take the same overall form as if submitted via the command line, but each line must be prefixed with #$

Examples

The script myscript

#!/bin/bash
myprogram

submitted with

qsub -V -cwd myscript

is the same as the script

#!/bin/bash
#$ -V -cwd
myprogram

submitted with

qsub myscript

You can also split the options over multiple lines

#!/bin/bash
#$ -N MYHPCJOB
#$ -V -cwd
myprogram
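To check which options a script will pass to qsub, you can list its #$ lines. The sketch below assumes the options are written as "#$ " at the start of a line, as in the examples above; the function name embedded_opts is hypothetical and not part of Grid Engine.

```shell
#!/bin/bash
# Hypothetical helper: print the qsub options embedded in the script $1,
# one "#$" line per output line, with the "#$" prefix removed.
embedded_opts() {
    grep '^#\$' "$1" | sed 's/^#\$ *//'
}

# Possible usage:
#   embedded_opts myscript
```

For the last example script above this would print "-N MYHPCJOB" and "-V -cwd" on separate lines.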

Email notification

You can get an email notification for changes to the status of your jobs. The -M email@address option sets the address to be emailed (you can specify multiple addresses, separated by commas)

qsub -M me@lshtm.ac.uk myscript

By default, no status changes result in email notifications. You also need to specify which statuses you want to be notified about using the -m option.

These are the various notification options

b (beginning of job)
e (end of job)
a (abortion or rescheduling of job)
n (never email - Default)
s (suspension of job)

So to be emailed when the job finishes or is aborted/rescheduled you would specify the following

qsub -M me@lshtm.ac.uk -m ea myscript

For all notifications you would do

qsub -M me@lshtm.ac.uk -m eabs myscript

In a script:

#!/bin/bash
#$ -N MYHPCJOB
#$ -M me@lshtm.ac.uk -m eabs
#$ -V -cwd
myprogram

Example notification on a finished job

Job 5013 (sim) Complete
 User             = train
 Queue            = serial.q@comp00.gecko.lshtm.ac.uk
 Host             = comp00.gecko.lshtm.ac.uk
 Start Time       = 10/14/2008 16:53:08
 End Time         = 10/14/2008 20:22:20
 User Time        = 02:55:55
 System Time      = 00:30:35
 Wallclock Time   = 03:29:12
 CPU              = 03:26:30
 Max vmem         = 763.133M
 Exit Status      = 0

Array Jobs

You can submit one job with multiple tasks using the qsub option -t. This is ideal if you have a single program to run on multiple separate datasets. You can also use it in much more complicated setups, especially when combined with job dependencies.

When an array job is submitted, the environment variable $SGE_TASK_ID is set to the current position in the array. The value of this variable therefore changes each time the job scheduler steps through the array.

The $SGE_TASK_ID variable can be used either in the script you create to submit your job or from within your application (e.g. within the C, Java or R code).
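If your input files are not numbered, one common approach is to keep a list of them in a text file and let each task pick the line matching its $SGE_TASK_ID. The sketch below illustrates the idea; the function name nth_line and the file name datasets.txt are assumptions for illustration only.

```shell
#!/bin/bash
# Hypothetical helper: print line number $1 (counting from 1) of file $2.
nth_line() {
    sed -n "${1}p" "$2"
}

# In a job script submitted with, for example, "#$ -t 1-3", each task
# could then select its own input from a list with one file name per line:
#   input=$(nth_line "$SGE_TASK_ID" datasets.txt)
#   myProgram "$input"
```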

Example array job scripts

Simple 10 iteration loop (-t 1-10)

#!/bin/bash
#$ -N ARRAY_TEST_JOB
#$ -cwd -V
#$ -t 1-10

myProgram dataset.${SGE_TASK_ID}.dat

This example would submit 10 tasks to the job queue. The commands effectively run would be:

myProgram dataset.1.dat
myProgram dataset.2.dat
myProgram dataset.3.dat
myProgram dataset.4.dat
myProgram dataset.5.dat
myProgram dataset.6.dat
myProgram dataset.7.dat 
myProgram dataset.8.dat 
myProgram dataset.9.dat 
myProgram dataset.10.dat

Simple array from 2 to 12 in steps of 2 (-t 2-12:2)

#!/bin/bash
#$ -N ARRAY_TEST_JOB
#$ -cwd -V
#$ -t 2-12:2

myProgram dataset.${SGE_TASK_ID}.dat

This example would submit 6 tasks to the job queue. The commands effectively run would be:

myProgram dataset.2.dat
myProgram dataset.4.dat
myProgram dataset.6.dat
myProgram dataset.8.dat
myProgram dataset.10.dat
myProgram dataset.12.dat
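To preview which task IDs a given -t range produces before submitting, the seq command can generate the same sequence. This is a sketch only: the function name task_ids is made up, and it assumes the IDs run from start upwards in increments of step without going past end.

```shell
#!/bin/bash
# Hypothetical helper: print the task IDs for "-t start-end:step",
# one per line. Arguments: $1 = start, $2 = end, $3 = step.
task_ids() {
    seq "$1" "$3" "$2"
}

# Preview the IDs for "-t 2-12:2"
task_ids 2 12 2
```

Run as written, this prints 2, 4, 6, 8, 10 and 12 on separate lines, matching the six tasks in the example above.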

Job Dependencies

You can specify that your job will not run until one or more other jobs have completed.

qsub -hold_jid <jobids> myscript

Examples

qsub -hold_jid 5204 myscript
qsub -hold_jid 5230,5236,5302 myscript
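Rather than copying job IDs by hand, a script can capture the ID of the first job and pass it to -hold_jid. The sketch below assumes your Grid Engine version supports the qsub option -terse, which prints only the job ID, and that the first job is not an array job. The function name submit_chain and the script names are hypothetical.

```shell
#!/bin/bash
# Hypothetical helper: submit script $1, then submit script $2 so that
# it is held until the first job has completed.
# Assumption: qsub supports -terse and prints just the job ID.
submit_chain() {
    first_id=$(qsub -terse "$1") || return 1
    qsub -hold_jid "$first_id" "$2"
}

# Possible usage:
#   submit_chain step1.sh step2.sh
```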