Stata


Revision as of 14:57, 21 October 2008 by Aitsswhi

Running a job on the cluster requires you first to set up and create your Stata do files on your workstation. Once you have finished, upload the files via SFTP (see [Accessing the Cluster]) to your home directory. It is highly recommended that you create a new directory for each Stata job you plan to run. You then log in to the cluster via ssh/PuTTY and submit your Stata do file to the job queue via a job script.

When creating your Stata do files, it is worth thinking about breaking your work up into multiple do files if possible. This means that when you come to run them on the cluster you can run several simultaneously. For example, you may have just one do file that would take 20 hours to process; if you chopped it up into 10 do files of roughly equal length and they all ran at the same time, it would take about 2 hours. Splitting the work also helps if the cluster has problems (a node dies or crashes) and needs to re-run your job(s), because each re-run will not take too long.
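The arithmetic above can be sketched as a small shell function. This is only a rough estimate: the function name and its arguments are made up for illustration, and it assumes the work splits into equal parts and that a fixed number of parts can run at once.

```shell
#!/bin/bash
# Hypothetical helper (not part of the cluster software): rough wall-clock
# estimate for a job split into equal parts.
#   $1 = hours the single do file would take
#   $2 = number of do files it is split into
#   $3 = number of do files that can run at the same time (must be >= 1)
estimate_wall_hours() {
    local total_hours="$1" parts="$2" slots="$3"
    # Number of "waves" needed when only $slots parts run at once (ceiling division)
    local waves=$(( (parts + slots - 1) / slots ))
    # Each part takes total/parts hours; waves run one after another
    awk -v t="$total_hours" -v p="$parts" -v w="$waves" \
        'BEGIN { printf "%.1f\n", t / p * w }'
}
```

For example, `estimate_wall_hours 20 10 10` prints 2.0, matching the case above, while `estimate_wall_hours 20 10 5` prints 4.0 because only half of the parts can run at any one time.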

Once you have uploaded your do file(s) you will need a script to submit the job to the cluster.

Single do file script

In a text editor, create the following script in the same directory as your do file. For this example the script's filename is myscript, but you can call it anything you like.

#!/bin/bash
#$ -N JOB_NAME
#$ -cwd -V
stata-se -b do filename_of_do

So if you had a do file called mywork.do, the script would be:

#!/bin/bash
#$ -N JOB_NAME
#$ -cwd -V
stata-se -b do mywork

Then to submit your job

qsub myscript

All your output files, including the Stata log (mywork.log for the example above), should be generated in the directory you submitted the job from.
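Once the job has finished it can be useful to check the log for errors. The following is a minimal sketch, not a cluster-provided tool: the function name check_stata_log is made up for illustration, and it assumes that a failed Stata command leaves a return code such as r(198); at the start of a line in the log.

```shell
#!/bin/bash
# Hypothetical helper: report whether a Stata batch log contains an error code.
#   $1 = path to the log file (e.g. mywork.log for mywork.do)
# Returns 0 if no error code is found, 1 if one is found, 2 if the log is missing.
check_stata_log() {
    local log="$1"
    if [ ! -f "$log" ]; then
        echo "missing: $log"
        return 2
    fi
    # Assumption: Stata writes a line like "r(111);" when a command fails
    if grep -Eq '^r\([0-9]+\);' "$log"; then
        echo "error: $log"
        return 1
    fi
    echo "ok: $log"
    return 0
}
```

For the example above you would run `check_stata_log mywork.log` in the directory you submitted the job from.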

Multiple do files script

If you have chopped your Stata do file up into several, you will need to give them all the same name, but with a sequence number prepended or appended. For example:

1mywork.do
2mywork.do
3mywork.do

or

mywork1.do
mywork2.do
mywork3.do

Then, in a text editor, create the following script in the same directory as your do files. For this example the script's filename is myscript, but you can call it anything you like. This example assumes you used the mywork1.do, mywork2.do naming.

#!/bin/bash
#$ -N ARRAY_TEST_JOB
#$ -cwd -V
#$ -t 1-10
stata-se -b do mywork${SGE_TASK_ID}

As it stands, this script will run 10 do files (mywork1.do to mywork10.do). If you want to change it to, say, 20, change this line

#$ -t 1-10

to

#$ -t 1-20

Then submit the job

qsub myscript
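
Before submitting an array job it may be worth checking that every do file in the range actually exists, since a missing file would make that task fail. The following is a minimal sketch under the mywork1.do, mywork2.do naming used above; the function name missing_do_files is made up for illustration.

```shell
#!/bin/bash
# Hypothetical pre-submission check: list any do files missing from the range.
#   $1 = filename prefix (e.g. mywork)
#   $2 = last sequence number (the upper end of the -t range)
# Prints one missing filename per line; prints nothing if all are present.
missing_do_files() {
    local prefix="$1" last="$2" i=1
    while [ "$i" -le "$last" ]; do
        # Look in the current directory, where the job will be submitted from
        [ -f "${prefix}${i}.do" ] || echo "${prefix}${i}.do"
        i=$((i + 1))
    done
}
```

For example, running `missing_do_files mywork 10` in the job directory prints nothing when mywork1.do to mywork10.do are all present.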