Structure (Software)

The program structure is a free software package for using multi-locus genotype data to investigate population structure. Its uses include inferring the presence of distinct populations, assigning individuals to populations, studying hybrid zones, identifying migrants and admixed individuals, and estimating population allele frequencies in situations where many individuals are migrants or admixed. It can be applied to most of the commonly used genetic markers, including SNPs, microsatellites, RFLPs and AFLPs.

http://pritch.bsd.uchicago.edu/structure.html

Command-line changes to parameter values

To simplify batch runs and make it easier to run simulations involving structure, the program accepts command-line flags that update the values of certain parameters, overriding the values set in mainparams. These are as follows:

-m (mainparams) Read a different parameter input file instead of mainparams. 
-e (extraparams) Read a different parameter input file instead of extraparams. 
-s (stratparams) Read a different parameter input file instead of stratparams. (For use with the 
                   accompanying program, STRAT, for association mapping.) 
-K (MAXPOPS) Change the number of populations. 
-L (NUMLOCI) Change the number of loci. 
-N (NUMINDS) Change the number of individuals. 
-i (input file) Read data from a different input file. 
-o (output file) Print results to a different output file. 

Example

structure -K 5 -o output5 
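
The same two flags can be combined in a loop to try several values of K in turn. The following is a minimal sketch, not a recommended protocol: the function name sweep_k and the output names output1, output2, ... are illustrative choices, and it assumes the default mainparams and extraparams files are present in the current directory.

```shell
# sweep_k MAX_K: run structure once for each K from 1 to MAX_K.
# (sweep_k is a hypothetical helper name, not part of structure.)
sweep_k() {
    local max_k="$1"
    local k=1
    while [ "$k" -le "$max_k" ]; do
        # -K overrides MAXPOPS from mainparams; -o gives each run its own output file.
        structure -K "$k" -o "output${k}"
        k=$((k + 1))
    done
}
```

Calling "sweep_k 5" would then run structure five times in sequence, for K = 1 through 5.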

Usage

You need to specify a mainparams file (which can be created in the GUI on your local workstation), an extraparams file (this can be an empty text file, but it must be specified), a data file, and the location of the output:

structure -m mainparams -e extraparams -i project_data -o results
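
Because every one of these files has to exist before the run starts, it can help to check for them first. The sketch below is one possible pre-flight check under stated assumptions: the function name prepare_inputs is hypothetical, and the file names are simply the ones used in the command above.

```shell
# prepare_inputs MAINPARAMS EXTRAPARAMS DATAFILE
# (hypothetical helper) Fails if the mainparams or data file is missing,
# and creates an empty extraparams file if none exists yet.
prepare_inputs() {
    local mainparams="$1"
    local extraparams="$2"
    local datafile="$3"
    local f
    for f in "$mainparams" "$datafile"; do
        if [ ! -f "$f" ]; then
            echo "missing input file: $f" >&2
            return 1
        fi
    done
    # An empty extraparams file is acceptable, but it must exist.
    [ -f "$extraparams" ] || touch "$extraparams"
}
```

It could be used as "prepare_inputs mainparams extraparams project_data && structure -m mainparams -e extraparams -i project_data -o results", so that structure only starts when the inputs are in place.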

Sample Job script

The extraparams file is an empty text file ("touch extraparams" will create this file for you). This script submits an array job with 12 tasks. The mainparams files are named mainparams.param1.k1 through mainparams.param1.k12, one per task. The results are written to a directory called results; for example, if the job ID at submission was 8997, the output files would be named results_8997_1 through results_8997_12.

#!/bin/bash
#$ -N MyJobName
#$ -cwd -V
#$ -t 1-12

structure -m mainparams.param1.k${SGE_TASK_ID} -e extraparams -i project_data -o results/results_${JOB_ID}_${SGE_TASK_ID}
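
If the twelve mainparams files differ only in the number of populations, a possible alternative is to keep a single mainparams file and pass the task ID to the -K flag described above. This is a sketch under that assumption, not a drop-in replacement: run_task is a hypothetical helper name, and the "mkdir -p results" line assumes the results directory may not exist yet.

```shell
#!/bin/bash
#$ -N MyJobName
#$ -cwd -V
#$ -t 1-12

# run_task TASK_ID JOB_ID (hypothetical helper): run one array task,
# using the task ID as the number of populations K.
run_task() {
    # Create the output directory if it is not there yet (assumption).
    mkdir -p results
    structure -m mainparams -e extraparams -i project_data \
        -K "$1" -o "results/results_${2}_${1}"
}

# Only start a run when the scheduler has set a task ID.
if [ -n "${SGE_TASK_ID:-}" ]; then
    run_task "$SGE_TASK_ID" "$JOB_ID"
fi
```

With this layout, task 3 of job 8997 would run structure with K = 3 and write to results/results_8997_3.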