Changes

GotCloud: GenomeSTRiP Pipeline

700 bytes added, 15:03, 10 February 2015
== Running GotCloud/GenomeSTRiP ==
The general command line for running GenomeSTRiP via GotCloud is:
 gotcloud genomestrip --run-<step> --conf <gotcloud.conf> --outdir <outputDirectory> --numjobs <#>
Where:
* <code>--run-<step></code> indicates which pipeline to run. Options are:
** <code>--run-metadata</code> - [[#Metadata Pipeline|Metadata Pipeline]]
** <code>--run-discovery</code> - [[#Discovery Pipeline|Discovery Pipeline]]
** <code>--run-genotype</code> - [[#Genotyping Pipeline|Genotyping Pipeline]]
** <code>--run-thirdparty</code> - [[#3rd-party Site Genotyping/Filtering Pipeline|3rd-party Site Genotyping/Filtering Pipeline]]
* <code>--conf <gotcloud.conf></code> - points to the configuration file to use
* <code>--outdir <outputDirectory></code> - tells GotCloud where to write the output
* <code>--numjobs <#></code> - number of jobs to run in parallel
 
Optional Parameters:
* <code>--metadata <metadataDirectory></code> - points to a directory containing pre-made metadata files
** Only required if skipping the <code>--run-metadata</code> step.
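As a minimal sketch of the optional parameter above, the following script builds a discovery command that points at pre-made metadata instead of running <code>--run-metadata</code> first. The directory <code>/path/to/metadata</code> is a hypothetical placeholder, and combining <code>--metadata</code> with <code>--run-discovery</code> is an assumption based on the parameter description; the script only prints the command so it can be reviewed before running.

```shell
#!/bin/sh
# Hypothetical location of pre-made metadata (for example, a local copy of
# the public 1000G metadata); replace it with your own directory.
METADATA_DIR=/path/to/metadata

# Build the discovery command with --metadata so the --run-metadata step can be skipped.
# The file and directory names follow the example commands used on this page.
CMD="gotcloud genomestrip --run-discovery --conf gotcloud.conf --outdir outputDirectory --metadata $METADATA_DIR --numjobs 10"

# Print the command for review; paste it into your shell to run it.
echo "$CMD"
```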
=== Metadata Pipeline ===
NOTE: You do not always have to create the metadata yourself. In principle, you can use the public metadata generated for the 1000G samples, on the assumption that those samples share similar characteristics with yours. If you have enough computing resources, however, the best practice is to create metadata specifically for your sequence data.
 
Command line to run the metadata step:
 gotcloud genomestrip --run-metadata --conf gotcloud.conf --outdir outputDirectory --numjobs 10
Timing:
=== Discovery Pipeline ===
The discovery pipeline performs variant discovery across all samples as well as variant filtering based on expert knowledge.
 
 gotcloud genomestrip --run-discovery --conf gotcloud.conf --outdir outputDirectory --numjobs 10
Timing:
=== Genotyping Pipeline ===
The genotyping pipeline iterates over the discovered variants across the samples, calculating the genotype likelihood for each possible genotype.
 gotcloud genomestrip --run-genotype --conf gotcloud.conf --outdir outputDirectory --numjobs 10
Timing:
* 10 BAMs, chr 21 and 22: 4 mins with 10 jobs
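The three pipelines can also be chained from a small wrapper script. The sketch below assumes that the steps run in the order metadata, discovery, genotyping, and that <code>gotcloud</code> returns a non-zero exit status on failure; neither is stated on this page. The <code>DRY_RUN</code> variable and the <code>run_step</code> function are hypothetical helpers introduced for illustration. By default the script only prints the commands.

```shell
#!/bin/sh
# Assumed values, taken from the example commands on this page.
CONF=gotcloud.conf
OUTDIR=outputDirectory
NUMJOBS=10
# DRY_RUN=1 (the default here) only prints each command; set DRY_RUN=0 to execute.
DRY_RUN=${DRY_RUN:-1}

run_step() {
    # $1 is the step name: metadata, discovery, or genotype
    set -- gotcloud genomestrip "--run-$1" --conf "$CONF" --outdir "$OUTDIR" --numjobs "$NUMJOBS"
    if [ "$DRY_RUN" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Assumed order: metadata, then discovery, then genotyping.
# Stop at the first step that fails.
for step in metadata discovery genotype; do
    run_step "$step" || { echo "step $step failed" >&2; exit 1; }
done
```

To execute the steps rather than print them, run the script with <code>DRY_RUN=0</code> set in the environment.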
