Changes

GotCloud: GenomeSTRiP Pipeline (view source)

Revision as of 15:03, 10 February 2015

700 bytes added, 15:03, 10 February 2015

→‎Running GotCloud/GenomeSTRiP

Line 52: Line 52:

== Running GotCloud/GenomeSTRiP ==

+

The general command line for running GenomeSTRiP via GotCloud is:

+

gotcloud genomestrip --run-<step> --conf <gotcloud.conf> --outdir <outputDirectory> --numjobs <#>

+

Where:

+

* <code>--run-<step></code> indicates which pipeline to run. Options are:

+

** <code>--run-metadata</code> - [[#Metadata Pipeline|Metadata Pipeline]]

+

** <code>--run-discovery</code> - [[#Discovery Pipeline|Discovery Pipeline]]

+

** <code>--run-genotype</code> - [[#Genotyping Pipeline|Genotyping Pipeline]]

+

** <code>--run-thirdparty</code> - [[#3rd-party Site Genotyping/Filtering Pipeline|3rd-party Site Genotyping/Filtering Pipeline]]

+

* <code>--conf <gotcloud.conf></code> - points to the configuration file to use

+

* <code>--outdir <outputDirectory></code> - tells GotCloud where to write the output

+

* <code>--numjobs <#></code> - number of jobs to run in parallel

+

Optional Parameters:

+

* <code>--metadata <metadataDirectory></code> - points to a directory containing pre-made metadata files

+

** Only required if skipping the <code>--run-metadata</code> step.
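As a minimal sketch of how the parameters above fit together, the following script builds the command line for each step and, by default, only prints it. The variable names <code>CONF</code>, <code>OUTDIR</code>, <code>NUMJOBS</code>, and <code>DRY_RUN</code>, the helper functions, and the metadata, discovery, genotype ordering are assumptions made for this example and are not part of GotCloud; the values mirror the placeholder commands used elsewhere on this page.

```shell
#!/bin/sh
# Sketch only: CONF, OUTDIR, NUMJOBS and DRY_RUN are hypothetical names for this example.
CONF="${CONF:-gotcloud.conf}"
OUTDIR="${OUTDIR:-outputDirectory}"
NUMJOBS="${NUMJOBS:-10}"
DRY_RUN="${DRY_RUN:-1}"   # 1 = only print the commands, 0 = execute them

# Print the gotcloud command line for one step
# (metadata, discovery, genotype, or thirdparty).
genomestrip_cmd() {
    echo "gotcloud genomestrip --run-$1 --conf $CONF --outdir $OUTDIR --numjobs $NUMJOBS"
}

# Print (or run) the given steps in order, stopping at the first failure.
# Note: the command is word-split on whitespace, so paths containing spaces
# are not supported by this sketch.
run_steps() {
    for step in "$@"; do
        cmd=$(genomestrip_cmd "$step")
        if [ "$DRY_RUN" = "1" ]; then
            echo "$cmd"
        else
            $cmd || { echo "step '$step' failed" >&2; return 1; }
        fi
    done
}

# Assumed order: metadata first, then discovery, then genotyping.
run_steps metadata discovery genotype
```

With the defaults, the script prints three command lines without running anything; setting <code>DRY_RUN=0</code> executes them one after another.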

=== Metadata Pipeline ===

Line 59: Line 74:

NOTE: You do not always have to create the metadata yourself. In principle, you can use the public metadata generated for the 1000G samples, assuming your samples share similar characteristics with them. If you have enough computing resources, however, the best practice is to create metadata specifically for your sequence data.
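When reusing pre-made metadata, the <code>--run-metadata</code> step is skipped and the later steps are pointed at the existing directory with <code>--metadata</code>. The sketch below shows one way this might look for the discovery step; <code>METADATA_DIR</code>, its default path, and the helper function are hypothetical, and the other values are the same placeholders used elsewhere on this page.

```shell
#!/bin/sh
# Sketch only: METADATA_DIR is a hypothetical variable; point it at your own
# copy of the pre-made metadata.
METADATA_DIR="${METADATA_DIR:-/path/to/metadataDirectory}"

# Print the discovery command that reuses existing metadata via --metadata.
discovery_with_metadata() {
    echo "gotcloud genomestrip --run-discovery --conf gotcloud.conf --outdir outputDirectory --numjobs 10 --metadata $1"
}

# Only print the command if the metadata directory actually exists.
if [ -d "$METADATA_DIR" ]; then
    discovery_with_metadata "$METADATA_DIR"
else
    echo "metadata directory not found: $METADATA_DIR" >&2
fi
```

Checking for the directory first avoids launching a long-running step that would fail on a mistyped path.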

−

~~Command-line to run the metadata step:~~

−

~~gotcloud genomestrip --run-metadata --conf gotcloud.conf --outdir outputDirectory --numjobs 10~~

Timing:

Line 70: Line 82:

=== Discovery Pipeline ===

The discovery pipeline performs variant discovery across all samples as well as variant filtering based on expert knowledge.

−

~~gotcloud genomestrip --run-discovery --conf gotcloud.conf --outdir outputDirectory --numjobs 10~~

Timing:

Line 79: Line 89:

The genotyping pipeline iterates over the discovered variants across the samples, calculating the genotype likelihood for each possible genotype.

−

~~gotcloud genomestrip --run-genotype --conf gotcloud.conf --outdir outputDirectory --numjobs 10~~

Timing:

* 10 BAMs, chr 21 and 22: 4 mins with 10 jobs

Mktrost

Administrators

3,045

edits
