Changes

From Genome Analysis Wiki
no edit summary
Line 5: Line 5:
   −
= Running the GotCloud Variant Calling Pipeline =
+
== Running the GotCloud Variant Calling Pipeline ==
    
The variant calling pipeline (umake) is run using <code>gotcloud snpcall</code> and <code>gotcloud ldrefine</code>.
   −
==Running the Automatic Test==
+
===Running the Automatic Test===
    
The automatic test runs the variant calling pipeline on a small test set and checks the results against the expected results, validating that GotCloud is installed correctly.
Line 22: Line 22:  
** If you see <code>Successfully ran the test case, congratulations!</code>, then you are ready to run ldrefine on your own samples.
   −
= Overview of Variant Calling Pipeline Steps =
+
== Overview of Variant Calling Pipeline Steps ==
 
Here is an overview of the Variant Calling Pipeline:
   Line 30: Line 30:  
For more information on the filters applied during the Variant Calling Pipeline, see [[GotCloud: Filters]].
   −
= Input Data=
+
== Input Data==
 
* [[#BAM Files|Aligned/Processed/Recalibrated BAM files]]
* [[#BAM List File|BAM list file containing Sample IDs & BAM file names]]
Line 36: Line 36:  
* (Optional) [[#Configuration File|Configuration file to override default options]]
   −
== BAM Files ==
+
=== BAM Files ===
 
The BAM files need to be duplicate-marked and base-quality recalibrated in order to obtain high-quality SNP calls. Generating these BAM files from the original FASTQs is done automatically as part of the [[Alignment Pipeline]] of GotCloud.
   −
== BAM List File ==
+
=== BAM List File ===
 
* Automatically created when running the GotCloud [[Alignment Pipeline]]
* Each line of the BAM list file represents a single individual
Line 62: Line 62:  
*** If all samples are from the same population, the population label can be skipped, or you can specify <code>ALL</code> as the population label for each sample.
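
As an illustration of the BAM list layout described above, here is a minimal Python sketch that reads one line per individual. It assumes tab-separated fields with the sample ID first, an optional population label second, and one or more BAM paths after that. The rule used to detect a skipped label (the second field ends in <code>.bam</code>) and the function name <code>parse_bam_list_line</code> are assumptions made for this example, not GotCloud's own logic.

```python
def parse_bam_list_line(line):
    """Split one BAM list line into (sample_id, population, bam_paths).

    Assumed layout (tab-separated): SAMPLE_ID [POPULATION] BAM [BAM ...].
    A skipped population label is reported as "ALL".
    """
    fields = [f.strip() for f in line.rstrip("\n").split("\t") if f.strip()]
    if len(fields) < 2:
        raise ValueError("expected a sample ID and at least one BAM file")
    sample_id, rest = fields[0], fields[1:]
    # Assumption: a second field ending in ".bam" means the label was skipped.
    if rest[0].lower().endswith(".bam"):
        population = "ALL"
    else:
        population, rest = rest[0], rest[1:]
    if not rest:
        raise ValueError("no BAM file listed for sample " + sample_id)
    return sample_id, population, rest
```

Treating a skipped label as <code>ALL</code> mirrors the note above that the two forms are interchangeable when all samples come from one population.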
   −
== Reference Files ==
+
=== Reference Files ===
 
See [[GotCloud: Genetic Reference and Resource Files]] for detailed information about the multiple required reference files for the variant calling pipeline, including:
* How to obtain default references
Line 76: Line 76:  
* [[GotCloud: Genetic Reference and Resource Files#INDEL VCF File(s)|INDEL VCF File(s)]]
   −
== Configuration File ==
+
=== Configuration File ===
 
{{:GotCloud: Configuration}}
   −
===Additional Required User Config Files Settings===
+
====Additional Required User Config Files Settings====
 
The following Config File Settings must be specified by the user:
* CHRS = space separated list of chromosomes you want
* BAM_INDEX = path to the Index File of BAMs
   −
===Targeted/Exome Sequencing Settings===
+
====Targeted/Exome Sequencing Settings====
 
If you are running Targeted/Exome Sequencing, you should specify:
* Write loci file when performing pileup
Line 105: Line 105:  
** OFFSET_OFF_TARGET = 50
   −
=== Chromosome X Calling ===
+
==== Chromosome X Calling ====
 
Making calls on the X chromosome requires the user to specify a PED file with sex information.
* PED_INDEX = pedfile.ped
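
The requirement above can be checked before a run. The following Python sketch is only an illustration: it assumes the settings have already been loaded into a plain dictionary of strings and that the X chromosome is written as <code>X</code> in <code>CHRS</code>. The function name <code>check_chrx_settings</code> is hypothetical.

```python
def check_chrx_settings(settings):
    """Return a list of problems that would block chromosome X calling.

    `settings` is a hypothetical dict mapping config keys to string values.
    """
    problems = []
    chromosomes = settings.get("CHRS", "").split()
    # Assumption: the X chromosome is named "X" in the CHRS list.
    if "X" in chromosomes and not settings.get("PED_INDEX", "").strip():
        problems.append("CHRS includes X but PED_INDEX is not set")
    return problems
```

An empty list means no problem was found by this check. It does not verify that the PED file exists or that it contains sex information.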
   −
== Example Configuration File ==
+
=== Example Configuration File ===
 
Example configuration file where the reference files happen to be stored in /path/reference, and the BAM index file in /path/freeze5
  CHRS = 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22
Line 119: Line 119:  
  DBSNP_VCF = /path/reference/dbsnp_135.b37.sites.vcf.gz  ### dbSNP variants (requires tabix index file in same directory)
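
To make the <code>KEY = VALUE</code> layout of the example concrete, here is a small Python sketch that reads such text and reports which of the user-required settings (<code>CHRS</code> and <code>BAM_INDEX</code>) are missing. It is not GotCloud's parser: treating everything after the first <code>#</code> as a comment is an assumption based on the <code>###</code> annotations in the example, and the names <code>parse_config</code> and <code>missing_required</code> are hypothetical.

```python
REQUIRED_KEYS = ("CHRS", "BAM_INDEX")  # settings the user must supply


def parse_config(text):
    """Parse KEY = VALUE lines into a dict, ignoring comments and blanks."""
    settings = {}
    for raw in text.splitlines():
        # Assumption: "#" starts a comment that runs to the end of the line.
        line = raw.split("#", 1)[0].strip()
        if not line or "=" not in line:
            continue
        key, value = line.split("=", 1)
        settings[key.strip()] = value.strip()
    return settings


def missing_required(settings):
    """Return the required keys that are absent or empty."""
    return [key for key in REQUIRED_KEYS if not settings.get(key)]
```

For example, <code>missing_required(parse_config("CHRS = 20"))</code> returns <code>["BAM_INDEX"]</code>. A path that itself contains <code>#</code> would be cut short by this sketch.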
   −
= Running =
+
== Running ==
    
Running variant calling is straightforward:
Line 134: Line 134:
   −
== Running on a Cluster ==
+
=== Running on a Cluster ===
 
To run on a cluster, the following settings need to be added to the configuration file:
  BATCH_TYPE = batch_type
Line 149: Line 149:     
Here's the same configuration file we used above but now made to run on a cluster computer with MOSIX.
  == Example Configuration File ==
+
  === Example Configuration File ===
 
  CHRS = 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22
  BAM_INDEX = /path/freeze5/freeze5.bam.index
Line 160: Line 160:  
  BATCH_OPTS = -j10,11,12,13    ### Specify available MOSIX compute nodes
   −
= Results =
+
== Results ==
    
If there is a failure, you should see a message like:
