Difference between revisions of "Thunder"

From Genome Analysis Wiki

Revision as of 15:29, 15 October 2010

This page documents how to perform variant calling from low-coverage sequencing data using glfmultiples and thunder. The pipeline was originally developed by Yun Li for the 1000 Genomes Low Coverage Pilot Project.

Input Data

To get started, you will need glf files in the standard glf format. Sample glf files are available.

If you do not have glf files, you can generate them from bam files (see the bam format specification) using the following command line:

 samtools pileup -g -T 1 -f ref.fa my.bam > my.glf

Note: you will need the reference fasta file ref.fa to create a glf file from a bam file.
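With many samples, the conversion above is usually repeated once per bam file. The following is a minimal sketch, not part of the original pipeline: the helper names glf_name and print_glf_commands are hypothetical, and it assumes every input ends in .bam and that all samples share the reference ref.fa. It only prints the commands so they can be inspected before anything is run.

```shell
#!/bin/sh
# Sketch: print one "samtools pileup" command per bam file.
# Assumption: all samples use the same reference fasta file.
REF=ref.fa

# Map a bam file name to the glf file name written next to it.
glf_name() {
    printf '%s\n' "${1%.bam}.glf"
}

# Print the conversion command for each bam file given as an argument.
# The flags are copied unchanged from the command line shown above.
print_glf_commands() {
    for bam in "$@"; do
        printf 'samtools pileup -g -T 1 -f %s %s > %s\n' \
            "$REF" "$bam" "$(glf_name "$bam")"
    done
}
```

Calling print_glf_commands *.bam lists the commands; piping that output to sh would execute them. File names containing spaces are not handled by this sketch.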

How to Run

This variant calling pipeline has two steps: (step 1) promotion of a set of potential polymorphisms, and (step 2) genotype/haplotype calling using linkage disequilibrium (LD) information.

(step 1) Site promotion using the glfMultiples software (GPT_Freq).

 GPT_Freq -b my.out -p 0.9 --minDepth 10 --maxDepth 1000 *.glf 

(step 2) Genotype/haplotype calling using thunder (thunder_glf_freq, source available at https://www.sph.umich.edu/csg/yli/thunder.V009.source.tgz).

 thunder_glf_freq --shotgun my.out.$chr -r 100 --states 200 --dosage --phase --interim 25 -o my.final.out
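The $chr in the step 2 command suggests it is run once per chromosome. A minimal sketch of that loop follows, under stated assumptions: step 1 has written one file my.out.<chr> per chromosome, only autosomes 1 to 22 are processed, and the per-chromosome output prefix my.final.out.<chr> is a hypothetical naming choice (the command above writes to my.final.out). The helper names are also hypothetical, and the commands are printed rather than executed.

```shell
#!/bin/sh
# Sketch: print the step 2 command for one chromosome.
# The thunder_glf_freq options are copied unchanged from the command above;
# only the output prefix gains a chromosome suffix (an assumption).
thunder_command() {
    printf 'thunder_glf_freq --shotgun my.out.%s -r 100 --states 200 --dosage --phase --interim 25 -o my.final.out.%s\n' \
        "$1" "$1"
}

# Print the command for each autosome, 1 through 22 (an assumption).
print_thunder_commands() {
    c=1
    while [ "$c" -le 22 ]; do
        thunder_command "$c"
        c=$((c + 1))
    done
}
```

Running print_thunder_commands shows all 22 command lines; they can then be submitted one by one or piped to sh.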

Note: The program thunder used in step 2 is an extension of MaCH, the genotype imputation software we developed previously. For details on the shared options, see the MaCH website (http://www.sph.umich.edu/csg/yli/mach/index.html) and the MaCH wiki (http://genome.sph.umich.edu/wiki/Mach).

Important Filters