This page documents how to perform variant calling from low-coverage sequencing data using glfMultiples and thunder. The pipeline was originally developed by Yun Li for the 1000 Genomes Low Coverage Pilot Project.
To get started, you will need GLF files in the standard GLF format.
If you do not have GLF files, you can generate them from BAM files (which follow the standard BAM format) using the following command line:
samtools pileup -g -T 1 -f ref.fa my.bam > my.glf
Note: you will need the reference FASTA file ref.fa to create a GLF file from a BAM file.
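When there are several BAM files, the same command can be repeated once per file. The following is a minimal sketch, not part of the original pipeline. It assumes that the BAM files sit in the current directory, that the reference is named ref.fa as above, and that each GLF file should be named after its input. The glf_name helper is hypothetical. The loop only prints the commands so they can be reviewed before anything is run.

```shell
#!/bin/sh
# Sketch: print one "samtools pileup" command per BAM file in the current directory.
# Assumptions (not from the original page): ref.fa is the reference FASTA,
# and each output file is named after its input (my.bam -> my.glf).

# glf_name FILE: hypothetical helper that swaps a trailing .bam for .glf
glf_name() {
    printf '%s\n' "${1%.bam}.glf"
}

for bam in *.bam; do
    # If no BAM files match, the pattern stays literal; skip it.
    if [ -e "$bam" ]; then
        echo "samtools pileup -g -T 1 -f ref.fa $bam > $(glf_name "$bam")"
    fi
done
```

Piping the printed lines into sh would execute them. File names containing spaces would need extra quoting.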
How to Run
This variant calling pipeline has two steps: (step 1) promotion of a set of potential polymorphisms, and (step 2) genotype/haplotype calling using linkage disequilibrium (LD) information.
(step 1) Site promotion using the software glfMultiples (GPT_Freq):
GPT_Freq -b my.out -p 0.9 --minDepth 10 --maxDepth 1000 --sitedepth *.glf
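The command above relies on the shell expanding *.glf into the list of input files. If nothing matches, most shells pass the literal pattern through unchanged. A small pre-flight check can catch this. The sketch below is an illustration only: the count_glf helper is hypothetical, and it assumes the GLF files are in the current directory.

```shell
#!/bin/sh
# Sketch: confirm that *.glf matches at least one file before running step 1.
# count_glf is a hypothetical helper, not part of glfMultiples or thunder.

# count_glf DIR: print the number of files in DIR whose names end in .glf
count_glf() {
    n=0
    for f in "$1"/*.glf; do
        # An unmatched pattern stays literal and does not exist as a file.
        if [ -e "$f" ]; then
            n=$((n + 1))
        fi
    done
    echo "$n"
}

found=$(count_glf .)
if [ "$found" -eq 0 ]; then
    echo "no .glf files found in the current directory" >&2
else
    echo "found $found .glf file(s)"
fi
```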