UMAKE-glfSingle

From Genome Analysis Wiki
Revision as of 15:49, 15 January 2014 by Yancylo

This is a modification of UMAKE to incorporate an individual-based variant caller in the pipeline.

The idea is to run glfSingle after pileup to generate a sample-specific VCF for each sample, and then to replace the glfMultiples step with a merging step. The merge produces a population VCF in the same format as the glfMultiples output, so the subsequent filtering and imputation steps can follow as usual.
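The merging idea can be illustrated with a minimal Python sketch. This is not the code of merge_glfS_vcf.py; the function name, the tuple layout of the input records, and the choice to report a sample with no record at a site as missing (`./.`) are all assumptions made for illustration.

```python
def merge_single_sample_vcfs(sample_records):
    """Combine per-sample site records into population-level rows.

    sample_records: dict mapping sample ID -> list of
        (chrom, pos, ref, alt, genotype) tuples read from that
        sample's single-sample VCF (hypothetical in-memory layout).
    Returns (samples, rows), where each row is
        (chrom, pos, ref, alt, [one genotype per sample]).
    """
    samples = list(sample_records)
    sites = {}
    for idx, sample in enumerate(samples):
        for chrom, pos, ref, alt, genotype in sample_records[sample]:
            key = (chrom, pos, ref, alt)
            # Assumption: samples without a record at this site are
            # written as missing genotypes.
            row = sites.setdefault(key, ["./."] * len(samples))
            row[idx] = genotype
    # Sort by chromosome, then position, so rows come out in VCF order
    # within a chromosome.
    rows = [(c, p, r, a, gts) for (c, p, r, a), gts in sorted(sites.items())]
    return samples, rows


records = {
    "S1": [("20", 100, "A", "G", "0/1")],
    "S2": [("20", 100, "A", "G", "1/1"), ("20", 50, "C", "T", "0/1")],
}
samples, rows = merge_single_sample_vcfs(records)
for chrom, pos, ref, alt, genotypes in rows:
    print(chrom, pos, ref, alt, *genotypes, sep="\t")
```

Each output row carries one genotype column per sample, which is the shape a multi-sample VCF needs for the downstream steps.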

Ingredients:

  • Index file - same as original UMAKE index. An example is at /net/wonderland/home/yancylo/bin/umake-glfSingle/umake-glfSingle.index
  • Configuration file - same as original UMAKE conf. An example is at /net/wonderland/home/yancylo/bin/umake-glfSingle/umake-glfSingle.conf . Note that the glfSingle and merging steps have no settings of their own; they run as part of these two existing steps:
 RUN_PILEUP = TRUE       # create GLF file from BAM then individual VCF using glfSingle
 RUN_GLFMULTIPLES = TRUE # create unfiltered SNP calls, population VCF by merging the glfSingle outputs
  • Perl script for generating Makefile - /net/wonderland/home/yancylo/bin/umake-glfSingle/umake-glfSingle.pl . It is modified from umake.pl:
    • It calls glfSingle and merge_glfS_vcf.py (which merges the single-sample VCFs) from /net/wonderland/home/yancylo/bin/umake-glfSingle
    • To generate the Makefile corresponding to this new pipeline flow, run:
 perl /net/wonderland/home/yancylo/bin/umake-glfSingle/umake-glfSingle.pl --conf /net/wonderland/home/yancylo/bin/umake-glfSingle/umake-glfSingle.conf
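Before generating the Makefile, it can help to confirm that both steps carrying the glfSingle and merging work are switched on. The sketch below is a minimal check in Python, assuming the configuration file uses plain `KEY = VALUE # comment` lines as in the two settings shown above; the function names are hypothetical, and a real UMAKE configuration may contain syntax this simple parser does not handle.

```python
def parse_conf(text):
    """Parse 'KEY = VALUE  # comment' lines into a dict of strings."""
    settings = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop trailing comments
        if "=" not in line:
            continue  # skip blank lines and anything that is not a setting
        key, value = line.split("=", 1)
        settings[key.strip()] = value.strip()
    return settings


def glfsingle_steps_enabled(settings):
    """True if both steps that include glfSingle and merging are TRUE."""
    required = ("RUN_PILEUP", "RUN_GLFMULTIPLES")
    return all(settings.get(key, "").upper() == "TRUE" for key in required)


example_conf = """
RUN_PILEUP = TRUE       # create GLF file from BAM then individual VCF using glfSingle
RUN_GLFMULTIPLES = TRUE # create unfiltered SNP calls, population VCF by merging the glfSingle outputs
"""
print(glfsingle_steps_enabled(parse_conf(example_conf)))
```

In practice the text would be read from the configuration file passed to `--conf` rather than from an inline string.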

Customization:

  • To change the paths to glfSingle and merge_glfS_vcf.py, edit the following lines of umake-glfSingle.pl:
    • line 1000 - my $cmd = "python [your-path]/merge_glfS_vcf.py --file-list $glfAlias --chr $chr --outfile $vcf > $vcf.log";
    • line 1073 - $cmd .= "\n\t".&getMosixCmd("[your-path]/glfSingle -g $smGlf -b $smVcf -l $allSMs[$i] --minMapQuality 0 --minDepth 1 --maxDepth 100000 --reference > $smVcf.log");
  • To apply the uniform Ts/Tv model in glfSingle, edit the following line of umake-glfSingle.pl so that it calls the glfSingle_ut binary and passes the --uniformTsTv flag:
    • line 1073 - $cmd .= "\n\t".&getMosixCmd("/net/wonderland/home/yancylo/bin/umake-glfSingle/glfSingle_ut -g $smGlf -b $smVcf -l $allSMs[$i] --minMapQuality 0 --minDepth 1 --maxDepth 100000 --reference --uniformTsTv > $smVcf.log");
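The two variants of the line 1073 command differ only in the binary and in the trailing `--uniformTsTv` flag. The Python sketch below makes that difference explicit by assembling the argument list; the function and parameter names are hypothetical, only the flags shown in the commands above are used, and the redirection to the log file is left to the caller.

```python
def glfsingle_args(binary, sm_glf, sm_vcf, sample_id, uniform_tstv=False):
    """Build the glfSingle argument list used for one sample.

    binary:       path to glfSingle (or glfSingle_ut for the uniform model)
    sm_glf:       the sample's GLF file ($smGlf in umake-glfSingle.pl)
    sm_vcf:       the sample's output VCF ($smVcf)
    sample_id:    the sample label ($allSMs[$i])
    uniform_tstv: append --uniformTsTv, as in the glfSingle_ut command
    """
    args = [
        binary,
        "-g", sm_glf,
        "-b", sm_vcf,
        "-l", sample_id,
        "--minMapQuality", "0",
        "--minDepth", "1",
        "--maxDepth", "100000",
        "--reference",
    ]
    if uniform_tstv:
        args.append("--uniformTsTv")
    return args


# Placeholder file names, for illustration only.
print(" ".join(glfsingle_args("[your-path]/glfSingle", "sample.glf", "sample.vcf", "SAMPLE1")))
print(" ".join(glfsingle_args("[your-path]/glfSingle_ut", "sample.glf", "sample.vcf", "SAMPLE1",
                              uniform_tstv=True)))
```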