Changes

From Genome Analysis Wiki
489 bytes added, 16:33, 12 March 2013
Line 16: Line 16:  
=== Discovery ===
   −
Discovery is performed at per sample level, the evidence sites lists for each sample is then merged and site discovery statistics are computed.
+
Discovery is performed at the per-sample level; the evidence site lists for each sample are then merged and site discovery statistics are computed.
The user then makes a decision on cut offs to make to create an initial site list.
+
The user then decides on the cutoffs used to create an initial candidate site list.
    
Generates site list with info fields E and N.
Line 23: Line 23:  
  vt discover -i NA12878.bam -o NA12878.sites.vcf -g hs37d5.fa
   −
Left align variants.
+
Left align variants.  This is required because left alignment of insertions and/or deletions within a read is sometimes insufficient to ensure complete left alignment.
    
  vt left_align -i NA12878.bam -o NA12878.leftaligned.sites.vcf -g hs37d5.fa
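The two per-sample steps above can be repeated for each sample with a small loop. The following is only a sketch: the function name and the sample list are hypothetical, the file naming (<sample>.bam, <sample>.sites.vcf) is assumed from the examples above, and the commands are printed as a dry run rather than executed.

```shell
#!/bin/sh
# Hypothetical helper: print the discovery and left-alignment commands
# for each sample. The first argument is the reference FASTA; the
# remaining arguments are sample names.
# Only the options shown in the examples above (-i, -o, -g) are used.
per_sample_commands() {
    ref=$1
    shift
    for sample in "$@"; do
        echo "vt discover -i ${sample}.bam -o ${sample}.sites.vcf -g ${ref}"
        echo "vt left_align -i ${sample}.bam -o ${sample}.leftaligned.sites.vcf -g ${ref}"
    done
}

# Dry run for three samples (sample names are illustrative).
per_sample_commands hs37d5.fa NA12878 NA12879 NA12880
```

Printing the commands first makes it easy to review them, or to paste them into a make file or job scheduler, before running anything.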
   −
Evidence site lists are combined across samples and split by sites.
+
Evidence site lists are combined across samples and split by sites to allow for parallelization.
   −
  vt merge_and_split_sample_vcf -i NA12878.sites.vcf,NA12879.sites.vcf,NA12880.sites.vcf -o 1-1000000.sites.vcf
+
  vt merge_and_split_sample_vcf -i NA12878.sites.vcf,NA12879.sites.vcf,NA12880.sites.vcf -l 5000
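With many samples, the comma-separated -i list is tedious to type by hand. A minimal sketch of building it from sample names follows; the helper name is hypothetical, the <sample>.sites.vcf naming is assumed from the example above, and the -l 5000 value is simply copied from that example. The command is printed as a dry run rather than executed.

```shell
#!/bin/sh
# Hypothetical helper: join per-sample site list file names with commas.
# Assumes each sample's site list is named <sample>.sites.vcf.
join_site_lists() {
    list=""
    for sample in "$@"; do
        # Prepend "<list>," only when the list is already non-empty.
        list="${list:+$list,}${sample}.sites.vcf"
    done
    printf '%s\n' "$list"
}

inputs=$(join_site_lists NA12878 NA12879 NA12880)
# Dry run: print the command instead of running it.
echo "vt merge_and_split_sample_vcf -i $inputs -l 5000"
```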
   −
Discovery statistics are computed.
+
Discovery statistics are computed.  These statistics help you choose a cutoff for creating a suitable candidate site list.
    
  vt compute_discovery_stats -i 1-1000000.sites.vcf -o 1-1000000.annotated.sites.vcf  
 +
 +
Merge site lists.
 +
 +
  vt merge -i 1000000.sites.vcf,2000000.sites.vcf,3000000.sites.vcf -o candidate.sites.vcf
 +
 +
Plot charts to help with candidate list selection criteria.
 +
 +
  vt plot_discovery -i candidate.sites.vcf
       
A calling pipeline implemented in a make file is available here.
+
 
 
=== Genotyping ===
  