From Genome Analysis Wiki
489 bytes added, 16:33, 12 March 2013
Line 16: |
Line 16: |
| === Discovery === | | === Discovery === |
| | | |
− | Discovery is performed at per sample level, the evidence sites lists for each sample is then merged and site discovery statistics are computed. | + | Discovery is performed at the per-sample level; the evidence site lists for each sample are then merged and site discovery statistics are computed. |
− | The user then makes a decision on cut offs to make to create an initial site list. | + | The user then decides on the cutoffs used to create an initial candidate site list. |
| | | |
| Generates site list with info fields E and N. | | Generates site list with info fields E and N. |
Line 23: |
Line 23: |
| vt discover -i NA12878.bam -o NA12878.sites.vcf -g hs37d5.fa | | vt discover -i NA12878.bam -o NA12878.sites.vcf -g hs37d5.fa |
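Because discovery runs once per sample, the command above is typically repeated for every BAM file. The following is a minimal dry-run sketch that only prints the commands; the helper name `discover_cmds` is hypothetical, and the sample IDs and reference file are simply those used on this page.

```shell
# Hypothetical helper: print one "vt discover" command per sample ID.
# Nothing is executed; the output can be reviewed or piped to a shell.
discover_cmds() {
  for sample in "$@"; do
    echo "vt discover -i ${sample}.bam -o ${sample}.sites.vcf -g hs37d5.fa"
  done
}

discover_cmds NA12878 NA12879 NA12880
```

Printing the commands first makes it easy to check the file naming before any job is launched.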
| | | |
− | Left align variants. | + | Left align variants. This is required as left alignment of insertions and/or deletions within a read is sometimes insufficient to ensure complete left alignment. |
| | | |
| vt left_align -i NA12878.bam -o NA12878.leftaligned.sites.vcf -g hs37d5.fa | | vt left_align -i NA12878.bam -o NA12878.leftaligned.sites.vcf -g hs37d5.fa |
| | | |
− | Evidence site lists are combined across samples and split by sites. | + | Evidence site lists are combined across samples and split by sites to allow for parallelization. |
| | | |
− | vt merge_and_split_sample_vcf -i NA12878.sites.vcf,NA12879.sites.vcf,NA12880.sites.vcf -o 1-1000000.sites.vcf | + | vt merge_and_split_sample_vcf -i NA12878.sites.vcf,NA12879.sites.vcf,NA12880.sites.vcf -l 5000 |
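With many samples, the comma-separated input list is tedious to type by hand. A small sketch of building it from sample IDs follows; `site_list` is a hypothetical helper, and the `-l 5000` value is copied unchanged from the command above rather than being a recommendation.

```shell
# Hypothetical helper: join per-sample site files into a comma-separated list.
site_list() {
  list=""
  for sample in "$@"; do
    # Append a comma only when the list already has an entry.
    list="${list:+$list,}${sample}.sites.vcf"
  done
  echo "$list"
}

# Dry run: print the merge-and-split command instead of executing it.
echo "vt merge_and_split_sample_vcf -i $(site_list NA12878 NA12879 NA12880) -l 5000"
```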
| | | |
− | Discovery statistics are computed. | + | Discovery statistics are computed. These statistics allow you to choose a suitable cutoff for creating a candidate site list. |
| | | |
| vt compute_discovery_stats -i 1-1000000.sites.vcf -o 1-1000000.annotated.sites.vcf | | vt compute_discovery_stats -i 1-1000000.sites.vcf -o 1-1000000.annotated.sites.vcf |
| + | |
| + | Merge site lists. |
| + | |
| + | vt merge -i 1000000.sites.vcf,2000000.sites.vcf,3000000.sites.vcf -o candidate.sites.vcf |
| + | |
| + | Plot charts to help with candidate list selection criteria. |
| + | |
| + | vt plot_discovery -i candidate.sites.vcf |
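Once a cutoff has been chosen, it has to be applied to the site list. This page does not say how, so the following is only one possible sketch using awk. It assumes the INFO column (column 8) carries an integer entry of the form `E=<n>`; the function name `filter_by_e` and the threshold are hypothetical.

```shell
# Hypothetical helper: keep header lines and records whose INFO entry E
# is at least the given minimum. Assumes INFO contains "E=<integer>".
filter_by_e() {
  min_e="$1"
  awk -F'\t' -v min_e="$min_e" '
    /^#/ { print; next }            # pass VCF header lines through unchanged
    {
      n = split($8, kv, ";")        # INFO entries are separated by semicolons
      for (i = 1; i <= n; i++) {
        if (kv[i] ~ /^E=/) {
          split(kv[i], pair, "=")
          if (pair[2] + 0 >= min_e) print
          break
        }
      }
    }'
}
```

For example, `filter_by_e 2 < candidate.sites.vcf > filtered.sites.vcf` would keep only sites with E of at least 2; records without an E entry are dropped.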
| | | |
| | | |
| A calling pipeline implemented in a make file is available here. | | A calling pipeline implemented in a make file is available here. |
− |
| + | |
| === Genotyping === | | === Genotyping === |
| | | |