From Genome Analysis Wiki
2,340 bytes added
, 11:31, 26 September 2014
Line 198: |
| | #normalize variants, send to standard out and remove duplicates. |
| | vt normalize dbsnp.vcf -r seq.fa | vt mergedups - -o dbsnp.normalized.merged.vcf |
| + | |
| + | #variants that are normalized will be annotated with an OLD_VARIANT info tag. |
| + | #CHROM POS ID REF ALT QUAL FILTER INFO |
| + | 19 29238772 . C G . PASS VT=SNP;OLD_VARIANT=19:29238771:TC/TG |
| + | 20 60674709 . GCCCAGCCCCAC G . PASS VT=INDEL;OLD_VARIANT=20:60674718:CACCCCAGCCCC/C |
| + | |
| + | #sample output: counts of the normalization operations applied, |
| + | #grouped into 5 categories each for biallelic and multiallelic variants. <br> |
| + | stats: biallelic |
| + | no. left trimmed : 156908 |
| + | no. right trimmed : 323 |
| + | no. left and right trimmed : 33 |
| + | no. right trimmed and left aligned : 7 |
| + | no. left aligned : 12360 <br> |
| + | total no. biallelic normalized : 169631 <br> <br> |
| + | multiallelic |
| + | no. left trimmed : 627189 |
| + | no. right trimmed : 2509 |
| + | no. left and right trimmed : 1498 |
| + | no. right trimmed and left aligned : 212 |
| + | no. left aligned : 1783 <br> |
| + | total no. multiallelic normalized : 633191 <br> |
| + | total no. variants normalized : 802822 |
| + | total no. variants observed : 88052639 |
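The totals in the sample output above appear to be plain sums of the five per-category counts. A minimal Python check, using only the figures printed above (the variable names are illustrative and not part of vt), confirms that the numbers are consistent:

```python
# Counts copied from the sample `vt normalize` output above.
biallelic = {
    "left trimmed": 156908,
    "right trimmed": 323,
    "left and right trimmed": 33,
    "right trimmed and left aligned": 7,
    "left aligned": 12360,
}
multiallelic = {
    "left trimmed": 627189,
    "right trimmed": 2509,
    "left and right trimmed": 1498,
    "right trimmed and left aligned": 212,
    "left aligned": 1783,
}

# Each total is the sum of its five categories.
total_biallelic = sum(biallelic.values())
total_multiallelic = sum(multiallelic.values())
total_normalized = total_biallelic + total_multiallelic

# Share of all observed variants that needed normalization.
observed = 88052639
fraction = total_normalized / observed

print(total_biallelic, total_multiallelic, total_normalized)
print(f"{fraction:.2%} of observed variants were normalized")
```

In this sample, fewer than 1% of the observed variants were changed by normalization.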
| + | |
| + | <div class="mw-collapsible-content"> |
| + | usage : vt normalize [options] <in.vcf> |
| + | |
| + | options : -o output VCF file [-] |
| + | -I file containing list of intervals [] |
| + | -i intervals [] |
| + | -r reference sequence fasta file [] |
| + | -- ignores the rest of the labeled arguments following this flag |
| + | -h displays help |
| + | </div> |
| + | </div> |
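To illustrate how an OLD_VARIANT tag relates to the normalized record, here is a minimal Python sketch of the trimming step only. It is not vt's implementation: the function names `parse_old_variant` and `trim` are hypothetical, the tag is assumed to have the form CHROM:POS:REF/ALT with a single alternate allele, and left alignment (which needs the reference sequence) is deliberately left out.

```python
def parse_old_variant(tag):
    """Split 'CHROM:POS:REF/ALT' into (chrom, pos, ref, alt).

    Assumes a biallelic tag and a chromosome name without ':'.
    """
    chrom, pos, alleles = tag.split(":")
    ref, alt = alleles.split("/")
    return chrom, int(pos), ref, alt


def trim(pos, ref, alt):
    """Trim bases shared by REF and ALT, keeping at least one base in each."""
    # Right trim: drop identical trailing bases.
    while len(ref) > 1 and len(alt) > 1 and ref[-1] == alt[-1]:
        ref, alt = ref[:-1], alt[:-1]
    # Left trim: drop identical leading bases and advance the position.
    while len(ref) > 1 and len(alt) > 1 and ref[0] == alt[0]:
        ref, alt = ref[1:], alt[1:]
        pos += 1
    return pos, ref, alt


# First example record above: OLD_VARIANT=19:29238771:TC/TG
chrom, pos, ref, alt = parse_old_variant("19:29238771:TC/TG")
print(chrom, *trim(pos, ref, alt))
```

For the first example record this reproduces the normalized line (position 29238772, C to G). The second example record is not changed by trimming alone; its new position comes from left alignment against the reference, which this sketch does not attempt.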
| + | |
| + | === Decompose === |
| + | |
| + | <div> |
| + | Decompose multiallelic variants in a [http://www.1000genomes.org/wiki/analysis/variant-call-format/vcf-variant-call-format-version-42 VCF] file into biallelic variants, writing one record per alternate allele. |
| + | </div> |
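The operation this section is named for, splitting a multiallelic variant into biallelic variants, can be sketched in Python as follows. This is an illustration under stated assumptions rather than vt's code: the function name `decompose` and the example record are hypothetical, the input is a sites-only, tab-separated VCF data line, and per-allele INFO values and genotype columns (which a real tool has to split as well) are copied through untouched.

```python
def decompose(record):
    """Split one sites-only VCF data line into one line per ALT allele."""
    fields = record.rstrip("\n").split("\t")
    # Column 5 (index 4) holds the comma-separated ALT alleles.
    alts = fields[4].split(",")
    out = []
    for alt in alts:
        row = list(fields)  # copy so each output line is independent
        row[4] = alt        # keep a single alternate allele per line
        out.append("\t".join(row))
    return out


# Hypothetical multiallelic record: REF A with ALT alleles C and G.
for line in decompose("20\t100\t.\tA\tC,G\t.\tPASS\t.\n"):
    print(line)
```

A record that is already biallelic passes through as a single unchanged line.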
| + | |
| + | <div class=" mw-collapsible mw-collapsed"> |
| + | #decomposes multiallelic variants into biallelic variants and writes the output to gatk.decomposed.vcf |
| + | vt decompose gatk.vcf -o gatk.decomposed.vcf |
| | | |
| | #variants that are normalized will be annotated with an OLD_VARIANT info tag. |