Line 17: |
Line 17: |
| #change directory to vt | | #change directory to vt |
| 2. cd vt <br> | | 2. cd vt <br> |
| + | #update submodules |
| + | 3. git submodule update --init --recursive <br> |
| #run make, note that compilers need to support the c++0x standard | | #run make, note that compilers need to support the c++0x standard |
− | 3. make <br> | + | 4. make <br> |
| #you can test the build | | #you can test the build |
− | 4. make test | + | 5. make test |
| <div class=" mw-collapsible mw-collapsed"> | | <div class=" mw-collapsible mw-collapsed"> |
| An expected output when all is well for the tests is shown here. (click expand =>) | | An expected output when all is well for the tests is shown here. (click expand =>) |
Line 54: |
Line 56: |
| </div> | | </div> |
| | | |
− | Building has been tested on Linux and Mac systems on gcc 4.8.1 and clang 3.4. <br>
| + | === Mac === |
− | Some features of C++11 is used, thus there is a need for newer versions of gcc and clang.
| |
| | | |
− | == Mac ==
| + | You may install vt via homebrew. |
| | | |
− | You may also install vt on mac via homebrew.
| + | brew tap brewsci/bio |
| + | brew tap brewsci/science |
| + | |
| + | brew install brewsci/bio/vt |
| | | |
− | brew install homebrew/science/vt
| + | |
| + | Building has been tested on Linux and Mac systems on gcc 4.8.1 and clang 3.4. <br> |
| + | Some features of C++11 are used, thus there is a need for newer versions of gcc and clang. |
| | | |
| = Updating = | | = Updating = |
Line 458: |
Line 464: |
| There is now an additional option -a which decomposes non-block substitutions into their constituent SNPs and indels. (kindly added by [[https://github.com/holtgrewe holtgrewe@github]]) <br> | | There is now an additional option -a which decomposes non-block substitutions into their constituent SNPs and indels. (kindly added by [[https://github.com/holtgrewe holtgrewe@github]]) <br> |
| There is no exact solution and this decomposition is based on the best guess outcome using a Needleman-Wunsch algorithm. <br> | | There is no exact solution and this decomposition is based on the best guess outcome using a Needleman-Wunsch algorithm. <br> |
− | You might also want to check out [https://github.com/vcflib/vcflib#vcfallelicprimitives vcfallelicprimitives]. | + | You might also want to check out [https://github.com/vcflib/vcflib#vcfallelicprimitives vcfallelicprimitives]. <br> |
| + | <br> |
| + | There are now additional options -m and -d which ensure that some MNVs are not decomposed. (kindly added by [[https://github.com/jaudoux jaudoux@github]]) <br> |
| + | The motivation comes from:<br> |
| + | *Exome-wide assessment of the functional impact and pathogenicity of multi-nucleotide mutations https://www.biorxiv.org/content/10.1101/258723v2.full<br> |
| + | *Landscape of multi-nucleotide variants in 125,748 human exomes and 15,708 genomes https://www.biorxiv.org/content/10.1101/573378v2.full<br> |
| </div> | | </div> |
| | | |
Line 494: |
Line 505: |
| description : decomposes biallelic block substitutions into its constituent SNPs. <br> | | description : decomposes biallelic block substitutions into its constituent SNPs. <br> |
| usage : vt decompose_blocksub [options] <in.vcf> <br> | | usage : vt decompose_blocksub [options] <in.vcf> <br> |
− | options : -a enable aggressive/alignment mode | + | options : -m keep MNVs (multi-nucleotide variants) [false] |
| + | -a enable aggressive/alignment mode [false] |
| + | -d MNVs max distance (when -m option is used) [2] |
| -o output VCF file [-] | | -o output VCF file [-] |
| -I file containing list of intervals [] | | -I file containing list of intervals [] |
| -i intervals [] | | -i intervals [] |
− | -? displays help | + | -? displays help |
| + | |
| </div> | | </div> |
| </div> | | </div> |
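The interaction of -m and -d described in this hunk can be sketched in a few lines of Python. This is only an illustration, not vt's implementation: the function name `decompose_blocksub`, its parameters, and the assumption that "distance" means the gap between consecutive mismatching positions are all hypothetical. The source only states that -m keeps MNVs and that -d sets the maximum MNV distance (default 2).

```python
from typing import List, Tuple

Variant = Tuple[int, str, str]  # (POS, REF, ALT)


def decompose_blocksub(pos: int, ref: str, alt: str,
                       keep_mnvs: bool = False,
                       max_distance: int = 2) -> List[Variant]:
    """Split an equal-length block substitution into SNPs.

    With keep_mnvs=True (hypothetical stand-in for -m), mismatches whose
    positions are at most max_distance apart (stand-in for -d) are kept
    together as one MNV. The distance definition is an assumption.
    """
    if len(ref) != len(alt):
        raise ValueError("this sketch only handles equal-length block substitutions")

    # Offsets at which REF and ALT differ.
    mismatches = [i for i, (r, a) in enumerate(zip(ref, alt)) if r != a]
    if not mismatches:
        return []

    if not keep_mnvs:
        # Default behaviour: one SNP per mismatching base.
        return [(pos + i, ref[i], alt[i]) for i in mismatches]

    # Group consecutive mismatches that lie within max_distance of each other.
    groups = [[mismatches[0]]]
    for i in mismatches[1:]:
        if i - groups[-1][-1] <= max_distance:
            groups[-1].append(i)
        else:
            groups.append([i])

    # Each group becomes one record spanning its first to last mismatch.
    return [(pos + g[0], ref[g[0]:g[-1] + 1], alt[g[0]:g[-1] + 1]) for g in groups]
```

Under these assumptions, `decompose_blocksub(100, "ACG", "TCA")` yields two SNPs, while the same call with `keep_mnvs=True` keeps a single three-base MNV because the two mismatches are two positions apart.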
Line 719: |
Line 733: |
| | | |
| <div class=" mw-collapsible mw-collapsed"> | | <div class=" mw-collapsible mw-collapsed"> |
− | #converts in.bcf to tab format with selected INFO fields | + | #converts in.bcf to tab format with selected INFO and FILTER fields |
− | vt info2tab in.bcf -v -t EX_RL,FZ_RL,MDUST,LOBSTR,VNTRSEEK,RMSK,EX_REPEAT_TRACT | + | vt info2tab in.bcf -u PASS -t EX_RL,FZ_RL,MDUST,LOBSTR,VNTRSEEK,RMSK,EX_REPEAT_TRACT |
− | | |
| <div style="height:6em; overflow:auto; border: 2px solid #FFF"> | | <div style="height:6em; overflow:auto; border: 2px solid #FFF"> |
| + | INPUT |
| + | ===== |
| 20 17548608 . A AC . PASS CENTERS=vbi;NCENTERS=1;OLD_MULTIALLELIC=20:17548598:GAAAAAAAAAAAAA/GAAAAAAAAAAAA/GAAAAAAAAAAAAAA/GAAAAAAAAAA/GAAAAAAAAAAA/GAAAAAAAAAACAAA;OLD_VARIANT=20:17548598:GAAAAAAAAAAAAAG/GAAAAAAAAAACAAAG;EX_MOTIF=C;EX_MLEN=1;EX_RU=C;EX_BASIS=C;EX_BLEN=1;EX_REPEAT_TRACT=17548608,17548609;EX_COMP=100,0,0,0;EX_ENTROPY=0;EX_ENTROPY2=0;EX_KL_DIVERGENCE=2;EX_KL_DIVERGENCE2=4;EX_REF=2;EX_RL=2;EX_LL=3;EX_RU_COUNTS=0,2;EX_SCORE=0;EX_TRF_SCORE=-14;FZ_MOTIF=A;FZ_MLEN=1;FZ_RU=A;FZ_BASIS=A;FZ_BLEN=1;FZ_REPEAT_TRACT=17548599,17548611;FZ_COMP=100,0,0,0;FZ_ENTROPY=0;FZ_ENTROPY2=0;FZ_KL_DIVERGENCE=2;FZ_KL_DIVERGENCE2=4;FZ_REF=13;FZ_RL=13;FZ_LL=14;FZ_RU_COUNTS=13,13;FZ_SCORE=1;FZ_TRF_SCORE=26;FLANKSEQ=GAAAAAAAAA[A]AAAGAAGGAA;MDUST;LOBSTR | | 20 17548608 . A AC . PASS CENTERS=vbi;NCENTERS=1;OLD_MULTIALLELIC=20:17548598:GAAAAAAAAAAAAA/GAAAAAAAAAAAA/GAAAAAAAAAAAAAA/GAAAAAAAAAA/GAAAAAAAAAAA/GAAAAAAAAAACAAA;OLD_VARIANT=20:17548598:GAAAAAAAAAAAAAG/GAAAAAAAAAACAAAG;EX_MOTIF=C;EX_MLEN=1;EX_RU=C;EX_BASIS=C;EX_BLEN=1;EX_REPEAT_TRACT=17548608,17548609;EX_COMP=100,0,0,0;EX_ENTROPY=0;EX_ENTROPY2=0;EX_KL_DIVERGENCE=2;EX_KL_DIVERGENCE2=4;EX_REF=2;EX_RL=2;EX_LL=3;EX_RU_COUNTS=0,2;EX_SCORE=0;EX_TRF_SCORE=-14;FZ_MOTIF=A;FZ_MLEN=1;FZ_RU=A;FZ_BASIS=A;FZ_BLEN=1;FZ_REPEAT_TRACT=17548599,17548611;FZ_COMP=100,0,0,0;FZ_ENTROPY=0;FZ_ENTROPY2=0;FZ_KL_DIVERGENCE=2;FZ_KL_DIVERGENCE2=4;FZ_REF=13;FZ_RL=13;FZ_LL=14;FZ_RU_COUNTS=13,13;FZ_SCORE=1;FZ_TRF_SCORE=26;FLANKSEQ=GAAAAAAAAA[A]AAAGAAGGAA;MDUST;LOBSTR |
| 20 17548608 . AAAAG A . PASS CENTERS=ox1;NCENTERS=1;EX_MOTIF=AAAG;EX_MLEN=4;EX_RU=AAAG;EX_BASIS=AG;EX_BLEN=2;EX_REPEAT_TRACT=17548609,17548612;EX_COMP=100,0,0,0;EX_ENTROPY=0;EX_ENTROPY2=0;EX_KL_DIVERGENCE=2;EX_KL_DIVERGENCE2=4;EX_REF=0.75;EX_RL=4;EX_LL=4;EX_RU_COUNTS=0,1;EX_SCORE=0.75;EX_TRF_SCORE=-1;FZ_MOTIF=A;FZ_MLEN=1;FZ_RU=A;FZ_BASIS=A;FZ_BLEN=1;FZ_REPEAT_TRACT=17548599,17548611;FZ_COMP=100,0,0,0;FZ_ENTROPY=0;FZ_ENTROPY2=0;FZ_KL_DIVERGENCE=2;FZ_KL_DIVERGENCE2=4;FZ_REF=13;FZ_RL=13;FZ_LL=13;FZ_RU_COUNTS=13,13;FZ_SCORE=1;FZ_TRF_SCORE=26;FLANKSEQ=GAAAAAAAAA[AAAAG]AAGGAACTAC;MDUST;LOBSTR;OLD_VARIANT=20:17548598:GAAAAAAAAAAAAAG/GAAAAAAAAAA | | 20 17548608 . AAAAG A . PASS CENTERS=ox1;NCENTERS=1;EX_MOTIF=AAAG;EX_MLEN=4;EX_RU=AAAG;EX_BASIS=AG;EX_BLEN=2;EX_REPEAT_TRACT=17548609,17548612;EX_COMP=100,0,0,0;EX_ENTROPY=0;EX_ENTROPY2=0;EX_KL_DIVERGENCE=2;EX_KL_DIVERGENCE2=4;EX_REF=0.75;EX_RL=4;EX_LL=4;EX_RU_COUNTS=0,1;EX_SCORE=0.75;EX_TRF_SCORE=-1;FZ_MOTIF=A;FZ_MLEN=1;FZ_RU=A;FZ_BASIS=A;FZ_BLEN=1;FZ_REPEAT_TRACT=17548599,17548611;FZ_COMP=100,0,0,0;FZ_ENTROPY=0;FZ_ENTROPY2=0;FZ_KL_DIVERGENCE=2;FZ_KL_DIVERGENCE2=4;FZ_REF=13;FZ_RL=13;FZ_LL=13;FZ_RU_COUNTS=13,13;FZ_SCORE=1;FZ_TRF_SCORE=26;FLANKSEQ=GAAAAAAAAA[AAAAG]AAGGAACTAC;MDUST;LOBSTR;OLD_VARIANT=20:17548598:GAAAAAAAAAAAAAG/GAAAAAAAAAA |
− |
| |
| </div> | | </div> |
− | | + | OUTPUT |
− | CHROM POS REF ALT N_ALLELE EX_RL FZ_RL MDUST LOBSTR VNTRSEEK RMSK EX_REPEAT_TRACT_1 EX_REPEAT_TRACT_2 | + | ====== |
− | 20 17548608 A AC 2 2 13 1 1 0 0 17548608 17548608 | + | CHROM POS REF ALT N_ALLELE PASS EX_RL FZ_RL MDUST LOBSTR VNTRSEEK RMSK EX_REPEAT_TRACT_1 EX_REPEAT_TRACT_2 |
− | 20 17548608 AAAAG A 2 4 13 1 1 0 0 17548609 17548609 | + | 20 17548608 A AC 2 1 2 13 1 1 0 0 17548608 17548608 |
| + | 20 17548608 AAAAG A 2 1 4 13 1 1 0 0 17548609 17548609 |
| | | |
| <div class="mw-collapsible-content"> | | <div class="mw-collapsible-content"> |
| usage : vt info2tab [options] <in.vcf> | | usage : vt info2tab [options] <in.vcf> |
| | | |
− | options : -v print variant CHROM,POS,REF,ALT,N_ALLELE [false] | + | options : -d debug [false] |
− | -d debug [false]
| |
| -f filter expression [] | | -f filter expression [] |
− | -t list of info tags to be extracted [] | + | -u list of filter tags to be extracted [] |
| + | -t list of info tags to be extracted [] |
| -o output tab delimited file [-] | | -o output tab delimited file [-] |
| -I file containing list of intervals [] | | -I file containing list of intervals [] |
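The column layout shown in the OUTPUT block of this hunk can be mimicked with a short Python sketch. It is not vt's code: the function names `header` and `row`, the `(tag, number)` description of each INFO tag (0 for a flag, n for n values), and the use of "." for missing values are assumptions made for illustration. The sketch copies INFO values verbatim and does not try to reproduce the exact values vt prints.

```python
from typing import List, Sequence, Tuple

InfoTag = Tuple[str, int]  # (tag name, number of values; 0 means flag)


def header(filter_tags: Sequence[str], info_tags: Sequence[InfoTag]) -> List[str]:
    """Column names: variant columns, then FILTER tags, then INFO tags."""
    cols = ["CHROM", "POS", "REF", "ALT", "N_ALLELE"] + list(filter_tags)
    for tag, number in info_tags:
        if number <= 1:
            cols.append(tag)
        else:
            # Multi-valued tags are spread over TAG_1 .. TAG_n.
            cols.extend(f"{tag}_{i}" for i in range(1, number + 1))
    return cols


def row(vcf_line: str, filter_tags: Sequence[str],
        info_tags: Sequence[InfoTag]) -> List[str]:
    """Convert one tab-delimited VCF record into the matching row."""
    chrom, pos, _id, ref, alt, _qual, filt, info = vcf_line.rstrip("\n").split("\t")[:8]
    filters = set(filt.split(";"))

    # Parse INFO into key -> value; flags get an empty value.
    fields = {}
    for entry in info.split(";"):
        key, _, value = entry.partition("=")
        fields[key] = value

    out = [chrom, pos, ref, alt, str(len(alt.split(",")) + 1)]
    out += ["1" if tag in filters else "0" for tag in filter_tags]
    for tag, number in info_tags:
        if number == 0:
            out.append("1" if tag in fields else "0")   # flag present or absent
        elif number == 1:
            out.append(fields.get(tag, "."))            # "." is an assumed placeholder
        else:
            values = fields[tag].split(",") if tag in fields else []
            values += ["."] * (number - len(values))
            out.extend(values[:number])
    return out
```

Calling `header(["PASS"], tags)` with the tags from the example command gives the same column names as the OUTPUT block, with EX_REPEAT_TRACT expanded to EX_REPEAT_TRACT_1 and EX_REPEAT_TRACT_2.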
Line 1,450: |
Line 1,464: |
| </div> | | </div> |
| | | |
− | === Remove overlap === | + | === Filter overlap === |
| | | |
| Filters overlapping variants in a VCF file by tagging them with the FILTER flag overlap. | | Filters overlapping variants in a VCF file by tagging them with the FILTER flag overlap. |
| | | |
− | <div class=" mw-collapsible mw-collapsed"> | + | <div class="mw-collapsible mw-collapsed"> |
| #annotates variants that are overlapping | | #annotates variants that are overlapping |
− | vt remove_overlap in.vcf -r hs37d5.fa -o overlapped.tagged.vcf | + | vt filter_overlap in.vcf -r hs37d5.fa -o overlapped.tagged.vcf |
| | | |
| <div class="mw-collapsible-content"> | | <div class="mw-collapsible-content"> |
− | usage : vt remove_overlap [options] <in.vcf> | + | usage : vt filter_overlap [options] <in.vcf> |
| | | |
| options : -o output VCF file [-] | | options : -o output VCF file [-] |
| + | -w window overlap for variants [0] |
| -I file containing list of intervals [] | | -I file containing list of intervals [] |
| -i intervals [] | | -i intervals [] |
| -? displays help | | -? displays help |
| + | </div> |
| + | </div> |
| + | |
| + | <div class="mw-collapsible mw-collapsed"> |
| + | #use remove_overlap instead for versions older than Jan 12, 2017 |
| + | vt remove_overlap in.vcf -r hs37d5.fa -o overlapped.tagged.vcf |
| + | |
| + | <div class="mw-collapsible-content"> |
| + | usage: vt remove_overlap [options] <in.vcf> |
| + | The old version has the same options except that it lacks the -w option |
| + | The change occurred in the following commit: |
| + | https://github.com/atks/vt/commit/ab5cf7e91b3baa5349f439e6fe92491ae19da1a6 |
| </div> | | </div> |
| </div> | | </div> |
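The tagging behaviour and the new -w window can be illustrated with a small Python sketch. It rests on assumptions the source does not confirm: that a variant's span is its reference allele, POS to POS + len(REF) - 1, and that two variants overlap when those spans, widened by the window, intersect. The function name `tag_overlaps` is hypothetical and this is not vt's implementation.

```python
from typing import List, Sequence, Tuple

Record = Tuple[str, int, str]  # (CHROM, POS, REF)


def tag_overlaps(variants: Sequence[Record], window: int = 0) -> List[bool]:
    """Return one flag per variant: True if it overlaps another variant.

    Overlap is assumed to mean that the reference spans of two variants on
    the same chromosome lie within `window` bases of each other. A simple
    pairwise comparison is used for clarity; it is quadratic in the input size.
    """
    tagged = [False] * len(variants)
    for i, (chrom_i, pos_i, ref_i) in enumerate(variants):
        end_i = pos_i + len(ref_i) - 1
        for j in range(i + 1, len(variants)):
            chrom_j, pos_j, ref_j = variants[j]
            if chrom_i != chrom_j:
                continue
            end_j = pos_j + len(ref_j) - 1
            # Two intervals intersect when each starts before the other ends.
            if pos_i - window <= end_j and pos_j - window <= end_i:
                tagged[i] = tagged[j] = True
    return tagged
```

A tagged variant would receive the FILTER flag overlap rather than being dropped. Two records that share the same start position, such as the pair at 20:17548608 in the info2tab example, would both be tagged under this definition even with a window of 0.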