Changes

From Genome Analysis Wiki
1,703 bytes added, 04:36, 4 May 2021
Line 17: Line 17:  
   #change directory to vt
 
   #change directory to vt
 
   2. cd vt <br>
 
   2. cd vt <br>
 +
  #update submodules
 +
  3. git submodule update --init --recursive <br>
 
   #run make, note that compilers need to support the c++0x standard  
 
   #run make, note that compilers need to support the c++0x standard  
   3. make <br>
+
   4. make <br>
 
   #you can test the build
 
   #you can test the build
   4. make test
+
   5. make test
 
   <div class=" mw-collapsible mw-collapsed">
 
   <div class=" mw-collapsible mw-collapsed">
 
   An expected output when all is well for the tests is shown here. (click expand =>)
 
   An expected output when all is well for the tests is shown here. (click expand =>)
Line 54: Line 56:  
   </div>
 
   </div>
   −
Building has been tested on Linux and Mac systems on gcc 4.8.1 and clang 3.4. <br>
+
=== Mac ===
Some features of C++11 is used, thus there is a need for newer versions of gcc and clang.
+
 
 +
You may install vt via homebrew.
   −
== Mac ==
+
  brew tap brewsci/bio
 +
  brew tap brewsci/science
 +
 
 +
  brew install brewsci/bio/vt
   −
You may also install vt on mac via homebrew.
     −
  brew install homebrew/science/vt
+
Building has been tested on Linux and Mac systems on gcc 4.8.1 and clang 3.4. <br>
 +
Some features of C++11 are used, thus there is a need for newer versions of gcc and clang.
    
= Updating =
 
= Updating =
Line 458: Line 464:  
There is now an additional option -a which decomposes non block substitutions into its constituent SNPs and indels. (kindly added by [[https://github.com/holtgrewe holtgrewe@github]]) <br>
 
There is now an additional option -a which decomposes non block substitutions into its constituent SNPs and indels. (kindly added by [[https://github.com/holtgrewe holtgrewe@github]]) <br>
 
There is no exact solution and this decomposition is based on the best guess outcome using a Needleman-Wunsch algorithm. <br>
 
There is no exact solution and this decomposition is based on the best guess outcome using a Needleman-Wunsch algorithm. <br>
You might also want to check out [https://github.com/vcflib/vcflib#vcfallelicprimitives vcfallelicprimitives].
+
You might also want to check out [https://github.com/vcflib/vcflib#vcfallelicprimitives vcfallelicprimitives]. <br>
 +
<br>
 +
There are now additional options -m and -d, which ensure that some MNVs are not decomposed. (kindly added by [[https://github.com/jaudoux jaudoux@github]]) <br>
 +
The motivation comes from:<br>
 +
*Exome-wide assessment of the functional impact and pathogenicity of multi-nucleotide mutations https://www.biorxiv.org/content/10.1101/258723v2.full<br>
 +
*Landscape of multi-nucleotide variants in 125,748 human exomes and 15,708 genomes https://www.biorxiv.org/content/10.1101/573378v2.full<br>
 
</div>
 
</div>
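To illustrate the decomposition described above, here is a minimal Python sketch of splitting a block substitution into SNPs, with an optional mode that keeps nearby mismatches together as one MNV. This is not vt's implementation. The function name, the tuple layout, and in particular the reading of -d as "maximum distance between consecutive mismatching bases" are assumptions made for illustration only.

```python
def decompose_blocksub(pos, ref, alt, keep_mnv=False, max_dist=2):
    """Split an equal-length REF/ALT substitution into (pos, ref, alt) records.

    keep_mnv and max_dist loosely mirror the -m and -d options; the grouping
    rule used here is an assumption, not vt's documented behaviour.
    """
    if len(ref) != len(alt):
        raise ValueError("block substitutions need REF and ALT of equal length")

    # Offsets at which REF and ALT differ.
    mismatches = [i for i, (r, a) in enumerate(zip(ref, alt)) if r != a]
    if not mismatches:
        return []

    if not keep_mnv:
        # One SNP per mismatching base.
        return [(pos + i, ref[i], alt[i]) for i in mismatches]

    # Group mismatches whose gap to the previous mismatch is <= max_dist.
    groups = [[mismatches[0]]]
    for i in mismatches[1:]:
        if i - groups[-1][-1] <= max_dist:
            groups[-1].append(i)
        else:
            groups.append([i])

    # Each group becomes one record spanning its first to last mismatch.
    return [(pos + g[0], ref[g[0]:g[-1] + 1], alt[g[0]:g[-1] + 1]) for g in groups]
```

With keep_mnv=False every mismatching base becomes its own SNP. With keep_mnv=True, mismatches that lie close together are reported as a single multi-base record.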
   Line 494: Line 505:  
   description : decomposes biallelic block substitutions into its constituent SNPs. <br>
 
   description : decomposes biallelic block substitutions into its constituent SNPs. <br>
 
   usage : vt decompose_blocksub [options] <in.vcf> <br>
 
   usage : vt decompose_blocksub [options] <in.vcf> <br>
   options : -a  enable aggressive/alignment mode
+
   options : -m  keep MNVs (multi-nucleotide variants) [false]
 +
            -a  enable aggressive/alignment mode [false]
 +
            -d  MNVs max distance (when -m option is used) [2]
 
             -o  output VCF file [-]
 
             -o  output VCF file [-]
 
             -I  file containing list of intervals []
 
             -I  file containing list of intervals []
 
             -i  intervals []
 
             -i  intervals []
             -?  displays help
+
             -?  displays help
 +
 
 
</div>
 
</div>
 
</div>
 
</div>
Line 719: Line 733:     
<div class=" mw-collapsible mw-collapsed">
 
<div class=" mw-collapsible mw-collapsed">
   #converts in.bcf to tab format with selected INFO fields
+
   #converts in.bcf to tab format with selected INFO and FILTER fields
   vt info2tab in.bcf -v -t EX_RL,FZ_RL,MDUST,LOBSTR,VNTRSEEK,RMSK,EX_REPEAT_TRACT
+
   vt info2tab in.bcf -u PASS -t EX_RL,FZ_RL,MDUST,LOBSTR,VNTRSEEK,RMSK,EX_REPEAT_TRACT
 
   
   <div style="height:6em; overflow:auto; border: 2px solid #FFF">
 
   <div style="height:6em; overflow:auto; border: 2px solid #FFF">
 +
  INPUT
 +
  =====
 
   20 17548608 . A AC . PASS CENTERS=vbi;NCENTERS=1;OLD_MULTIALLELIC=20:17548598:GAAAAAAAAAAAAA/GAAAAAAAAAAAA/GAAAAAAAAAAAAAA/GAAAAAAAAAA/GAAAAAAAAAAA/GAAAAAAAAAACAAA;OLD_VARIANT=20:17548598:GAAAAAAAAAAAAAG/GAAAAAAAAAACAAAG;EX_MOTIF=C;EX_MLEN=1;EX_RU=C;EX_BASIS=C;EX_BLEN=1;EX_REPEAT_TRACT=17548608,17548609;EX_COMP=100,0,0,0;EX_ENTROPY=0;EX_ENTROPY2=0;EX_KL_DIVERGENCE=2;EX_KL_DIVERGENCE2=4;EX_REF=2;EX_RL=2;EX_LL=3;EX_RU_COUNTS=0,2;EX_SCORE=0;EX_TRF_SCORE=-14;FZ_MOTIF=A;FZ_MLEN=1;FZ_RU=A;FZ_BASIS=A;FZ_BLEN=1;FZ_REPEAT_TRACT=17548599,17548611;FZ_COMP=100,0,0,0;FZ_ENTROPY=0;FZ_ENTROPY2=0;FZ_KL_DIVERGENCE=2;FZ_KL_DIVERGENCE2=4;FZ_REF=13;FZ_RL=13;FZ_LL=14;FZ_RU_COUNTS=13,13;FZ_SCORE=1;FZ_TRF_SCORE=26;FLANKSEQ=GAAAAAAAAA[A]AAAGAAGGAA;MDUST;LOBSTR
 
   20 17548608 . A AC . PASS CENTERS=vbi;NCENTERS=1;OLD_MULTIALLELIC=20:17548598:GAAAAAAAAAAAAA/GAAAAAAAAAAAA/GAAAAAAAAAAAAAA/GAAAAAAAAAA/GAAAAAAAAAAA/GAAAAAAAAAACAAA;OLD_VARIANT=20:17548598:GAAAAAAAAAAAAAG/GAAAAAAAAAACAAAG;EX_MOTIF=C;EX_MLEN=1;EX_RU=C;EX_BASIS=C;EX_BLEN=1;EX_REPEAT_TRACT=17548608,17548609;EX_COMP=100,0,0,0;EX_ENTROPY=0;EX_ENTROPY2=0;EX_KL_DIVERGENCE=2;EX_KL_DIVERGENCE2=4;EX_REF=2;EX_RL=2;EX_LL=3;EX_RU_COUNTS=0,2;EX_SCORE=0;EX_TRF_SCORE=-14;FZ_MOTIF=A;FZ_MLEN=1;FZ_RU=A;FZ_BASIS=A;FZ_BLEN=1;FZ_REPEAT_TRACT=17548599,17548611;FZ_COMP=100,0,0,0;FZ_ENTROPY=0;FZ_ENTROPY2=0;FZ_KL_DIVERGENCE=2;FZ_KL_DIVERGENCE2=4;FZ_REF=13;FZ_RL=13;FZ_LL=14;FZ_RU_COUNTS=13,13;FZ_SCORE=1;FZ_TRF_SCORE=26;FLANKSEQ=GAAAAAAAAA[A]AAAGAAGGAA;MDUST;LOBSTR
 
   20 17548608 . AAAAG A . PASS CENTERS=ox1;NCENTERS=1;EX_MOTIF=AAAG;EX_MLEN=4;EX_RU=AAAG;EX_BASIS=AG;EX_BLEN=2;EX_REPEAT_TRACT=17548609,17548612;EX_COMP=100,0,0,0;EX_ENTROPY=0;EX_ENTROPY2=0;EX_KL_DIVERGENCE=2;EX_KL_DIVERGENCE2=4;EX_REF=0.75;EX_RL=4;EX_LL=4;EX_RU_COUNTS=0,1;EX_SCORE=0.75;EX_TRF_SCORE=-1;FZ_MOTIF=A;FZ_MLEN=1;FZ_RU=A;FZ_BASIS=A;FZ_BLEN=1;FZ_REPEAT_TRACT=17548599,17548611;FZ_COMP=100,0,0,0;FZ_ENTROPY=0;FZ_ENTROPY2=0;FZ_KL_DIVERGENCE=2;FZ_KL_DIVERGENCE2=4;FZ_REF=13;FZ_RL=13;FZ_LL=13;FZ_RU_COUNTS=13,13;FZ_SCORE=1;FZ_TRF_SCORE=26;FLANKSEQ=GAAAAAAAAA[AAAAG]AAGGAACTAC;MDUST;LOBSTR;OLD_VARIANT=20:17548598:GAAAAAAAAAAAAAG/GAAAAAAAAAA
 
   20 17548608 . AAAAG A . PASS CENTERS=ox1;NCENTERS=1;EX_MOTIF=AAAG;EX_MLEN=4;EX_RU=AAAG;EX_BASIS=AG;EX_BLEN=2;EX_REPEAT_TRACT=17548609,17548612;EX_COMP=100,0,0,0;EX_ENTROPY=0;EX_ENTROPY2=0;EX_KL_DIVERGENCE=2;EX_KL_DIVERGENCE2=4;EX_REF=0.75;EX_RL=4;EX_LL=4;EX_RU_COUNTS=0,1;EX_SCORE=0.75;EX_TRF_SCORE=-1;FZ_MOTIF=A;FZ_MLEN=1;FZ_RU=A;FZ_BASIS=A;FZ_BLEN=1;FZ_REPEAT_TRACT=17548599,17548611;FZ_COMP=100,0,0,0;FZ_ENTROPY=0;FZ_ENTROPY2=0;FZ_KL_DIVERGENCE=2;FZ_KL_DIVERGENCE2=4;FZ_REF=13;FZ_RL=13;FZ_LL=13;FZ_RU_COUNTS=13,13;FZ_SCORE=1;FZ_TRF_SCORE=26;FLANKSEQ=GAAAAAAAAA[AAAAG]AAGGAACTAC;MDUST;LOBSTR;OLD_VARIANT=20:17548598:GAAAAAAAAAAAAAG/GAAAAAAAAAA
   
   </div>
 
   </div>
 
+
  OUTPUT
   CHROM POS   REF   ALT N_ALLELE  EX_RL  FZ_RL MDUST LOBSTR VNTRSEEK  RMSK EX_REPEAT_TRACT_1 EX_REPEAT_TRACT_2
+
  ======
   20 17548608  A   AC 2        2 13 1 1 0   0    17548608                17548608
+
   CHROM POS   REF   ALT N_ALLELE PASS EX_RL  FZ_RL MDUST LOBSTR VNTRSEEK  RMSK EX_REPEAT_TRACT_1 EX_REPEAT_TRACT_2
   20 17548608  AAAAG  A 2        4      13     1 1      0        0    17548609                17548609
+
   20 17548608  A   AC 2        1    2     13 1 1 0   0    17548608                17548608
 +
   20 17548608  AAAAG  A 2        1    4      13       1       1      0        0    17548609                17548609
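The conversion shown above can be approximated with a short Python sketch. It is not vt's code. It assumes tab-separated VCF records, reports a flag-style INFO tag as 1 when present and 0 when absent, and reports each requested FILTER tag as 1 or 0. Unlike the vt output above, it keeps multi-valued tags such as EX_REPEAT_TRACT in a single column, because splitting them into numbered columns needs the VCF header. The function name and return shape are illustrative.

```python
def info2tab(vcf_lines, info_tags, filter_tags=()):
    """Return a header row plus one row per VCF record (lists of strings)."""
    header = ["CHROM", "POS", "REF", "ALT", "N_ALLELE", *filter_tags, *info_tags]
    rows = [header]
    for line in vcf_lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and header lines
        fields = line.rstrip("\n").split("\t")
        chrom, pos, ref, alt = fields[0], fields[1], fields[3], fields[4]
        filters = set(fields[6].split(";"))

        # Parse INFO: "KEY=VALUE" keeps the value, a bare "KEY" is a flag.
        info_map = {}
        for entry in fields[7].split(";"):
            key, sep, value = entry.partition("=")
            info_map[key] = value if sep else "1"

        # N_ALLELE counts the reference allele plus every ALT allele.
        row = [chrom, pos, ref, alt, str(1 + len(alt.split(",")))]
        row += ["1" if tag in filters else "0" for tag in filter_tags]
        row += [info_map.get(tag, "0") for tag in info_tags]
        rows.append(row)
    return rows
```

Joining each row with tabs gives a table laid out like the OUTPUT block above, apart from the multi-valued columns.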
    
<div class="mw-collapsible-content">
 
<div class="mw-collapsible-content">
 
   usage : vt info2tab [options] <in.vcf>
 
   usage : vt info2tab [options] <in.vcf>
 
    
 
    
   options : -v  print variant CHROM,POS,REF,ALT,N_ALLELE [false]
+
   options : -d  debug [false]
            -d  debug [false]
   
             -f  filter expression []
 
             -f  filter expression []
             -t  list of info tags to be extracted []
+
             -u  list of filter tags to be extracted []
             -t  list of info tags to be extracted []
 
             -o  output tab delimited file [-]
 
             -o  output tab delimited file [-]
 
             -I  file containing list of intervals []
 
             -I  file containing list of intervals []
Line 1,450: Line 1,464:  
</div>
 
</div>
   −
=== Remove overlap ===
+
=== Filter overlap ===
    
Removes overlapping variants in a VCF file by tagging such variants with the FILTER flag overlap.
 
Removes overlapping variants in a VCF file by tagging such variants with the FILTER flag overlap.
   −
<div class=" mw-collapsible mw-collapsed">
+
<div class="mw-collapsible mw-collapsed">
 
   #annotates variants that are overlapping   
 
   #annotates variants that are overlapping   
   vt remove_overlap in.vcf -r hs37d5.fa -o overlapped.tagged..vcf
+
   vt filter_overlap in.vcf -r hs37d5.fa -o overlapped.tagged..vcf
    
<div class="mw-collapsible-content">
 
<div class="mw-collapsible-content">
   usage : vt remove_overlap [options] <in.vcf>
+
   usage : vt filter_overlap [options] <in.vcf>
    
   options : -o  output VCF file [-]
 
   options : -o  output VCF file [-]
 +
            -w  window overlap for variants [0]
 
             -I  file containing list of intervals []
 
             -I  file containing list of intervals []
 
             -i  intervals []
 
             -i  intervals []
 
             -?  displays help
 
             -?  displays help
 +
</div>
 +
</div>
 +
 +
<div class="mw-collapsible mw-collapsed">
 +
  #use remove_overlap instead for versions older than Jan 12, 2017
 +
  vt remove_overlap in.vcf -r hs37d5.fa -o overlapped.tagged..vcf
 +
 +
<div class="mw-collapsible-content">
 +
    usage: vt remove_overlap [options] <in.vcf>
 +
    The old version has the same options except that it lacks the -w option
 +
    The change occurred in the following commit:
 +
    https://github.com/atks/vt/commit/ab5cf7e91b3baa5349f439e6fe92491ae19da1a6
 
  </div>
 
  </div>
 
</div>
 
</div>
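As a rough illustration of tagging overlapping variants, the following Python sketch sweeps over position-sorted records and adds an overlap FILTER value to any pair whose reference spans come within a window of each other. It is a sketch under stated assumptions, not vt's algorithm: the span of a variant is taken as POS to POS + len(REF) - 1, the window parameter stands in for -w, and the dictionary keys are hypothetical.

```python
def _add_overlap(variant):
    """Add 'overlap' to a variant's FILTER value, replacing '.' or 'PASS'."""
    current = variant["filter"]
    if current in (".", "PASS"):
        variant["filter"] = "overlap"
    elif "overlap" not in current.split(";"):
        variant["filter"] = current + ";overlap"


def filter_overlap(variants, window=0):
    """Tag overlapping variants; input must be sorted by chrom, then pos.

    Each variant is a dict with keys chrom, pos, ref, alt, filter.
    Returns new dicts; the input list is left unchanged.
    """
    tagged = [dict(v) for v in variants]
    cur_chrom, far_end, far_idx = None, None, None
    for idx, v in enumerate(tagged):
        start = v["pos"]
        end = v["pos"] + len(v["ref"]) - 1
        # Overlap if this variant starts within reach of the furthest end so far.
        if v["chrom"] == cur_chrom and start <= far_end + window:
            _add_overlap(v)
            _add_overlap(tagged[far_idx])
        # Track the variant that reaches furthest on the current chromosome.
        if v["chrom"] != cur_chrom or end > far_end:
            cur_chrom, far_end, far_idx = v["chrom"], end, idx
    return tagged
```

Tracking only the furthest-reaching earlier variant is enough for sorted input, because any earlier variant that overlaps the current one was already tagged when it was compared with that furthest-reaching variant.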
Line 1,738: Line 1,765:  
! scope="col"| Description
 
! scope="col"| Description
 
! scope="col"| Valid Values
 
! scope="col"| Valid Values
 +
! scope="col"| Missing Values
 
|-
 
|-
 
|Family ID<br>
 
|Family ID<br>
Line 1,751: Line 1,779:  
Sex of the individual<br>
 
Sex of the individual<br>
 
Phenotype  
 
Phenotype  
|[A-Za-z_]+<br>
+
|[A-Za-z0-9_]+<br>
[A-Za-z_]+(,[A-Za-z_]+)* <br>
+
[A-Za-z0-9_]+(,[A-Za-z0-9_]+)* <br>
[A-Za-z_]+ <br>
+
[A-Za-z0-9_]+ <br>
[A-Za-z_]+<br>
+
[A-Za-z0-9_]+<br>
 
1=male, 2=female, other, male, female<br>
 
1=male, 2=female, other, male, female<br>
[A-Za-z_]+
+
[A-Za-z0-9_]+
 +
|  0 <br>
 +
cannot be missing <br>
 +
0 <br>
 +
0 <br>
 +
other<br>
 +
-9
 
|}
 
|}
    
   Examples:     
 
   Examples:     
   −
     ceu      NA12878    NA12891    NA12892    female
+
     ceu      NA12878    NA12891    NA12892    female   -9
     yri      NA19240    NA19239    NA19238    female
+
     yri      NA19240    NA19239    NA19238    female   -9
   −
     ceu      NA12878    NA12891    NA12892    2
+
     ceu      NA12878    NA12891    NA12892    2     -9
     yri      NA19240    NA19239    NA19238    2
+
     yri      NA19240    NA19239    NA19238    2     -9
 +
 
 +
    #allows tools like profile_mendelian to detect duplicates and check for concordance
 +
    ceu      NA12878,NA12878A    NA12891    NA12892    female  case
 +
    yri      NA19240            NA19239    NA19238    female  control
    
     #allows tools like profile_mendelian to detect duplicates and check for concordance
 
     #allows tools like profile_mendelian to detect duplicates and check for concordance
     ceu      NA12878,NA12878A   NA12891    NA12892     female
+
     ceu      NA12412   0  0     female case
     yri      NA19240            NA19239    NA19238     female
+
     yri      NA19650    0  0     female control
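The table and examples above can be turned into a small validating parser. The Python sketch below is illustrative only. It assumes six whitespace-separated columns, and the names used for the middle columns (individual, father, mother) are assumptions, since only the Family ID, Sex and Phenotype rows of the table are visible here. The patterns and missing-value codes are taken from the Valid Values and Missing Values columns.

```python
import re

ID = re.compile(r"[A-Za-z0-9_]+")
ID_LIST = re.compile(r"[A-Za-z0-9_]+(,[A-Za-z0-9_]+)*")
SEX = {"1": "male", "2": "female", "male": "male", "female": "female", "other": "other"}


def parse_pedigree_line(line):
    """Parse one pedigree row into a dict; missing values become None."""
    fields = line.split()
    if len(fields) != 6:
        raise ValueError("expected 6 columns, got %d" % len(fields))
    family, individual, father, mother, sex, phenotype = fields

    for name, value in (("family", family), ("father", father), ("mother", mother)):
        if not ID.fullmatch(value):
            raise ValueError("invalid %s ID: %r" % (name, value))
    if not ID_LIST.fullmatch(individual):
        raise ValueError("invalid individual ID list: %r" % individual)
    if sex not in SEX:
        raise ValueError("invalid sex: %r" % sex)
    if phenotype != "-9" and not ID.fullmatch(phenotype):
        raise ValueError("invalid phenotype: %r" % phenotype)

    return {
        "family": None if family == "0" else family,
        # Several comma-separated IDs denote duplicates of one individual.
        "individual": individual.split(","),
        "father": None if father == "0" else father,
        "mother": None if mother == "0" else mother,
        "sex": SEX[sex],  # "other" doubles as the missing value for sex
        "phenotype": None if phenotype == "-9" else phenotype,
    }
```

For example, the row `ceu NA12878,NA12878A NA12891 NA12892 female case` yields two individual IDs for the same sample, which is what lets a tool check duplicates for concordance.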
    
= Resource Bundle =
 
= Resource Bundle =