Changes

From Genome Analysis Wiki
4,794 bytes removed ,  22:34, 7 November 2017
Line 1,293: Line 1,293:  
   no. of trios    : 2
 
   no. of trios    : 2
 
   no. of variants  : 25346
 
   no. of variants  : 25346
  −
= Pedigree File =
  −
  −
  vt understands an augmented version of the PED format described by [http://zzz.bwh.harvard.edu/plink/data.shtml#ped plink]; the augmentation was introduced by [hmkang@umich.edu Hyun].
  −
 
  −
  The pedigree file has the following mandatory fields:
  −
       
  −
{| class="wikitable"
  −
|-
  −
! scope="col"| Field
  −
! scope="col"| Description
  −
! scope="col"| Valid Values
  −
|-
  −
|Family ID<br>
  −
Individual ID<br>
  −
Paternal ID<br>
  −
Maternal ID<br>
  −
Sex<br>
  −
Phenotype
  −
|ID of this family <br>
  −
ID of this individual <br>
  −
ID of the father <br>
  −
ID of the mother <br>
  −
Sex of the individual.  <br>
  −
Phenotype.
  −
|[A-Za-z_]+<br>
  −
[A-Za-z_]+ <br>
  −
[A-Za-z_]+ <br>
  −
[A-Za-z_]+<br>
  −
1 = male, 2 = female, other = unknown<br>
  −
[A-Za-z_]+
  −
|}
  −
  −
    Family ID
  −
    Individual ID
  −
    Paternal ID
  −
    Maternal ID
  −
    Sex (1=male; 2=female; other=unknown)
  −
    Phenotype
  −
   
  −
 
  −
  −
    ceu NA12878   NA12891 NA12892     female
  −
    yri      NA19240    NA19239    NA19238    female
  −
  −
  −
    ceu NA12878,NA12878A   NA12891 NA12892     female
  −
    yri      NA19240                            NA19239    NA19238    female
  −
  −
    ceu NA12878,NA12878A   NA12891 NA12892     0
  −
    yri      NA19240                            NA19239    NA19238    0
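
The field layout above can be read with a few lines of Python. This is a minimal sketch and not vt's own parser. It assumes that fields are separated by whitespace, that a comma in the Individual ID column lists several IDs for the same individual (as in the NA12878,NA12878A example), that sex may be written as a word as well as a number, and that the Phenotype column may be absent, as it is in the example lines. The names <code>PedRecord</code>, <code>decode_sex</code> and <code>parse_ped_line</code> are hypothetical.

```python
from typing import List, NamedTuple, Optional


class PedRecord(NamedTuple):
    family_id: str
    individual_ids: List[str]  # assumption: comma-separated IDs name one individual
    paternal_id: str
    maternal_id: str
    sex: str                   # "male", "female" or "unknown"
    phenotype: Optional[str]   # None when the column is absent


def decode_sex(token: str) -> str:
    """Map 1/male and 2/female to a label; anything else is unknown."""
    token = token.lower()
    if token in ("1", "male"):
        return "male"
    if token in ("2", "female"):
        return "female"
    return "unknown"


def parse_ped_line(line: str) -> PedRecord:
    """Parse one whitespace-delimited pedigree line (hypothetical helper)."""
    fields = line.split()
    if len(fields) < 5:
        raise ValueError("expected at least 5 fields, got %d" % len(fields))
    family, individual, father, mother, sex = fields[:5]
    phenotype = fields[5] if len(fields) > 5 else None
    return PedRecord(family, individual.split(","), father, mother,
                     decode_sex(sex), phenotype)


# Example lines taken from this section.
for text in ("ceu NA12878   NA12891 NA12892     female",
             "ceu NA12878,NA12878A   NA12891 NA12892     0"):
    print(parse_ped_line(text))
```

Splitting on arbitrary whitespace keeps the sketch tolerant of the uneven column spacing seen in the examples.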
  −
  −
     
  −
 
  −
<div class="mw-collapsible-content">
  −
profile_mendelian v0.5
  −
  −
  usage : vt profile_mendelian [options] <in.vcf>
  −
  −
  options : -q  minimum genotype quality
  −
            -d  minimum depth
  −
            -r  reference sequence fasta file []
  −
            -x  output latex directory []
  −
            -p  pedigree file
  −
            -I  file containing list of intervals []
  −
            -i  intervals
  −
            -?  displays help
  −
</div>
  −
</div>
  −
  −
=== Profile NA12878 ===
  −
  −
Profile NA12878 calls against reference data sets
  −
  −
<div class=" mw-collapsible mw-collapsed">
  −
  #profile the overlap of the NA12878 calls in vt.genotypes.bcf with the broad knowledgebase and illumina platinum genomes data sets on chromosome 20.
  −
  vt profile_na12878  vt.genotypes.bcf -g na12878.reference.txt -r hs37d5.fa -i 20
  −
  −
  #this is a sample output for NA12878 profiling.
  −
  #R and A stand for reference and alternate allele respectively.
  −
  #Error% - mendelian error (confounded with de novo mutation)
  −
  #HomHet - Homozygous-Heterozygous genotype ratios
  −
  #Het% - proportion of hets
  −
    data set
  −
    No Indels :      27770 [0.94]
  −
      FS/NFS :      0.26 (8/23) <br>
  −
  broad.kb
  −
    A-B      13071 [1.19]
  −
    A&B      14699 [0.76]
  −
    B-A      21546 [0.62]
  −
    Precision    52.9%
  −
    Sensitivity  40.6% <br>
  −
  illumina.platinum
  −
    A-B      17952 [0.88]
  −
    A&B      9818 [1.07]
  −
    B-A      2418 [0.88]
  −
    Precision    35.4%
  −
    Sensitivity  80.2% <br>
  −
  broad.kb
  −
                R/R      R/A      A/A      ./.
  −
    R/R        346      145        3      5473
  −
    R/A          3      4133        9      758
  −
    A/A          2      136      2186      956
  −
    ./.          2      139        86      322 <br>
  −
    Total genotype pairs :      6963
  −
    Concordance          :  95.72% (6665)
  −
    Discordance          :  4.28% (298) <br>
  −
  illumina.platinum
  −
                R/R      R/A      A/A      ./.
  −
    R/R        1768        85        2        0
  −
    R/A          10      4479        14        0
  −
    A/A          13      180      3028        0
  −
    ./.          71        98        70        0<br>
  −
    Total genotype pairs :      9579
  −
    Concordance          :  96.83% (9275)
  −
    Discordance          :  3.17% (304)
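
The summary figures in the sample output can be re-derived from the counts it prints. The sketch below is not taken from vt's source code. It assumes that A is the input call set and B is the reference data set, that precision is A&B / (A-B + A&B), that sensitivity is A&B / (A&B + B-A), and that concordance is computed over genotype pairs where neither side is missing (./.). These formulas are inferred from the numbers shown, and they do reproduce them. The function names are hypothetical.

```python
def overlap_stats(a_only, both, b_only):
    """Return (precision, sensitivity) as fractions, under the assumed formulas."""
    precision = both / (a_only + both)
    sensitivity = both / (both + b_only)
    return precision, sensitivity


def concordance(matrix):
    """Return (total pairs, concordant pairs, concordance fraction).

    matrix holds only the called genotypes R/R, R/A and A/A,
    i.e. the ./. row and column are left out.
    """
    total = sum(sum(row) for row in matrix)
    agree = sum(matrix[i][i] for i in range(len(matrix)))
    return total, agree, agree / total


# Counts copied from the broad.kb part of the sample output.
precision, sensitivity = overlap_stats(a_only=13071, both=14699, b_only=21546)
print("Precision   %.1f%%" % (100 * precision))
print("Sensitivity %.1f%%" % (100 * sensitivity))

broad_kb = [
    [346, 145, 3],
    [3, 4133, 9],
    [2, 136, 2186],
]
total, agree, fraction = concordance(broad_kb)
print("Total genotype pairs : %d" % total)
print("Concordance          : %.2f%% (%d)" % (100 * fraction, agree))
print("Discordance          : %.2f%% (%d)" % (100 * (1 - fraction), total - agree))
```

The same two functions applied to the illumina.platinum counts give the second set of figures.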
  −
  −
  # This file contains information on how to process reference data sets.
  −
  #
  −
  # dataset - name of data set, this label will be printed.
  −
  # type    - True Positives (TP) and False Positives (FP)
  −
  #          overlap percentages labeled as (Precision, Sensitivity) and (False Discovery Rate, Type I Error) respectively
  −
  #         - annotation
  −
  #          file is used for GENCODE annotation of frame shift and non frame shift Indels
  −
  # filter  - filter applied to variants for this particular data set
  −
  # path    - path of indexed BCF file
  −
  #dataset              type        filter    path
  −
  broad.kb              TP          PASS      /net/fantasia/home/atks/dev/vt/bundle/public/grch37/broad.kb.241365variants.genotypes.bcf
  −
  illumina.platinum    TP          PASS      /net/fantasia/home/atks/dev/vt/bundle/public/grch37/NA12878.illumina.platinum.5284448variants.genotypes.bcf
  −
  #gencode.v19          annotation  .        /net/fantasia/home/atks/dev/vt/bundle/public/grch37/gencode.v19.annotation.gtf.gz
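
The reference data set list above is a plain four-column text file, so it is easy to inspect before passing it to <code>-g</code>. The sketch below is an illustration, not vt's own reader. It assumes that columns are separated by whitespace, that lines starting with # are comments (which is why the gencode.v19 entry is skipped), and that the path is the last column. The function name <code>parse_reference_list</code> is hypothetical.

```python
def parse_reference_list(text):
    """Return one dict per active (uncommented) data set line."""
    datasets = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        # Split into at most 4 fields: dataset, type, filter, path.
        name, kind, variant_filter, path = line.split(None, 3)
        datasets.append({
            "dataset": name,
            "type": kind,
            "filter": variant_filter,
            "path": path,
        })
    return datasets


# Entries copied from the listing in this section.
sample = """\
#dataset              type        filter    path
broad.kb              TP          PASS      /net/fantasia/home/atks/dev/vt/bundle/public/grch37/broad.kb.241365variants.genotypes.bcf
illumina.platinum    TP          PASS      /net/fantasia/home/atks/dev/vt/bundle/public/grch37/NA12878.illumina.platinum.5284448variants.genotypes.bcf
#gencode.v19          annotation  .        /net/fantasia/home/atks/dev/vt/bundle/public/grch37/gencode.v19.annotation.gtf.gz
"""

for entry in parse_reference_list(sample):
    print(entry["dataset"], entry["type"], entry["filter"], entry["path"])
```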
  −
<div class="mw-collapsible-content">
  −
profile_na12878 v0.5
  −
  −
  usage : vt profile_na12878 [options] <in.vcf>
  −
  −
  options : -g  file containing list of reference datasets []
  −
            -I  file containing list of intervals []
  −
            -i  intervals []
  −
            -r  reference sequence fasta file []
  −
            -?  displays help
  −
</div>
  −
</div>
      
= Variant Calling =
 
= Variant Calling =