Changes

From Genome Analysis Wiki
4,788 bytes added ,  22:37, 7 November 2017
Line 1,293: Line 1,293:  
   no. of trios    : 2
 
   no. of trios    : 2
 
   no. of variants  : 25346
 
   no. of variants  : 25346
 +
 +
 +
 
 +
<div class="mw-collapsible-content">
 +
profile_mendelian v0.5
 +
 +
  usage : vt profile_mendelian [options] <in.vcf>
 +
 +
  options : -q  minimum genotype quality
 +
            -d  minimum depth
 +
            -r  reference sequence fasta file []
 +
            -x  output latex directory []
 +
            -p  pedigree file
 +
            -I  file containing list of intervals []
 +
            -i  intervals
 +
            -?  displays help
 +
</div>
 +
</div>
 +
 +
=== Profile NA12878 ===
 +
 +
Profile NA12878 calls against reference data sets
 +
 +
<div class=" mw-collapsible mw-collapsed">
 +
  #profile the overlap of the NA12878 calls in vt.genotypes.bcf with the broad knowledgebase and illumina platinum genomes data sets on chromosome 20.
 +
  vt profile_na12878  vt.genotypes.bcf -g na12878.reference.txt -r hs37d5.fa -i 20
 +
 +
  #this is a sample output for NA12878 profiling.
 +
  #R and A stand for reference and alternate allele respectively.
 +
  #A-B and B-A - variants found only in the input file (A) or only in the reference data set (B)
 +
  #A&B - variants found in both the input file and the reference data set
 +
  #Precision = A&B / (A-B + A&B), Sensitivity = A&B / (A&B + B-A)
 +
    data set
 +
    No Indels :      27770 [0.94]
 +
      FS/NFS :      0.26 (8/23) <br>
 +
  broad.kb
 +
    A-B      13071 [1.19]
 +
    A&B      14699 [0.76]
 +
    B-A      21546 [0.62]
 +
    Precision    52.9%
 +
    Sensitivity  40.6% <br>
 +
  illumina.platinum
 +
    A-B      17952 [0.88]
 +
    A&B      9818 [1.07]
 +
    B-A      2418 [0.88]
 +
    Precision    35.4%
 +
    Sensitivity  80.2% <br>
 +
  broad.kb
 +
                R/R      R/A      A/A      ./.
 +
    R/R        346      145        3      5473
 +
    R/A          3      4133        9      758
 +
    A/A          2      136      2186      956
 +
    ./.          2      139        86      322 <br>
 +
    Total genotype pairs :      6963
 +
    Concordance          :  95.72% (6665)
 +
    Discordance          :  4.28% (298) <br>
 +
  illumina.platinum
 +
                R/R      R/A      A/A      ./.
 +
    R/R        1768        85        2        0
 +
    R/A          10      4479        14        0
 +
    A/A          13      180      3028        0
 +
    ./.          71        98        70        0<br>
 +
    Total genotype pairs :      9579
 +
    Concordance          :  96.83% (9275)
 +
    Discordance          :  3.17% (304)
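
The summary figures in the sample output can be recomputed from the raw counts. The following is a minimal Python sketch, not part of vt: the function names `overlap_stats` and `concordance` are hypothetical, and it assumes that concordance is taken over genotype pairs in which neither call is missing (./.), which is consistent with the totals shown above.

```python
def overlap_stats(a_only, both, b_only):
    """Precision and sensitivity of input set A against reference set B.

    a_only = A-B, both = A&B, b_only = B-A (variant counts).
    """
    precision = both / (a_only + both)
    sensitivity = both / (both + b_only)
    return precision, sensitivity


def concordance(matrix):
    """Concordance over genotype pairs where neither genotype is missing.

    matrix rows and columns are ordered R/R, R/A, A/A, ./. as in the output;
    the ./. row and column are ignored (an assumption, see the lead-in).
    """
    called = [row[:3] for row in matrix[:3]]
    total = sum(sum(row) for row in called)
    agree = sum(called[i][i] for i in range(3))
    return total, agree, agree / total


# Counts copied from the broad.kb sections of the sample output.
broad_kb = [
    [346, 145, 3, 5473],
    [3, 4133, 9, 758],
    [2, 136, 2186, 956],
    [2, 139, 86, 322],
]

precision, sensitivity = overlap_stats(13071, 14699, 21546)
print(f"Precision {precision:.1%}  Sensitivity {sensitivity:.1%}")

total, agree, rate = concordance(broad_kb)
print(f"Total genotype pairs : {total}")
print(f"Concordance          : {rate:.2%} ({agree})")
print(f"Discordance          : {1 - rate:.2%} ({total - agree})")
```

Running the same functions on the illumina.platinum counts should give the corresponding figures for that data set.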
 +
 +
  # This file contains information on how to process reference data sets.
 +
  #
 +
  # dataset - name of data set, this label will be printed.
 +
  # type    - True Positives (TP) and False Positives (FP)
 +
  #          overlap percentages labeled as (Precision, Sensitivity) and (False Discovery Rate, Type I Error) respectively
 +
  #        - annotation
 +
  #          file is used for GENCODE annotation of frame shift and non frame shift Indels
 +
  # filter  - filter applied to variants for this particular data set
 +
  # path    - path of indexed BCF file
 +
  #dataset              type        filter    path
 +
  broad.kb              TP          PASS      /net/fantasia/home/atks/dev/vt/bundle/public/grch37/broad.kb.241365variants.genotypes.bcf
 +
  illumina.platinum    TP          PASS      /net/fantasia/home/atks/dev/vt/bundle/public/grch37/NA12878.illumina.platinum.5284448variants.genotypes.bcf
 +
  #gencode.v19          annotation  .        /net/fantasia/home/atks/dev/vt/bundle/public/grch37/gencode.v19.annotation.gtf.gz
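
For scripts that need to inspect such a reference list, a small reader can be sketched in Python. This is an illustration only and not how vt itself reads the file: the names `ReferenceDataset` and `parse_reference_list` are hypothetical, the paths in the example are placeholders, and the sketch assumes four whitespace-separated columns with `#` starting a comment line.

```python
from typing import List, NamedTuple


class ReferenceDataset(NamedTuple):
    name: str      # label printed in the profile output
    kind: str      # TP, FP or annotation
    filter: str    # filter applied to variants of this data set
    path: str      # path of the indexed BCF (or annotation) file


def parse_reference_list(text: str) -> List[ReferenceDataset]:
    """Parse a reference data set list, skipping blank and comment lines."""
    datasets = []
    for number, line in enumerate(text.splitlines(), start=1):
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        fields = line.split()
        if len(fields) != 4:
            raise ValueError(f"line {number}: expected 4 columns, got {len(fields)}")
        datasets.append(ReferenceDataset(*fields))
    return datasets


# Placeholder paths, for illustration only.
example = """\
#dataset              type        filter    path
broad.kb              TP          PASS      /path/to/broad.kb.genotypes.bcf
illumina.platinum     TP          PASS      /path/to/illumina.platinum.genotypes.bcf
#gencode.v19          annotation  .         /path/to/gencode.v19.annotation.gtf.gz
"""

datasets = parse_reference_list(example)
for d in datasets:
    print(d.name, d.kind, d.filter, d.path)
```

The commented-out gencode.v19 row is skipped, so only the two TP data sets are returned.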
 +
<div class="mw-collapsible-content">
 +
profile_na12878 v0.5
 +
 +
  usage : vt profile_na12878 [options] <in.vcf>
 +
 +
  options : -g  file containing list of reference datasets []
 +
            -I  file containing list of intervals []
 +
            -i  intervals []
 +
            -r  reference sequence fasta file []
 +
            -?  displays help
 +
</div>
 +
</div>
 +
 +
= Pedigree File =
 +
 +
  vt understands an augmented version of the PED format described by [http://zzz.bwh.harvard.edu/plink/data.shtml#ped plink]; the augmentation was introduced by [mailto:hmkang@umich.edu Hyun].
 +
 
 +
  The pedigree file has the following mandatory fields:
 +
       
 +
{| class="wikitable"
 +
|-
 +
! scope="col"| Field
 +
! scope="col"| Description
 +
! scope="col"| Valid Values
 +
|-
 +
|Family ID<br>
 +
Individual ID<br>
 +
Paternal ID<br>
 +
Maternal ID<br>
 +
Sex<br>
 +
Phenotype
 +
|ID of this family <br>
 +
ID of this individual <br>
 +
ID of the father <br>
 +
ID of the mother <br>
 +
Sex of the individual.  <br>
 +
Phenotype.
 +
|[A-Za-z0-9_]+<br>
 +
[A-Za-z0-9_]+ <br>
 +
[A-Za-z0-9_]+ <br>
 +
[A-Za-z0-9_]+<br>
 +
1 = male, 2 = female, other = unknown<br>
 +
[A-Za-z_]+
 +
|}
 +
 +
    Family ID
 +
    Individual ID
 +
    Paternal ID
 +
    Maternal ID
 +
    Sex (1=male; 2=female; other=unknown)
 +
    Phenotype
 +
   
 +
 
 +
 +
    ceu NA12878   NA12891 NA12892     female
 +
    yri      NA19240    NA19239    NA19238    female
 +
 +
 +
    ceu NA12878,NA12878A   NA12891 NA12892     female
 +
    yri      NA19240                            NA19239    NA19238    female
 +
 +
    ceu NA12878,NA12878A   NA12891 NA12892     0
 +
    yri      NA19240                            NA19239    NA19238    0
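
The example lines above can be read with a short Python sketch. It is not vt's own parser and rests on several assumptions: fields are separated by whitespace, a comma-separated Individual ID such as NA12878,NA12878A lists several sample names for the same individual, the Phenotype column may be absent (as in the examples), and sex is accepted either as a code (1, 2) or as a word (male, female), with anything else treated as unknown. The names `PedigreeRecord` and `parse_pedigree` are hypothetical.

```python
from typing import List, NamedTuple, Optional


class PedigreeRecord(NamedTuple):
    family: str
    individuals: List[str]   # one or more sample names for the same person
    father: str
    mother: str
    sex: str                 # "male", "female" or "unknown"
    phenotype: Optional[str]


# Assumed mapping; any other value is reported as "unknown".
SEX_CODES = {"1": "male", "male": "male", "2": "female", "female": "female"}


def parse_pedigree(text: str) -> List[PedigreeRecord]:
    """Parse whitespace-separated pedigree lines into records."""
    records = []
    for number, line in enumerate(text.splitlines(), start=1):
        fields = line.split()
        if not fields or fields[0].startswith("#"):
            continue
        if len(fields) < 5:
            raise ValueError(f"line {number}: expected at least 5 fields")
        family, individual, father, mother, sex = fields[:5]
        phenotype = fields[5] if len(fields) > 5 else None
        records.append(
            PedigreeRecord(
                family,
                individual.split(","),
                father,
                mother,
                SEX_CODES.get(sex.lower(), "unknown"),
                phenotype,
            )
        )
    return records


example = """\
ceu NA12878,NA12878A   NA12891 NA12892     female
yri      NA19240                            NA19239    NA19238    0
"""

records = parse_pedigree(example)
for r in records:
    print(r.family, r.individuals, r.father, r.mother, r.sex)
```

Keeping the individual names as a list makes it straightforward to match either sample name against the samples present in a VCF file.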
    
= Variant Calling =
 
= Variant Calling =