Line 1,293: |
Line 1,293: |
| no. of trios : 2 | | no. of trios : 2 |
| no. of variants : 25346 | | no. of variants : 25346 |
− |
| |
− | = Pedigree File =
| |
− |
| |
− | vt understands an augmented version of the PED format described by [http://zzz.bwh.harvard.edu/plink/data.shtml#ped plink]; the augmentation was introduced by [mailto:hmkang@umich.edu Hyun].
| |
− |
| |
− | The pedigree file has the following mandatory fields:
| |
− |
| |
− | {| class="wikitable"
| |
− | |-
| |
− | ! scope="col"| Field
| |
− | ! scope="col"| Description
| |
− | ! scope="col"| Valid Values
| |
− | |-
| |
− | |Family ID<br>
| |
− | Individual ID<br>
| |
− | Paternal ID<br>
| |
− | Maternal ID<br>
| |
− | Sex<br>
| |
− | Phenotype
| |
− | |ID of this family <br>
| |
− | ID of this individual <br>
| |
− | ID of the father <br>
| |
− | ID of the mother <br>
| |
− | Sex of the individual. <br>
| |
− | Phenotype.
| |
− | |[A-Za-z_]+<br>
| |
− | [A-Za-z_]+ <br>
| |
− | [A-Za-z_]+ <br>
| |
− | [A-Za-z_]+<br>
| |
− | 1 = male, 2 = female, other = unknown<br>
| |
− | [A-Za-z_]+
| |
− | |}
| |
− |
| |
− | Family ID
| |
− | Individual ID
| |
− | Paternal ID
| |
− | Maternal ID
| |
− | Sex (1=male; 2=female; other=unknown)
| |
− | Phenotype
| |
− |
| |
− |
| |
− |
| |
− | ceu NA12878 NA12891 NA12892 female
| |
− | yri NA19240 NA19239 NA19238 female
| |
− |
| |
− |
| |
− | ceu NA12878,NA12878A NA12891 NA12892 female
| |
− | yri NA19240 NA19239 NA19238 female
| |
− |
| |
− | ceu NA12878,NA12878A NA12891 NA12892 0
| |
− | yri NA19240 NA19239 NA19238 0
| |
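The field layout above can be read with a few lines of Python. This is a minimal sketch, not vt's own parser, and it rests on assumptions drawn only from the example lines: fields are whitespace-separated, a comma in the Individual ID column lists several sample names for one individual, the fifth column is the sex (either the numeric code or the words `male`/`female`), and the phenotype column may be absent. The names `PedigreeRecord` and `parse_pedigree_line` are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional

# Assumption: both the numeric codes and the spelled-out words are accepted,
# because the example lines use "female" while the field list uses 1/2.
SEX_CODES = {"1": "male", "2": "female", "male": "male", "female": "female"}


@dataclass
class PedigreeRecord:
    family_id: str
    individual_ids: List[str]  # more than one entry when the column has commas
    paternal_id: str
    maternal_id: str
    sex: str                   # "male", "female" or "unknown"
    phenotype: Optional[str]   # None when the column is missing


def parse_pedigree_line(line: str) -> PedigreeRecord:
    """Parse one whitespace-separated pedigree line (hypothetical helper)."""
    fields = line.split()
    if len(fields) < 5:
        raise ValueError(f"expected at least 5 fields, got {len(fields)}: {line!r}")
    family, individual, father, mother, sex = fields[:5]
    phenotype = fields[5] if len(fields) > 5 else None
    return PedigreeRecord(
        family_id=family,
        individual_ids=individual.split(","),
        paternal_id=father,
        maternal_id=mother,
        sex=SEX_CODES.get(sex.lower(), "unknown"),  # any other value = unknown
        phenotype=phenotype,
    )


if __name__ == "__main__":
    for text in (
        "ceu NA12878,NA12878A NA12891 NA12892 female",
        "yri NA19240 NA19239 NA19238 0",
    ):
        print(parse_pedigree_line(text))
```

Keeping the individual IDs as a list means a trio lookup can match any of the listed sample names against the VCF header without special-casing the comma form.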
− |
| |
− |
| |
− |
| |
− | <div class="mw-collapsible-content">
| |
− | profile_mendelian v0.5
| |
− |
| |
− | usage : vt profile_mendelian [options] <in.vcf>
| |
− |
| |
− | options : -q minimum genotype quality
| |
− | -d minimum depth
| |
− | -r reference sequence fasta file []
| |
− | -x output latex directory []
| |
− | -p pedigree file
| |
− | -I file containing list of intervals []
| |
− | -i intervals
| |
− | -? displays help
| |
− | </div>
| |
− | </div>
| |
− |
| |
− | === Profile NA12878 ===
| |
− |
| |
− | Profile NA12878 calls against reference data sets
| |
− |
| |
− | <div class=" mw-collapsible mw-collapsed">
| |
− | #profile NA12878 overlap with broad knowledgebase and illumina platinum genomes for the file vt.genotypes.bcf for chromosome 20.
| |
− | vt profile_na12878 vt.genotypes.bcf -g na12878.reference.txt -r hs37d5.fa -i 20
| |
− |
| |
− | #this is a sample output for NA12878 profiling.
| |
− | #R and A stand for reference and alternate allele respectively.
| |
− | #Error% - mendelian error (confounded with de novo mutation)
| |
− | #HomHet - Homozygous-Heterozygous genotype ratios
| |
− | #Het% - proportion of hets
| |
− | data set
| |
− | No Indels : 27770 [0.94]
| |
− | FS/NFS : 0.26 (8/23) <br>
| |
− | broad.kb
| |
− | A-B 13071 [1.19]
| |
− | A&B 14699 [0.76]
| |
− | B-A 21546 [0.62]
| |
− | Precision 52.9%
| |
− | Sensitivity 40.6% <br>
| |
− | illumina.platinum
| |
− | A-B 17952 [0.88]
| |
− | A&B 9818 [1.07]
| |
− | B-A 2418 [0.88]
| |
− | Precision 35.4%
| |
− | Sensitivity 80.2% <br>
| |
− | broad.kb
| |
− | R/R R/A A/A ./.
| |
− | R/R 346 145 3 5473
| |
− | R/A 3 4133 9 758
| |
− | A/A 2 136 2186 956
| |
− | ./. 2 139 86 322 <br>
| |
− | Total genotype pairs : 6963
| |
− | Concordance : 95.72% (6665)
| |
− | Discordance : 4.28% (298) <br>
| |
− | illumina.platinum
| |
− | R/R R/A A/A ./.
| |
− | R/R 1768 85 2 0
| |
− | R/A 10 4479 14 0
| |
− | A/A 13 180 3028 0
| |
− | ./. 71 98 70 0<br>
| |
− | Total genotype pairs : 9579
| |
− | Concordance : 96.83% (9275)
| |
− | Discordance : 3.17% (304)
| |
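The summary figures in the sample output can be reproduced from the counts it prints. The sketch below is not vt code. It assumes A is the profiled call set and B the reference data set, which is consistent with A-B plus A&B equalling the 27770 indels in the data set, and it assumes that genotype pairs with a missing call (`./.`) on either side are left out of the concordance totals, which matches the printed totals. The function names are hypothetical.

```python
def overlap_stats(a_only, both, b_only):
    """Precision and sensitivity from the A-B, A&B and B-A counts."""
    precision = both / (a_only + both)    # share of A also found in B
    sensitivity = both / (both + b_only)  # share of B recovered by A
    return precision, sensitivity


def concordance(matrix):
    """Total, concordant and discordant pairs from a 4x4 genotype matrix.

    Rows and columns are ordered R/R, R/A, A/A, ./. as in the output;
    the ./. row and column are excluded (assumption, see lead-in).
    """
    called = [row[:3] for row in matrix[:3]]
    total = sum(sum(row) for row in called)
    agree = sum(called[i][i] for i in range(3))
    return total, agree, total - agree


# Counts copied from the broad.kb part of the sample output.
broad_kb = [
    [346, 145, 3, 5473],
    [3, 4133, 9, 758],
    [2, 136, 2186, 956],
    [2, 139, 86, 322],
]

if __name__ == "__main__":
    precision, sensitivity = overlap_stats(13071, 14699, 21546)
    print(f"Precision   {100 * precision:.1f}%")
    print(f"Sensitivity {100 * sensitivity:.1f}%")
    total, agree, disagree = concordance(broad_kb)
    print(f"Total genotype pairs : {total}")
    print(f"Concordance : {100 * agree / total:.2f}% ({agree})")
    print(f"Discordance : {100 * disagree / total:.2f}% ({disagree})")
```

The same two functions applied to the illumina.platinum counts give the second set of figures in the output.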
− |
| |
− | # This file contains information on how to process reference data sets.
| |
− | #
| |
− | # dataset - name of data set, this label will be printed.
| |
− | # type - True Positives (TP) or False Positives (FP):
| |
− | # overlap percentages are labeled (Precision, Sensitivity) for TP and (False Discovery Rate, Type I Error) for FP.
| |
− | # - annotation:
| |
− | # file is used for GENCODE annotation of frameshift and non-frameshift Indels.
| |
− | # filter - filter applied to variants for this particular data set
| |
− | # path - path of indexed BCF file
| |
− | #dataset type filter path
| |
− | broad.kb TP PASS /net/fantasia/home/atks/dev/vt/bundle/public/grch37/broad.kb.241365variants.genotypes.bcf
| |
− | illumina.platinum TP PASS /net/fantasia/home/atks/dev/vt/bundle/public/grch37/NA12878.illumina.platinum.5284448variants.genotypes.bcf
| |
− | #gencode.v19 annotation . /net/fantasia/home/atks/dev/vt/bundle/public/grch37/gencode.v19.annotation.gtf.gz
| |
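For scripting around the reference list, a file in this layout can be read as follows. This is a sketch only, assuming the four columns are whitespace-separated and that any line starting with `#` is a comment (so the commented-out `gencode.v19` entry is skipped); `parse_reference_list` is a hypothetical helper and not part of vt.

```python
def parse_reference_list(text):
    """Return one dict per non-comment line: dataset, type, filter, path."""
    datasets = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        # Split into at most four columns; the path is the last one.
        name, kind, variant_filter, path = line.split(None, 3)
        datasets.append(
            {"dataset": name, "type": kind, "filter": variant_filter, "path": path}
        )
    return datasets


# The entries shown in the listing above.
example = """\
#dataset type filter path
broad.kb TP PASS /net/fantasia/home/atks/dev/vt/bundle/public/grch37/broad.kb.241365variants.genotypes.bcf
illumina.platinum TP PASS /net/fantasia/home/atks/dev/vt/bundle/public/grch37/NA12878.illumina.platinum.5284448variants.genotypes.bcf
#gencode.v19 annotation . /net/fantasia/home/atks/dev/vt/bundle/public/grch37/gencode.v19.annotation.gtf.gz
"""

if __name__ == "__main__":
    for entry in parse_reference_list(example):
        print(entry["dataset"], entry["type"], entry["filter"], entry["path"])
```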
− | <div class="mw-collapsible-content">
| |
− | profile_na12878 v0.5
| |
− |
| |
− | usage : vt profile_na12878 [options] <in.vcf>
| |
− |
| |
− | options : -g file containing list of reference datasets []
| |
− | -I file containing list of intervals []
| |
− | -i intervals []
| |
− | -r reference sequence fasta file []
| |
− | -? displays help
| |
− | </div>
| |
− | </div>
| |
| | | |
| = Variant Calling = | | = Variant Calling = |