From Genome Analysis Wiki
2,566 bytes added, 22:04, 7 August 2013
Line 119: |
Line 119: |
| | | |
| == Additional Analysis Options == | | == Additional Analysis Options == |
| + | |
| + | === Group Rare Variants from Annotated VCF === |
| + | * If the --groupFile option is '''NOT''' specified, '''rareMETAL''' looks for an annotated VCF file to use as the blueprint for grouping variants. |
| + | * The annotated VCF file should be specified using the --annotatedVcf option. |
| + | * --annotation should be used together with --annotatedVcf when only a specific category of functional variants is to be grouped. For example, to group nonsynonymous and splicing variants, include the following on the command line: |
| + | |
| + | --annotatedVcf your.annotated.vcf --annotation nonsyn/splicing |
| + | Note: this groups variants whose annotation starts with nonsyn or splicing (matching is not case-sensitive). |
| + | |
| + | * The annotated VCF file requires a special format: all annotation information must be encoded in the INFO field of the VCF file, starting with the key "ANNO=". An example annotated VCF file follows: |
| + | |
| + | #CHROM POS ID REF ALT QUAL FILTER INFO |
| + | 1 19208194 . G A 100 PASS |
| + | AC=3;'''ANNO='''nonsynonymous:ALDH4A1:NM_170726:exon8:c.C866T:p.P289L,ALDH4A1:NM_001161504:exon8:c.C686T:p.P229L,ALDH4A1:NM_003748:exon8:c.C866T:p.P289L,; |
| + | '''ANNO='''splicing:ALDH4A1 |
| + | 1 19208293 . G C 100 PASS AC=7;STUDIES=5;MAC=7;MAF=0.001;DESIGN=TBD_ASSAY;DSCORE=1.00; |
| + | '''ANNO='''nonsynonymous:ALDH4A1:NM_170726:exon8:c.C767G:p.P256R,ALDH4A1:NM_001161504:exon8:c.C587G:p.P196R,ALDH4A1:NM_003748:exon8:c.C767G:p.P256R, |
| + | |
| + | * Note that each variant may have more than one annotation, but each annotation must start with a new "ANNO=" key followed by annotation:genename:other transcript information. |
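The ANNO= convention described above can be illustrated with a short Python sketch. This is not '''rareMETAL''''s own code; the function names <code>parse_anno</code> and <code>group_variants</code> are hypothetical, and the sketch assumes that INFO entries are separated by ";" and that matching follows the rule stated above (annotation starts with one of the "/"-separated values, not case-sensitive). The bold markers in the example above are wiki emphasis only and are not part of the VCF file.

```python
from collections import defaultdict


def parse_anno(info):
    """Return (annotation, gene) pairs for every ANNO= entry in an INFO string."""
    pairs = []
    for entry in info.split(";"):
        # Each annotation must begin with its own "ANNO=" key.
        if not entry.startswith("ANNO="):
            continue
        fields = entry[len("ANNO="):].split(":")
        if len(fields) >= 2:
            pairs.append((fields[0], fields[1]))
    return pairs


def group_variants(records, annotation):
    """Group variants by gene.

    records:    iterable of (chrom, pos, info) tuples
    annotation: "/"-separated prefixes, e.g. "nonsyn/splicing"
    """
    prefixes = tuple(p.lower() for p in annotation.split("/") if p)
    groups = defaultdict(list)
    for chrom, pos, info in records:
        for anno, gene in parse_anno(info):
            # Case-insensitive "starts with" match against any prefix.
            if anno.lower().startswith(prefixes):
                variant = f"{chrom}:{pos}"
                if variant not in groups[gene]:
                    groups[gene].append(variant)
    return dict(groups)


# Records shortened from the example VCF above.
records = [
    ("1", 19208194,
     "AC=3;ANNO=nonsynonymous:ALDH4A1:NM_170726:exon8:c.C866T:p.P289L,;"
     "ANNO=splicing:ALDH4A1"),
    ("1", 19208293,
     "AC=7;STUDIES=5;ANNO=nonsynonymous:ALDH4A1:NM_170726:exon8:c.C767G:p.P256R,"),
]
print(group_variants(records, "nonsyn/splicing"))
```

A variant that carries several matching annotations for the same gene is listed only once in that gene's group.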
| + | |
| + | === Generate a VCF File to Annotate Outside of rareMETAL === |
| + | * --writeVCF writes a VCF file containing the pooled single variants from all studies. Users can then annotate this VCF file with their preferred annotation tool and use the annotated file as input to '''rareMETAL''' for further gene-based or region-based meta-analysis. |
| + | * The output VCF file is named yourPrefix.pooled.variants.vcf. An example output VCF file follows: |
| + | #CHROM POS ID REF ALT QUAL FILTER INFO |
| + | 1 115658497 115658497 G A . . ALT_AF=0.380906; |
| + | 2 74688884 74688884 G A . . ALT_AF=8.33611e-05; |
| + | 3 121414217 121414217 C A . . ALT_AF=0.0747833; |
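Before passing the pooled file to an annotation tool, it can be useful to check its contents. The following Python sketch reads lines in the layout shown above and extracts the ALT_AF value from the INFO column. The function name <code>parse_pooled_vcf</code> is hypothetical, and the sketch assumes whitespace-separated columns and at most one ALT_AF entry per line.

```python
def parse_pooled_vcf(lines):
    """Return (chrom, pos, ref, alt, alt_af) tuples from pooled-variant VCF lines."""
    variants = []
    for line in lines:
        line = line.strip()
        # Skip blank lines and header lines such as "#CHROM POS ID ...".
        if not line or line.startswith("#"):
            continue
        chrom, pos, _id, ref, alt, _qual, _filter, info = line.split()[:8]
        alt_af = None
        for entry in info.split(";"):
            if entry.startswith("ALT_AF="):
                alt_af = float(entry[len("ALT_AF="):])
        variants.append((chrom, int(pos), ref, alt, alt_af))
    return variants


# Lines copied from the example output above.
example = [
    "#CHROM POS ID REF ALT QUAL FILTER INFO",
    "1 115658497 115658497 G A . . ALT_AF=0.380906;",
    "2 74688884 74688884 G A . . ALT_AF=8.33611e-05;",
    "3 121414217 121414217 C A . . ALT_AF=0.0747833;",
]
for variant in parse_pooled_vcf(example):
    print(variant)
```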
| + | |
| + | === Report Options === |
| * --tabix allows fast analysis when the number of groups/genes of interest is less than 100. | | * --tabix allows fast analysis when the number of groups/genes of interest is less than 100. |
| * --prefix sets a custom prefix for output files. | | * --prefix sets a custom prefix for output files. |