From Genome Analysis Wiki
19 bytes added, 11:58, 2 February 2017
Line 1: |
Line 1: |
− | '''vcfCodingSnps'''[http://www.sph.umich.edu/csg/liyanmin/vcfCodingSnps/index.shtml] is a SNP annotation tool that annotates coding variants in a [[VCF]] format input file. It takes a VCF as input and generates an annotated VCF file as output. The tool is currently under development by Yanming Li, a doctoral student at the University of Michigan Center for Statistical Genetics. For any issues with the program, please contact [mailto:liyanmin@umich.edu Yanming]. A detailed tutorial and download page can be found at [http://www.sph.umich.edu/csg/liyanmin/vcfCodingSnps/index.shtml] | + | '''vcfCodingSnps'''[http://csg.sph.umich.edu//liyanmin/vcfCodingSnps/index.shtml] is a SNP annotation tool that annotates coding variants in a [[VCF]] format input file. It takes a VCF as input and generates an annotated VCF file as output. The tool is currently under development by Yanming Li, a doctoral student at the University of Michigan Center for Statistical Genetics. For any issues with the program, please contact [mailto:liyanmin@umich.edu Yanming]. A detailed tutorial and download page can be found at [http://csg.sph.umich.edu//liyanmin/vcfCodingSnps/index.shtml] |
| | | |
| == Basic Usage Example == | | == Basic Usage Example == |
Line 62: |
Line 62: |
| uint[exonCount] exonEnds; "Exon end positions" | | uint[exonCount] exonEnds; "Exon end positions" |
| string symbol; "Standard gene symbol" | | string symbol; "Standard gene symbol" |
− | Note: the 11th field is mandatory for running vcfCodingSnps. In the genelists provided with the package, this field gives standard gene symbols such as "APOE", "LDL-R", etc. If a genelist you downloaded yourself does not contain this field, you can simply set the 11th field equal to the first field, which is the gene name in a specific track, with a command like
| + | |
− | awk 'BEGIN{FS="\t"} {print $0"\t"$1}' yourGenelist > yourNewGenelist
| + | Note: the 11th field is mandatory for running vcfCodingSnps. In the genelists provided with the package, this field gives standard gene symbols such as "APOE", "LDL-R", etc. |
| + | If a genelist you downloaded yourself does not contain this field, you can simply set the 11th field equal to the first field, which is the gene name in a specific track, with a command like |
| + | |
| + | awk 'BEGIN{FS="\t"} {print $0"\t"$1}' yourGenelist > yourNewGenelist |
| + | |
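The one-liner above appends the first field to every row, so it would add a duplicate column to rows that already carry a symbol. A slightly more defensive variant can be sketched as follows, assuming a tab-separated genelist in which rows without a symbol have exactly 10 fields; the function name `add_symbol_field` is hypothetical and not part of the vcfCodingSnps package.

```shell
# Hypothetical helper (not shipped with vcfCodingSnps).
# Assumption: the genelist is tab-separated and rows lacking a gene symbol
# have exactly 10 fields. Such rows get field 1 appended as field 11;
# all other rows are printed unchanged.
add_symbol_field() {
    awk 'BEGIN { FS = "\t"; OFS = "\t" }
         NF == 10 { print $0, $1; next }   # no symbol column: copy field 1
         { print }                         # other rows: leave as they are
    ' "$1"
}
```

It would be called in the same way as the original command, for example `add_symbol_field yourGenelist > yourNewGenelist`.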
| 2. If the gene file uses an [http://genome.ucsc.edu/FAQ/FAQformat#format9 extended GenePred format], there will be an extra "exonframe" field. Please refer to [https://lists.soe.ucsc.edu/pipermail/genome/2006-November/012218.html here] for the definition of "exonframe". For some genes, due to translational frame shifts or other | | 2. If the gene file uses an [http://genome.ucsc.edu/FAQ/FAQformat#format9 extended GenePred format], there will be an extra "exonframe" field. Please refer to [https://lists.soe.ucsc.edu/pipermail/genome/2006-November/012218.html here] for the definition of "exonframe". For some genes, due to translational frame shifts or other |
| reasons, the exonframe might not match what one would compute using mod 3 in counting codons. In such cases, the program will report a warning message that "number of base pairs between code start and code end is | | reasons, the exonframe might not match what one would compute using mod 3 in counting codons. In such cases, the program will report a warning message that "number of base pairs between code start and code end is |