Genome Analysis Wiki β

Changes

VcfCodingSnps

280 bytes added, 02:24, 13 December 2010
Go to http://genome.ucsc.edu/ ►► Click "table" ►► Specify the required fields (clade: mammal, genome: human, etc.) ►► In the "track" field, select "UCSC gene" ►► Get the output gene file
1. The gene file should be in [http://genome.ucsc.edu/FAQ/FAQformat#format9 GenePred table format]. The following 11 tab-delimited fields are required and must appear in the order shown below:
string name; "Name of gene"
string chrom; "Chromosome name"
string symbol; "Standard gene symbol"
Note: the 11th field is mandatory for running vcfCodingSnps. In the gene lists provided with the package, this field gives the standard gene symbol, such as "APOE" or "LDL-R". If a gene list that you downloaded yourself does not contain such a field, you can simply set the 11th field equal to the first field (the gene name in that specific track), for example:  awk -F'\t' '{ print $0 "\t" $1 }' yourGenelist > yourNewGenelist
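The same step can also be sketched in Python, with a field-count check added so that malformed lines are reported instead of silently passed through. This is only an illustrative sketch, not part of vcfCodingSnps: the function name `append_gene_symbol` and the `expected_fields` parameter are hypothetical, and it assumes the downloaded gene list has exactly 10 tab-delimited fields per line before the symbol is appended.

```python
def append_gene_symbol(lines, expected_fields=10):
    """Return the input lines with the first field repeated as a final column.

    Assumption: every non-empty line has `expected_fields` tab-delimited
    fields (10 for a gene list without a symbol column).
    """
    output = []
    for line_no, line in enumerate(lines, start=1):
        record = line.rstrip("\r\n")
        if not record:
            continue  # skip blank lines
        fields = record.split("\t")
        if len(fields) != expected_fields:
            raise ValueError(
                f"line {line_no}: expected {expected_fields} fields, "
                f"found {len(fields)}"
            )
        # Use the gene name (field 1) as the gene symbol (field 11).
        output.append("\t".join(fields + [fields[0]]))
    return output
```

To convert a file, read it with `open("yourGenelist")`, pass the file object to the function, and write the returned lines to `yourNewGenelist`, one per line.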
2. If the gene file uses the [http://genome.ucsc.edu/FAQ/FAQformat#format9 extended GenePred format], there will be an extra "exonframe" field. Please refer to [https://lists.soe.ucsc.edu/pipermail/genome/2006-November/012218.html here] for the definition of "exonframe". For some genes, due to translational frame shifts or other