Changes

From Genome Analysis Wiki
no edit summary
Line 86:
; Estimate Local Ancestry Using GWAS Data
 
 
: For studies that include admixed samples, we should estimate local ancestry using GWAS data. If GWAS are not available, it is strongly recommended that these data should be generated. In principle, local ancestry estimates can be generated even before exome sequencing is complete.
 
 +
 +
; Estimate Global Ancestry Covariates Using PCA or MDS Analysis
 +
 +
= Initial Association Analyses =
 +
 +
We anticipate that, at least early on, the initial association analysis of whole genome datasets in the context of complex trait association studies will focus on identifying and resolving quality control issues that might result in unexpected artifacts.
 +
 +
== Initial Single SNP Tests ==
 +
 +
In principle, these tests only have power for common variants. In practice, particularly when permutation based methods are used to assess significance, there should be little loss in power by testing all variants using single SNP tests. These tests should include:
 +
 +
; Logistic Regression Based Tests for Discrete Traits
 +
: Discrete outcomes should be evaluated using logistic regression. It is important to include appropriate covariates. For most traits, these might include age and sex and, potentially, principal components of ancestry.
 +
 +
; Linear Regression Based Tests for Quantitative Traits
 +
: For quantitative traits that are not strongly selected, linear regression based tests can also be used. Again, it is important to include an appropriate set of covariates. For many quantitative traits, it is advisable to normalize trait values to minimize the impact of outliers on association results.
 +
 +
; Using genotypes as outcomes for Selected Quantitative Traits
 +
: For both discrete and quantitative traits, analysis can be repeated using genotypes (scored as 0, 1 and 2) as outcomes and phenotypes as predictors.
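As a minimal sketch of two of the ideas above (normalizing a quantitative trait, then using the genotype as the outcome), the following assumes a rank-based inverse normal transform as the normalization and plain least squares without covariates. Both choices, the function names and the data are illustrative assumptions, not a prescribed method; a real analysis would also include the covariates discussed above.

```python
from statistics import NormalDist, mean

def inverse_normal_transform(values):
    """Rank-based inverse normal transform; tied values share their average rank."""
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    ranks = [0.0] * n
    pos = 0
    while pos < n:
        end = pos
        # Extend the run while the next sorted value ties with the current one.
        while end + 1 < n and values[order[end + 1]] == values[order[pos]]:
            end += 1
        avg_rank = (pos + end) / 2 + 1  # 1-based average rank of the tied run
        for k in range(pos, end + 1):
            ranks[order[k]] = avg_rank
        pos = end + 1
    std_normal = NormalDist()
    # (rank - 0.5) / n keeps every quantile strictly inside (0, 1).
    return [std_normal.inv_cdf((r - 0.5) / n) for r in ranks]

def regress_genotype_on_trait(genotypes, trait):
    """Least squares slope and intercept for genotype (0, 1, 2) ~ trait."""
    g_bar, t_bar = mean(genotypes), mean(trait)
    sxx = sum((t - t_bar) ** 2 for t in trait)
    sxy = sum((t - t_bar) * (g - g_bar) for t, g in zip(trait, genotypes))
    slope = sxy / sxx
    return slope, g_bar - slope * t_bar

# Made-up data: 19.7 is an outlier that the transform pulls back in.
trait = [4.1, 5.0, 5.2, 4.8, 19.7, 5.1]
genotypes = [0, 1, 1, 0, 2, 1]
normalized = inverse_normal_transform(trait)
slope, intercept = regress_genotype_on_trait(genotypes, normalized)
```

After the transform the outlier still ranks highest, but it sits at an ordinary normal quantile instead of dominating the fit.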
 +
 +
== Q-Q Plots ==
 +
 +
After carrying out initial single SNP tests, generate Q-Q plots for each analysis. Verify that the Q-Q plots are reasonable and that the genomic control value is close to 1.0. If not, refine sample and variant filters as needed.
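The check described above can be sketched as follows. This assumes the genomic control value is computed in the usual way, as the median observed 1-df chi-square statistic divided by its expected median under the null, and that p-values are two-sided and lie in (0, 1]. The function names are illustrative.

```python
import math
from statistics import NormalDist, median

def genomic_control_lambda(p_values):
    """Median observed 1-df chi-square statistic over its expected null median."""
    std_normal = NormalDist()
    # A two-sided p-value p corresponds to chi-square = z**2 with z = Phi^-1(p / 2).
    chi_squares = [std_normal.inv_cdf(p / 2) ** 2 for p in p_values]
    expected_median = std_normal.inv_cdf(0.75) ** 2  # roughly 0.455
    return median(chi_squares) / expected_median

def qq_points(p_values):
    """(expected, observed) pairs of -log10 p-values, ready to plot."""
    n = len(p_values)
    observed = sorted(-math.log10(p) for p in p_values)
    expected = sorted(-math.log10((i - 0.5) / n) for i in range(1, n + 1))
    return list(zip(expected, observed))

# Perfectly uniform p-values stand in for a well-behaved analysis.
uniform = [(i - 0.5) / 999 for i in range(1, 1000)]
lambda_null = genomic_control_lambda(uniform)
# Squaring the p-values mimics inflated test statistics.
lambda_inflated = genomic_control_lambda([p ** 2 for p in uniform])
```

Points from `qq_points` that hug the diagonal, together with a value near 1.0, suggest the filters are adequate. Systematic departure along the whole curve points to a quality control or stratification problem.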
 +
 +
== Burden Tests ==
 +
 +
The same analyses that were run for single variants should be carried out for groups of rare variants. In principle, one could simply use the presence of a rare variant (or a particular class of rare variant, such as a non-synonymous variant or a newly discovered variant) as a predictor and repeat the logistic regression, linear regression or genotype regression described above. For an initial pass, I think the precise form of this analysis is not critical, because the next step is to...
 +
 +
== More Q-Q Plots ==
 +
 +
After carrying out initial burden tests, generate Q-Q plots for each analysis. Verify that the Q-Q plots are reasonable and that the genomic control value is close to 1.0. If not, refine sample and variant filters as needed.
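One simple form of the burden predictor behind these tests is a per-sample indicator for carrying any rare variant in a group. The sketch below assumes genotypes coded 0, 1 and 2 with `None` for missing calls, and an allele frequency cutoff of 0.01. The cutoff, the function name and the data are illustrative assumptions.

```python
def carrier_burden(genotypes_by_variant, allele_freqs, max_freq=0.01):
    """Return 1 for each sample carrying at least one rare allele in the group, else 0."""
    rare = [v for v, freq in allele_freqs.items() if freq <= max_freq]
    n_samples = len(next(iter(genotypes_by_variant.values())))
    burden = [0] * n_samples
    for variant in rare:
        for i, genotype in enumerate(genotypes_by_variant[variant]):
            # Missing calls (None) are skipped rather than counted as carriers.
            if genotype is not None and genotype > 0:
                burden[i] = 1
    return burden

# Made-up genotypes for four samples at three variants in one group.
genotypes = {
    "var_a": [0, 1, 0, 0],
    "var_b": [0, None, 0, 2],
    "var_common": [1, 1, 2, 0],
}
freqs = {"var_a": 0.005, "var_b": 0.002, "var_common": 0.30}
burden = carrier_burden(genotypes, freqs)  # [0, 1, 0, 1]
```

The resulting 0/1 vector can stand in for the single-variant genotype in any of the regressions described earlier.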
 +
 +
== Visualize Results ==
 +
 +
A number of displays will likely be useful. Probably these should include:
 +
 +
* Manhattan Plots
 +
* [[LocusZoom]] Plots
 +
* Q-Q Plots
    
= Think You Are Done? =
 
   −
'''No way!!!''' You still need a plan to call and evaluate short insertions and deletions as well as larger structural variants.
+
'''No way!!!'''  
   −
== Initial Association Analyses ==
+
== Indels and Structural Variants ==
   −
We anticipate that, at least early on, the initial association analysis of whole genome datasets in the context of complex trait association studies will focus on identifying and resolving quality control issues that might result in unexpected artifacts.
+
You still need a plan to call and evaluate short insertions and deletions as well as larger structural variants.
 +
 
 +
== Pathway Based Analyses ==
 +
 
 +
Carry out analyses that include groups of genes with similar biological function (for example, according to [[Gene Ontology]] or [[Kyoto Encyclopedia of Genes and Genomes]] annotations).
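A minimal sketch of the grouping step, assuming gene annotations are already available as a mapping from gene to annotation terms, and that each gene has a per-sample 0/1 indicator for carrying a rare variant. The gene names, term labels and function names are hypothetical placeholders, not real annotations.

```python
from collections import defaultdict

def genes_by_pathway(gene_annotations):
    """Invert a gene -> annotation terms mapping into term -> set of genes."""
    pathways = defaultdict(set)
    for gene, terms in gene_annotations.items():
        for term in terms:
            pathways[term].add(gene)
    return dict(pathways)

def pathway_carrier_status(genes, carriers_by_gene):
    """Mark a sample 1 if it is a carrier in any listed gene that has data, else 0."""
    columns = [carriers_by_gene[g] for g in sorted(genes) if g in carriers_by_gene]
    if not columns:
        return []
    return [int(any(sample)) for sample in zip(*columns)]

# Hypothetical annotations and per-gene carrier indicators for three samples.
annotations = {
    "GENE_A": ["term_1"],
    "GENE_B": ["term_1", "term_2"],
    "GENE_C": ["term_2"],
}
carriers = {"GENE_A": [1, 0, 0], "GENE_B": [0, 0, 0], "GENE_C": [0, 0, 1]}
pathways = genes_by_pathway(annotations)
status = {term: pathway_carrier_status(genes, carriers)
          for term, genes in pathways.items()}
```

Each pathway-level indicator can then be tested in the same way as a gene-level indicator.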
