Power Calculations: Quantitative Traits

From Genome Analysis Wiki

Calculating power for simple study designs is one of the most common tasks faced by a biostatistician. However, for most of them, it barely makes the list of the top 1000 things they might enjoy doing. So, make your neighborhood biostatistician marginally happier by calculating your own power tables before you next meet with them (yes, even if you can calculate power, you will probably still need professional advice on study design!).

A Simple Genetic Association Study

In this example, we will use R to carry out a simple power calculation for a genetic association study. We will assume that you are interested in a quantitative trait and that you have phenotyped and genotyped $N$ randomly sampled individuals. Further, we will assume that you are using significance level $\alpha$ for your analyses (typically, $\alpha = 0.05 / M$, where $M$ is the number of independent markers examined) and that $H^2$ is the proportion of trait variance explained by additive effects at the marker of interest under an additive model.

Thus, we are assuming that you will ultimately analyze your data with a linear model such as $E(Y_i) = \mu + \beta G_i$, with $E(Y_i)$ denoting the expected phenotypic value for each individual, $\mu$ denoting an overall mean, $\beta$ denoting the estimated per genotype effect, and $G_i$ denoting the observed genotype for each individual (coded as 0, 1 or 2 according to the number of copies of the rare allele).
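
To make the model concrete, the sketch below simulates one dataset and fits the regression. It is only an illustration: the allele frequency `p`, the seed, and the choice to scale the trait to unit variance are assumptions that do not come from the text above, and genotypes are drawn assuming Hardy-Weinberg equilibrium.

```r
set.seed(1)                      # arbitrary seed, for reproducibility only
N = 1000                         # number of unrelated individuals
p = 0.3                          # assumed frequency of the rare allele (hypothetical)
H2 = 0.01                        # proportion of variance explained by the marker

# Genotypes coded as 0, 1 or 2 copies of the rare allele (Hardy-Weinberg assumed)
g = rbinom(N, size = 2, prob = p)

# With the trait variance scaled to 1, Var(beta * g) = beta^2 * 2p(1 - p) = H2
beta = sqrt(H2 / (2 * p * (1 - p)))
y = beta * g + rnorm(N, sd = sqrt(1 - H2))

fit = lm(y ~ g)                  # linear model: E(y) = mu + beta * g
beta_hat = coef(fit)[["g"]]      # estimated per genotype effect
p_value = summary(fit)$coefficients["g", "Pr(>|t|)"]
```

With a single simulated dataset, `beta_hat` will scatter around `beta`; repeating the simulation many times and counting how often `p_value` falls below the significance level gives an empirical estimate of power.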

In this setting, a simple power calculation might look like:

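  N = 1000        # number of unrelated individuals
  alpha = 0.05    # significance level
  H2 = 0.01       # proportion of trait variance explained by the marker
  # critical value of the 1 df chi-squared test at level alpha
  threshold = qchisq(alpha, df = 1, lower.tail = FALSE)
  # power: probability that a non-central chi-squared statistic (ncp = N * H2) exceeds it
  power = pchisq(threshold, df = 1, lower.tail = FALSE, ncp = N * H2)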
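
The same calculation can be wrapped in a function and tabulated over several sample sizes and effect sizes. The following is a sketch under assumptions: the function name `power_qt`, the grids of values, and the number of markers `M` are illustrative choices rather than part of the original example.

```r
# Power of a 1 df chi-squared test with non-centrality parameter N * H2
power_qt = function(N, H2, alpha) {
  threshold = qchisq(alpha, df = 1, lower.tail = FALSE)
  pchisq(threshold, df = 1, lower.tail = FALSE, ncp = N * H2)
}

M = 1e6                           # assumed number of independent markers (hypothetical)
alpha = 0.05 / M                  # significance level adjusted for M tests
N_grid = c(1000, 2000, 5000, 10000)
H2_grid = c(0.001, 0.005, 0.01)

# Rows index sample size, columns index variance explained
power_table = outer(N_grid, H2_grid, power_qt, alpha = alpha)
dimnames(power_table) = list(N = N_grid, H2 = H2_grid)
print(round(power_table, 3))

# Smallest N on a coarse grid that reaches 80% power for H2 = 0.01
N_try = seq(100, 20000, by = 100)
N_80 = N_try[which(power_qt(N_try, 0.01, alpha) >= 0.80)[1]]
```

A grid search is used for `N_80` because it is easy to read and check; a root finder such as `uniroot` would give a finer answer if one is needed.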

Considerations for Related Samples

The above calculation assumes that you are studying a sample of unrelated individuals. If your sample includes related individuals, the effective sample size will typically be lower than the number of genotyped individuals and power will typically be lower. The loss in power depends on the heritability of the trait (there will be a greater loss in power for more heritable traits) and on the relatedness of individuals (there will be a greater loss in power for more closely related individuals).
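
As a rough illustration of this point, the sketch below reuses the power formula with a reduced effective sample size. It assumes that the effect of relatedness can be summarized by a single number, `N_eff`, that replaces `N` in the non-centrality parameter; the fractions used here are hypothetical and are not derived from any particular pedigree structure or heritability.

```r
N = 1000                          # number of genotyped individuals
alpha = 0.05                      # significance level
H2 = 0.01                         # proportion of variance explained by the marker
threshold = qchisq(alpha, df = 1, lower.tail = FALSE)

# Hypothetical effective sample sizes, as fractions of N (illustrative only)
fraction = c(1.0, 0.9, 0.75, 0.5)
N_eff = fraction * N

# Same calculation as for unrelated individuals, with N replaced by N_eff
power_related = pchisq(threshold, df = 1, lower.tail = FALSE, ncp = N_eff * H2)
print(data.frame(fraction, N_eff, power = round(power_related, 3)))
```

Working out a realistic `N_eff` for a given family structure and trait heritability is a separate problem and is exactly where professional advice on study design is likely to help.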

On the other hand, when analyzing samples that include related individuals in an association study, it may not be necessary to exhaustively genotype each sample -- resulting in substantial cost savings and increased power on a per genotype basis. For examples of the possibilities, see Chen and Abecasis (2007).