Brief Introduction
RAREMETALWORKER generates single-variant association test statistics for a single study prior to meta-analysis. This page briefly describes the statistics that RAREMETALWORKER calculates, together with the key formulae.
Key Statistics for Analysis of Single Study
We use the following notation to describe our methods:
$\mathbf{y}$ is the observed phenotype vector
$\mathbf{X}$ is the design matrix
$\boldsymbol{\beta_c}$ is the vector of covariate effects
$\beta_i$ is the scalar fixed genetic effect of the $i^{th}$ variant
$\mathbf{g}$ is the vector of random genetic effects
$\boldsymbol{\varepsilon}$ is the vector of non-shared environmental effects
Single Variant Score Tests
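We use the following model for the trait:
$\mathbf{y} = \mathbf{X}\boldsymbol{\beta_c} + \beta_i(\mathbf{G_i} - \bar{\mathbf{G_i}}) + \mathbf{g} + \boldsymbol{\varepsilon}$.
Here, $\mathbf{G_i}$ is the genotype vector of the $i^{th}$ variant and $\bar{\mathbf{G_i}}$ is the vector whose elements all equal the sample mean of $\mathbf{G_i}$, so the genotype term is mean-centered.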
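In this model, $\beta_i$ measures the additive genetic effect of the $i^{th}$ variant. As usual, the score statistic for testing $H_0: \beta_i = 0$ is:
$U_i = (\mathbf{G_i} - \bar{\mathbf{G_i}})^T \hat{\Omega}^{-1} (\mathbf{y} - \mathbf{X}\hat{\boldsymbol{\beta_c}})$
We further derive the variance-covariance matrix of these statistics as
$\mathbf{V} = (\mathbf{G} - \bar{\mathbf{G}})^T (\hat{\Omega}^{-1} - \hat{\Omega}^{-1}\mathbf{X}(\mathbf{X}^T\hat{\Omega}^{-1}\mathbf{X})^{-1}\mathbf{X}^T\hat{\Omega}^{-1}) (\mathbf{G} - \bar{\mathbf{G}})$,
where the columns of $\mathbf{G}$ are the genotype vectors $\mathbf{G_i}$.
Under the null, the test statistic $T_i = U_i^2 / V_{ii}$ is asymptotically distributed as chi-squared with one degree of freedom.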
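The score test above can be sketched in a few lines of NumPy. This is an illustrative sketch, not RAREMETALWORKER's implementation: the function name `score_statistics` and its arguments are hypothetical, the covariance matrix `omega` is assumed to have been estimated already, and the covariate effects are assumed to be estimated by generalized least squares given `omega`.

```python
import numpy as np


def score_statistics(y, X, G, omega):
    """Return (U, V, T) for single-variant score tests (hypothetical helper).

    y     : (n,) phenotype vector
    X     : (n, p) design matrix of covariates (include an intercept column)
    G     : (n, m) genotype matrix, one column per variant
    omega : (n, n) estimated covariance matrix of y under the null
    """
    omega_inv = np.linalg.inv(omega)
    oi_x = omega_inv @ X                      # Omega^{-1} X
    xt_oi_x = X.T @ oi_x                      # X^T Omega^{-1} X

    # Assumption: covariate effects are estimated by GLS given omega.
    beta_c = np.linalg.solve(xt_oi_x, oi_x.T @ y)
    residual = y - X @ beta_c

    # Center each genotype column at its sample mean.
    Gc = G - G.mean(axis=0)

    # U_i = (G_i - mean)^T Omega^{-1} (y - X beta_c)
    U = Gc.T @ omega_inv @ residual

    # V = Gc^T [Omega^{-1} - Omega^{-1} X (X^T Omega^{-1} X)^{-1} X^T Omega^{-1}] Gc
    P = omega_inv - oi_x @ np.linalg.solve(xt_oi_x, oi_x.T)
    V = Gc.T @ P @ Gc

    # T_i = U_i^2 / V_ii, compared against chi-squared with 1 degree of freedom.
    T = U**2 / np.diag(V)
    return U, V, T


if __name__ == "__main__":
    # Toy data: four unrelated individuals, intercept-only design, one variant.
    y = np.array([1.0, 2.0, 3.0, 4.0])
    X = np.ones((4, 1))
    G = np.array([[0.0], [1.0], [1.0], [2.0]])
    U, V, T = score_statistics(y, X, G, np.eye(4))
    print(U, V, T)
```

Explicitly inverting `omega` keeps the sketch close to the formulae; for large samples a factorization-based solve would normally be preferred.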
Summary Statistics and Covariance Matrices
RAREMETALWORKER automatically stores the score statistic for each marker ($U_i$) together with quality information for that marker, including HWE p-value, call rate, and allele counts.
RAREMETALWORKER also stores the covariance matrices ($\mathbf{V}$) of the score statistics of markers within a window.
Modeling Relatedness
We use a variance component model to handle familial relationships. In a sample of $n$ individuals, we model the observed phenotype vector ($\mathbf{y}$) as a sum of covariate effects (specified by a design matrix $\mathbf{X}$ and a vector of covariate effects $\boldsymbol{\beta_c}$), additive genetic effects (modeled in vector $\mathbf{g}$) and non-shared environmental effects (modeled in vector $\boldsymbol{\varepsilon}$). Thus the null model is:
$\mathbf{y} = \mathbf{X}\boldsymbol{\beta_c} + \mathbf{g} + \boldsymbol{\varepsilon}$
We assume that genetic effects are normally distributed, with mean $\mathbf{0}$ and covariance $2\sigma_g^2\mathbf{K}$, where the matrix $\mathbf{K}$ summarizes kinship coefficients between sampled individuals and $\sigma_g^2$ is a positive scalar describing the genetic contribution to the overall variance. We assume that non-shared environmental effects are normally distributed with mean $\mathbf{0}$ and covariance $\sigma_e^2\mathbf{I}$, where $\mathbf{I}$ is the identity matrix.
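To estimate $\mathbf{K}$, we either use known pedigree structure to define $\mathbf{K}$ or else use the empirical estimator
$\hat{\mathbf{K}} = \frac{1}{l}\sum_{i=1}^{l}\frac{(\mathbf{G_i} - 2f_i\mathbf{1})(\mathbf{G_i} - 2f_i\mathbf{1})^T}{4f_i(1-f_i)}$,
where $l$ is the count of variants, and $\mathbf{G_i}$ and $f_i$ are the genotype vector and estimated allele frequency for the $i^{th}$ variant, respectively. Each element in $\mathbf{G_i}$ encodes the minor allele count for one individual. Model parameters $\hat{\boldsymbol{\beta_c}}$, $\hat{\sigma}_g^2$ and $\hat{\sigma}_e^2$ are estimated using maximum likelihood and the efficient algorithm described in Lippert et al. For convenience, let the estimated covariance matrix of $\mathbf{y}$ be
$\hat{\Omega} = 2\hat{\sigma}_g^2\hat{\mathbf{K}} + \hat{\sigma}_e^2\mathbf{I}$.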
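As an illustration of the empirical kinship estimator, the sketch below assumes it takes the standardized form $\frac{1}{l}\sum_i (\mathbf{G_i} - 2f_i\mathbf{1})(\mathbf{G_i} - 2f_i\mathbf{1})^T / (4f_i(1-f_i))$. The function name `empirical_kinship` is hypothetical, and estimating each allele frequency as half the sample mean genotype is an assumption made for this example.

```python
import numpy as np


def empirical_kinship(G):
    """Empirical kinship matrix from minor allele counts (hypothetical helper).

    G : (n, l) array, one row per individual and one column per variant,
        each entry the minor allele count (0, 1 or 2).
    """
    G = np.asarray(G, dtype=float)
    n_variants = G.shape[1]

    # Assumption: allele frequency is estimated as half the mean genotype.
    f = G.mean(axis=0) / 2.0
    if np.any((f <= 0.0) | (f >= 1.0)):
        raise ValueError("monomorphic variants must be removed first")

    # Standardize each column: (G_i - 2 f_i) / sqrt(4 f_i (1 - f_i)).
    Z = (G - 2.0 * f) / np.sqrt(4.0 * f * (1.0 - f))

    # Averaging the outer products over variants equals Z Z^T / l.
    return Z @ Z.T / n_variants


if __name__ == "__main__":
    genotypes = np.array([[0, 2], [1, 1], [2, 0], [1, 1]])
    print(empirical_kinship(genotypes))
```

Writing the sum of outer products as a single matrix product avoids an explicit loop over variants.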
Chromosome X
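To analyze markers on chromosome X, we fit an extra variance component, $\sigma_{gX}^2$, to model the variance explained by chromosome X. A kinship matrix for chromosome X, $\mathbf{K_X}$, can be estimated either from a pedigree or from genotypes of markers on chromosome X. Then the estimated covariance matrix can be written as
$\hat{\Omega} = 2\hat{\sigma}_g^2\hat{\mathbf{K}} + 2\hat{\sigma}_{gX}^2\hat{\mathbf{K}}_X + \hat{\sigma}_e^2\mathbf{I}$.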
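A minimal sketch of assembling the phenotype covariance matrix from its variance components follows. It assumes the covariance has the form $2\sigma_g^2\mathbf{K} + 2\sigma_{gX}^2\mathbf{K_X} + \sigma_e^2\mathbf{I}$, with the chromosome X term optional. The function name `phenotype_covariance` and its parameters are hypothetical, and the variance components are taken as already estimated.

```python
import numpy as np


def phenotype_covariance(K, sigma_g2, sigma_e2, K_x=None, sigma_gx2=0.0):
    """Assemble the covariance matrix of y (hypothetical helper).

    K         : (n, n) autosomal kinship matrix
    sigma_g2  : autosomal genetic variance component
    sigma_e2  : non-shared environmental variance component
    K_x       : optional (n, n) chromosome X kinship matrix
    sigma_gx2 : chromosome X variance component, used only if K_x is given
    """
    K = np.asarray(K, dtype=float)
    n = K.shape[0]

    # Autosomal genetic term plus the environmental term.
    omega = 2.0 * sigma_g2 * K + sigma_e2 * np.eye(n)

    # Extra variance component for chromosome X markers.
    if K_x is not None:
        omega = omega + 2.0 * sigma_gx2 * np.asarray(K_x, dtype=float)
    return omega


if __name__ == "__main__":
    K = 0.5 * np.eye(2)  # two unrelated, non-inbred individuals
    print(phenotype_covariance(K, sigma_g2=1.0, sigma_e2=2.0))
    print(phenotype_covariance(K, 1.0, 2.0, K_x=K, sigma_gx2=0.5))
```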