Summary Statistics Files Specification for RAREMETAL and rvtests

From Genome Analysis Wiki
Revision as of 01:24, 24 November 2013 by Zhanxw (talk | contribs)

Summary Files Specification for RAREMETAL

RAREMETAL uses summary statistics files to perform meta-analysis. These include (1) a score statistics file and (2) a covariance file. Both RAREMETALWORKER and rvtests can generate these summary statistics files.

This wiki page explains the formats of these files.

Input score test statistics file

Format

Header lines begin with '#' or 'CHROM'. The columns that follow the header have the following meanings:

  1. CHROM Chromosome
  2. POS Position
  3. REF Reference Allele
  4. ALT Alternative Allele
  5. N_INFORMATIVE Count of individuals with genotype and phenotype
  6. FOUNDER_AF (RAREMETALWORKER only) Allele frequency among founders
  7. ALL_AF (RAREMETALWORKER only) Allele frequency across entire sample
  8. AF (rvtests only) Allele frequency (for related samples, this is adjusted allele frequency)
  9. INFORMATIVE_ALT_AC Copies of the rare allele among samples with genotype and phenotype
  10. CALL_RATE Fraction of called genotypes
  11. HWE_PVALUE Exact Hardy-Weinberg equilibrium p-value
  12. N_REF Count of reference homozygotes
  13. N_HET Count of heterozygotes
  14. N_ALT Count of alternative allele homozygotes
  15. U_STAT Score statistic numerator
  16. SQRT_V_STAT Score statistic denominator
  17. ALT_EFFSIZE Estimated effect size
  18. PVALUE P-value
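
As a rough illustration of how columns 15, 16 and 18 relate, the sketch below recomputes a p-value from U_STAT and SQRT_V_STAT. It assumes that the score statistic is U_STAT / SQRT_V_STAT and that PVALUE is a two-sided normal p-value. This page does not state that assumption, although it appears consistent with the example rows in the following sections. The function name score_test is hypothetical and is not part of either program.

```python
import math

def score_test(u_stat, sqrt_v_stat):
    """Return (z, p) for one variant, or (nan, nan) if undefined.

    Assumption: z = U_STAT / SQRT_V_STAT and p is a two-sided normal p-value.
    """
    if sqrt_v_stat == 0 or math.isnan(u_stat) or math.isnan(sqrt_v_stat):
        # Monomorphic or missing variants have no usable statistic.
        return float("nan"), float("nan")
    z = u_stat / sqrt_v_stat
    p = math.erfc(abs(z) / math.sqrt(2.0))  # two-sided tail of N(0, 1)
    return z, p

# U_STAT and SQRT_V_STAT from the RAREMETALWORKER example row at 1:564766.
z, p = score_test(-51.7512, 43.6541)
print(f"z = {z:.4f}, p = {p:.4f}")
```

Under these assumptions the printed p-value should be close to the PVALUE of 0.235827 reported for that variant. Rows whose denominator is 0 are returned as nan rather than raising an error.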

RAREMETALWORKER

RAREMETALWORKER generates prefix.traitName.singlevar.score.txt, e.g. prefix.HDL.singlevar.score.txt

 ##ProgramName=RareMetalWorker
 ##Version=0.0.7
 ##Samples=2778
 ##AnalyzedSamples=2778
 ##Families=2778
 ##AnalyzedFamilies=2778
 ##Founders=2778
 ##AnalyzedFounders=2778
 ##Covariates=sex,age,age2,pc1,pc2
 ##CovariateSummaries    min     25th    median  75th    max     mean    variance
 ##sex   1       1       1       2       2       1.33873 0.224074
 ##age   21      54      65      73      91      63.0734 157.295
 ##age2  441     2916    4225    5329    8281    4135.5  2.33363e+06
 ##pc1   -0.1834 -0.0039 0.0016  0.0068  0.026   0.000368719     0.000147096
 ##pc2   -0.0942 -0.007  -0.001  0.0053  0.0912  -0.000418035    0.000150827
 ##InverseNormal=ON
 ##TraitSummaries        min     25th    median  75th    max     mean    variance
 ##HDL   -3.5797 -0.6737 -0.0052 0.6694  3.5797  0.000389273     0.997184
 ##AnalyzedTrait -3.56781        -0.67449        -0.000451157    0.673357        3.56781 3.01766e-11     0.999888
 ##Heritability=0%
 #CHROM  POS     REF     ALT     N_INFORMATIVE   FOUNDER_AF      ALL_AF  INFORMATIVE_ALT_AC      CALL_RATE       HWE_PVALUE      N_REF   N_HET   N_ALT   U_STAT  SQRT_V_STAT     ALT_EFFSIZE     PVALUE
 1       564766  T       C       2778    0.220122        0.220122        1223    1       0       2166    1       611     -51.7512        43.6541 -0.0271435      0.235827
 1       564862  T       C       2778    0       0       5556    1       1       0       0       2778    0       0       nan     nan
 1       565111  T       C       2778    0.0170986       0.0170986       95      1       4.19613e-102    2730    1       47      24.0897 13.6258 0.129688        0.0770708

Rvtests

Rvtests generates prefix.MetaScore.assoc.

 ##Samples=2659
 ##AnalyzedSamples=2659
 ##Families=2659
 ##AnalyzedFamilies=2659
 ##Founders=2659
 ##AnalyzedFounders=2659
 ##InverseNormal=ON
 ##TraitSummary  min     25th    median  75th    max     mean    variance
 ##Trait -3.56023        -0.679098       -0.00304512     0.682994        3.56845 0.000710616     1.00065
 ##AnalyzedTrait -3.55632        -0.674786       0       0.674786        3.55632 -1.10229e-17    0.999884
 ##Covariates=sex,age,age2,pc1,pc2
 ##CovariateSummary      min     25th    median  75th    max     mean    variance
 ##sex   1       1       1       2       2       1.34524 0.226135
 ##age   21      55      66      73      91      63.6491 150.565
 ##age2  441     3025    4356    5329    8281    4201.72 2.25458e+06
 ##pc1   -0.1946 -0.0041 0.0014  0.0064  0.0246  -0.00027815     0.000184625
 ##pc2   -0.0957 -0.0067 -0.0005 0.0062  0.0989  0.000239526     0.000188235
 CHROM   POS     REF     ALT     N_INFORMATIVE   AF      INFORMATIVE_ALT_AC      CALL_RATE       HWE_PVALUE      N_REF   N_HET   N_ALT   U_STAT  SQRT_V_STAT     ALT_EFFSIZE     PVALUE
 1       564766  T       C       2659    0.23223 1235    1       0       2041    1       617     20.546  43.5254 0.0108453       0.636894
 1       564862  T       C       2659    NA      NA      NA      NA      NA      NA      NA      NA      NA      NA      NA
 1       565111  T       C       2659    0.0161715       86      1       4.01334e-96     2616    0       43      0.457052        13.0052 0.00270229      0.971965
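
Both example files above can be read with the same logic: collect the '##' metadata lines, take the line starting with '#CHROM' or 'CHROM' as the column header, and split the remaining lines on whitespace. A minimal Python sketch follows. The function names are hypothetical, and treating 'NA' and 'nan' as missing values is an assumption based on the example rows.

```python
import io
import math

def read_score_file(handle):
    """Parse a score statistics file into (metadata_lines, rows)."""
    meta, header, rows = [], None, []
    for line in handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith("##"):
            meta.append(line[2:])  # keep metadata without the '##' prefix
        elif header is None and line.lstrip("#").startswith("CHROM"):
            header = line.lstrip("#").split()  # '#CHROM' or 'CHROM'
        elif header is not None:
            rows.append(dict(zip(header, line.split())))
    return meta, rows

def to_float(text):
    """Convert a numeric field; 'NA' and 'nan' become float('nan')."""
    return float("nan") if text in ("NA", "nan") else float(text)

# Trimmed excerpt of the rvtests example (first six columns only).
example = io.StringIO(
    "##Samples=2659\n"
    "CHROM\tPOS\tREF\tALT\tN_INFORMATIVE\tAF\n"
    "1\t564766\tT\tC\t2659\t0.23223\n"
    "1\t564862\tT\tC\t2659\tNA\n"
)
meta, rows = read_score_file(example)
print(meta, rows[0]["POS"], to_float(rows[1]["AF"]))
```

Keying each row by column name rather than by column index means the same reader works for both programs, whose allele frequency columns differ.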


Input covariance test statistic file

The covariance file contains the covariance (LD) matrix between a variant and the adjacent markers within a fixed-size window. The default window size is 1 Mb. It has the following format:

Format

  1. CHROM Chromosome
  2. CURRENT_POS (RAREMETALWORKER only) Position for the first marker in Window
  3. VAR_POS_IN_WIND (RAREMETALWORKER only) Position for the other markers in window, separated by commas
  4. COV_MATRICES (RAREMETALWORKER only) Covariance matrix between test statistics
  5. START_POS (rvtests only) Position for the first marker in sliding window
  6. END_POS (rvtests only) Position for the last marker in sliding window
  7. NUM_MARKER (rvtests only) Number of markers in the sliding window
  8. MARKER_POS (rvtests only) Positions of the markers in the sliding window, separated by commas
  9. COV (rvtests only) Covariance matrix between test statistics
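
To make the windowing rule concrete, the sketch below lists, for each marker, the markers that fall within a fixed-size window starting at that marker. The function name is hypothetical, and the inclusive upper boundary is an assumption. With a window of 1,000,000 bp it reproduces the marker lists in the example rows on this page, whose positions are used as input.

```python
from bisect import bisect_right

def markers_in_window(positions, window=1_000_000):
    """Map each position to the positions in [pos, pos + window].

    `positions` must be sorted in ascending order (one chromosome).
    """
    windows = {}
    for i, pos in enumerate(positions):
        end = bisect_right(positions, pos + window)  # first index past the window
        windows[pos] = positions[i:end]
    return windows

# Marker positions taken from the example rows on this page.
positions = [762320, 865628, 865665, 878744, 879381, 1560000, 1864659, 1877659]
for start, members in markers_in_window(positions).items():
    print(start, len(members), ",".join(map(str, members)))
```

The window is anchored at the current marker and extends only forward, so each pair of markers appears in one row only.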

RAREMETALWORKER

RAREMETALWORKER generates prefix.traitName.singlevar.cov.txt, e.g. prefix.HDL.singlevar.cov.txt

 CHR    POS        VAR_POS_IN_WINDOW                             LD_MATRIX
 1   762320     762320,865628,865665,878744,879381,1560000    0.0359084,-0.000242112,-0.00125797,-0.000993422,-0.000344509,-0.00017077,
 1   865628     865628,865665,878744,879381,1560000,1864659   0.419804,-0.0103663,-0.00635265,0.0594056,0.0534505,-0.00462183,
 1   878744     878744,879381,1560000,1864659,1877659         0.000404537,-0.000235215,-1.4455e-05,-8.69137e-06,-3.1027e-05,

Rvtests

Rvtests generates prefix.MetaCov.assoc.gz, e.g. prefix.HDL.MetaCov.assoc.gz. Later versions of rvtests will also generate a tabix index file for prefix.MetaCov.assoc.gz.

 CHR  START_POS       END_POS  NUM_MARKER        MARKER_POS                             COV
 1   762320         1560000         6     762320,865628,865665,878744,879381,1560000    0.0359084,-0.000242112,-0.00125797,-0.000993422,-0.000344509,-0.00017077
 1   865628         1864659         6     865628,865665,878744,879381,1560000,1864659   0.419804,-0.0103663,-0.00635265,0.0594056,0.0534505,-0.00462183
 1   878744         1877659         5     878744,879381,1560000,1864659,1877659         0.000404537,-0.000235215,-1.4455e-05,-8.69137e-06,-3.1027e-05
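
Either layout can be unpacked by splitting the position and covariance fields on commas. The sketch below handles one data row. The function name and the `tool` argument are hypothetical, and pairing the i-th covariance value with the i-th marker position is an assumption suggested by the equal item counts in the examples. The RAREMETALWORKER example ends its covariance list with a trailing comma, so empty items are dropped.

```python
def parse_cov_row(line, tool):
    """Return (chrom, start_pos, {marker_pos: covariance}) for one row.

    `tool` is "raremetalworker" (4 columns) or "rvtests" (6 columns).
    """
    fields = line.split()
    if tool == "raremetalworker":
        chrom, start, pos_field, cov_field = fields
    elif tool == "rvtests":
        chrom, start, _end, _num, pos_field, cov_field = fields
    else:
        raise ValueError(f"unknown tool: {tool}")
    # Drop empty items produced by a trailing comma.
    positions = [int(p) for p in pos_field.split(",") if p]
    values = [float(v) for v in cov_field.split(",") if v]
    if len(positions) != len(values):
        raise ValueError("position and covariance counts differ")
    return chrom, int(start), dict(zip(positions, values))

# Third data row of the rvtests example above.
row = ("1 878744 1877659 5 878744,879381,1560000,1864659,1877659 "
       "0.000404537,-0.000235215,-1.4455e-05,-8.69137e-06,-3.1027e-05")
chrom, start, cov = parse_cov_row(row, "rvtests")
print(chrom, start, cov[879381])
```

The length check guards against rows in which the two comma-separated fields are out of step, which would otherwise be silently truncated by `zip`.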

Contact

Please contact Dajiang Liu, Xiaowei Zhan, Shuang Feng or Goncalo Abecasis.