Difference between revisions of "Summary Statistics Files Specification for RAREMETAL and rvtests"

From Genome Analysis Wiki

Revision as of 12:51, 27 November 2013

Summary Files Specification for RAREMETAL

RAREMETAL uses summary statistics files to perform meta-analysis. These include (1) a score statistics file and (2) a covariance file. Both RAREMETALWORKER and rvtests can generate these summary statistics files.

This wiki page describes the formats of these files.

Input score test statistics file

Format

Header lines begin with '#' or 'CHROM'. After the header, each row describes one variant. The columns have the following meanings:

  1. CHROM Chromosome
  2. POS Position
  3. REF Reference Allele
  4. ALT Alternative Allele
  5. N_INFORMATIVE Count of individuals with genotype and phenotype
  6. FOUNDER_AF (RAREMETALWORKER only) Allele frequency among founders
  7. ALL_AF (RAREMETALWORKER only) Allele frequency across entire sample
  8. AF (rvtests only) Allele frequency (for related samples, this is adjusted allele frequency)
  9. INFORMATIVE_ALT_AC Copies of the alternative allele among samples with genotype and phenotype
  10. CALL_RATE Fraction of called genotypes
  11. HWE_PVALUE Exact Hardy-Weinberg equilibrium p-value
  12. N_REF Count of reference homozygotes
  13. N_HET Count of heterozygotes
  14. N_ALT Count of alternative allele homozygotes
  15. U_STAT Score statistic numerator
  16. SQRT_V_STAT Score statistic denominator
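  17. ALT_EFFSIZE Estimated effect size
  18. PVALUE P-value
  19. SE (appears only in the RAREMETALWORKER example header) Standard error of the estimated effect size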
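As an illustration of the layout described above, here is a minimal Python sketch of a reader for a score statistics file. The function names `parse_score_file` and `to_float` are hypothetical and not part of RAREMETAL or rvtests. The sketch assumes fields are separated by whitespace and that missing values are written as NA, as in the examples on this page.

```python
import io

def to_float(text):
    """Convert a field to float, mapping the NA marker to None."""
    return None if text == "NA" else float(text)

def parse_score_file(handle):
    """Return (metadata, column names, rows) from a score statistics file."""
    meta, columns, rows = {}, None, []
    for line in handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith("##"):
            # Metadata line; only KEY=VALUE entries are kept in this sketch.
            body = line[2:]
            if "=" in body:
                key, value = body.split("=", 1)
                meta[key] = value
            continue
        if columns is None and line.lstrip("#").startswith("CHROM"):
            # Column header: '#CHROM ...' (RAREMETALWORKER) or 'CHROM ...' (rvtests).
            columns = line.lstrip("#").split()
            continue
        if columns is None:
            raise ValueError("data row found before the column header")
        rows.append(dict(zip(columns, line.split())))
    return meta, columns, rows

# Lines taken from the RAREMETALWORKER example on this page.
SAMPLE = """##ProgramName=RareMetalWorker
##Samples=11619
##InverseNormal=ON
#CHROM  POS     REF     ALT     N_INFORMATIVE   FOUNDER_AF      ALL_AF  INFORMATIVE_ALT_AC      CALL_RATE       HWE_PVALUE      N_REF   N_HET   N_ALT   U_STAT  SQRT_V_STAT     ALT_EFFSIZE     PVALUE  SE
1       762320  C       T       5916    0.00635386      0.00388776      46      1       1       5870    46      0       3.42523 6.57314 0.0792763       0.602301        0.152149
1       865584  G       A       5916    0       0       0       1       1       5916    0       0       NA      NA      NA      NA      NA
"""

meta, columns, rows = parse_score_file(io.StringIO(SAMPLE))
```

Values are kept as strings so that the caller decides which columns to convert; `to_float(rows[1]["PVALUE"])` yields `None` for the monomorphic site, for example.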

RAREMETALWORKER Example

RAREMETALWORKER generates prefix.traitName.singlevar.score.txt, e.g. prefix.HDL.singlevar.score.txt

##ProgramName=RareMetalWorker
##Version=0.3.4
##Samples=11619
##AnalyzedSamples=5916
##Families=1257
##AnalyzedFamilies=1117
##Founders=3611
##AnalyzedFounders=1023
##Covariates=AGE,AGE2,SEX
##CovariateSummaries    min     25th    median  75th    max     mean    variance
##AGE   14      29.2    41.9    57      101.3   43.5354 311.366
##AGE2  196     852.64  1755.61 3249    10261.7 2206.65 2.77293e+06
##SEX   1       1       2       2       2       1.5764  0.244204
##InverseNormal=ON
##TraitSummaries        min     25th    median  75th    max     mean    variance
##TRAIT   27.4    101.822 125.036 149.922 330.482 127.381 1264.63
##AnalyzedTrait -3.7613 -0.674224       0.000211852     0.674756        3.7613  -2.32127e-12    0.999947
##Heritability=33.953%
#CHROM  POS     REF     ALT     N_INFORMATIVE   FOUNDER_AF      ALL_AF  INFORMATIVE_ALT_AC      CALL_RATE       HWE_PVALUE      N_REF   N_HET   N_ALT   U_STAT  SQRT_V_STAT     ALT_EFFSIZE     PVALUE  SE
1       762320  C       T       5916    0.00635386      0.00388776      46      1       1       5870    46      0       3.42523 6.57314 0.0792763       0.602301        0.152149
1       865545  G       A       5916    0.000488759     8.45166e-05     1       1       1       5915    1       0       1.32644 1.07745 1.14259 0.21829 0.928141
1       865584  G       A       5916    0       0       0       1       1       5916    0       0       NA      NA      NA      NA      NA
1       865628  G       A       5916    0.00782014      0.00566261      67      1       1       5849    67      0       16.4313 7.82519 0.268338        0.0357464       0.127813
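The rows above appear to be consistent with the usual score-test relations: effect size = U_STAT / SQRT_V_STAT^2, standard error = 1 / SQRT_V_STAT, and a two-sided normal p-value for z = U_STAT / SQRT_V_STAT. This is an observation from the example numbers, not a statement of how either program computes its output. A minimal Python sketch (the function name `score_test` is hypothetical) that reproduces the first row:

```python
import math

def score_test(u_stat, sqrt_v_stat):
    """Derive effect size, standard error and p-value from a score statistic.

    Assumes a normal approximation for the standardized score z = U / sqrt(V).
    """
    v_stat = sqrt_v_stat ** 2
    effect = u_stat / v_stat          # compare with ALT_EFFSIZE
    se = 1.0 / sqrt_v_stat            # compare with SE
    z = u_stat / sqrt_v_stat
    # Two-sided tail probability of a standard normal variable.
    p_value = math.erfc(abs(z) / math.sqrt(2.0))
    return effect, se, p_value

# First data row of the RAREMETALWORKER example (chr1:762320).
effect, se, p_value = score_test(3.42523, 6.57314)
```

The results agree with the ALT_EFFSIZE, SE and PVALUE columns of that row to about three significant digits.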

Rvtests Example

Rvtests generates prefix.MetaScore.assoc

 ##Samples=2659
 ##AnalyzedSamples=2659
 ##Families=2659
 ##AnalyzedFamilies=2659
 ##Founders=2659
 ##AnalyzedFounders=2659
 ##InverseNormal=ON
 ##TraitSummary  min     25th    median  75th    max     mean    variance
 ##Trait -3.56023        -0.679098       -0.00304512     0.682994        3.56845 0.000710616     1.00065
 ##AnalyzedTrait -3.55632        -0.674786       0       0.674786        3.55632 -1.10229e-17    0.999884
 ##Covariates=sex,age,age2,pc1,pc2
 ##CovariateSummary      min     25th    median  75th    max     mean    variance
 ##sex   1       1       1       2       2       1.34524 0.226135
 ##age   21      55      66      73      91      63.6491 150.565
 ##age2  441     3025    4356    5329    8281    4201.72 2.25458e+06
 ##pc1   -0.1946 -0.0041 0.0014  0.0064  0.0246  -0.00027815     0.000184625
 ##pc2   -0.0957 -0.0067 -0.0005 0.0062  0.0989  0.000239526     0.000188235
 CHROM   POS     REF     ALT     N_INFORMATIVE   AF      INFORMATIVE_ALT_AC      CALL_RATE       HWE_PVALUE      N_REF   N_HET   N_ALT   U_STAT  SQRT_V_STAT     ALT_EFFSIZE     PVALUE
 1       564766  T       C       2659    0.23223 1235    1       0       2041    1       617     20.546  43.5254 0.0108453       0.636894
 1       564862  T       C       2659    NA      NA      NA      NA      NA      NA      NA      NA      NA      NA      NA
 1       565111  T       C       2659    0.0161715       86      1       4.01334e-96     2616    0       43      0.457052        13.0052 0.00270229      0.971965

Input covariance test statistic file

The covariance file contains the covariance (LD) matrix between each variant and the adjacent markers within a fixed-size window. The default window size is 1 Mb. The file has the following format:

Format

  1. CHROM Chromosome
  2. CURRENT_POS (RAREMETALWORKER only) Position for the first marker in Window
  3. VAR_POS_IN_WIND (RAREMETALWORKER only) Position for the other markers in window, separated by commas
  4. COV_MATRICES (RAREMETALWORKER only) Covariance matrix between test statistics
  5. START_POS (rvtests only) Position for the first marker in sliding window
  6. END_POS (rvtests only) Position for the last marker in sliding window
  7. NUM_MARKER (rvtests only) Number of markers in the sliding window
  8. MARKER_POS (rvtests only) Positions of the markers in the sliding window, separated by commas
  9. COV (rvtests only) Covariance matrix between test statistics

RAREMETALWORKER Example

RAREMETALWORKER generates prefix.traitName.singlevar.cov.txt, e.g. prefix.HDL.singlevar.cov.txt

 CHR    POS        VAR_POS_IN_WINDOW                             LD_MATRIX
 1   762320     762320,865628,865665,878744,879381,1560000    0.0359084,-0.000242112,-0.00125797,-0.000993422,-0.000344509,-0.00017077,
 1   865628     865628,865665,878744,879381,1560000,1864659   0.419804,-0.0103663,-0.00635265,0.0594056,0.0534505,-0.00462183,
 1   878744     878744,879381,1560000,1864659,1877659         0.000404537,-0.000235215,-1.4455e-05,-8.69137e-06,-3.1027e-05,

Rvtests Example

Rvtests generates prefix.MetaCov.assoc.gz, e.g. prefix.HDL.MetaCov.assoc.gz. Later versions of rvtests will also generate a tabix index file, prefix.MetaCov.assoc.gz.tbi

 CHR  START_POS       END_POS  NUM_MARKER        MARKER_POS                             COV
 1   762320         1560000         6     762320,865628,865665,878744,879381,1560000    0.0359084,-0.000242112,-0.00125797,-0.000993422,-0.000344509,-0.00017077
 1   865628         1864659         6     865628,865665,878744,879381,1560000,1864659   0.419804,-0.0103663,-0.00635265,0.0594056,0.0534505,-0.00462183
 1   878744         1877659         5     878744,879381,1560000,1864659,1877659         0.000404537,-0.000235215,-1.4455e-05,-8.69137e-06,-3.1027e-05
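In both the RAREMETALWORKER and rvtests examples, the last two fields of each row are a comma-separated list of marker positions and a comma-separated list of covariance values of the same length. The following Python sketch pairs them up. The function name `parse_cov_row` is hypothetical, and the sketch assumes whitespace-separated fields and that a trailing comma (as in the RAREMETALWORKER example) can be ignored.

```python
def parse_cov_row(line):
    """Return (chromosome, [(position, covariance), ...]) for one data row."""
    fields = line.split()
    chrom = fields[0]
    # The last two fields hold positions and values in both layouts.
    positions = [int(p) for p in fields[-2].split(",") if p]
    values = [float(v) for v in fields[-1].split(",") if v]
    if len(positions) != len(values):
        raise ValueError("number of positions and covariances differ")
    return chrom, list(zip(positions, values))

# One row from each example on this page.
rmw_row = ("1   762320     762320,865628,865665,878744,879381,1560000    "
           "0.0359084,-0.000242112,-0.00125797,-0.000993422,-0.000344509,-0.00017077,")
rvt_row = ("1   878744         1877659         5     "
           "878744,879381,1560000,1864659,1877659         "
           "0.000404537,-0.000235215,-1.4455e-05,-8.69137e-06,-3.1027e-05")

rmw_chrom, rmw_pairs = parse_cov_row(rmw_row)
rvt_chrom, rvt_pairs = parse_cov_row(rvt_row)
```

Reading the two lists from the end of the row lets the same function handle both column layouts without knowing which program wrote the file.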

Contact

Please contact Dajiang Liu, Xiaowei Zhan, Shuang Feng or Goncalo Abecasis.