Summary Statistics Files Specification for RAREMETAL and rvtests
Summary Files Specification for RAREMETAL
RAREMETAL uses summary statistics files to perform meta-analysis. These include (1) a score statistics file and (2) a covariance file. Both RAREMETALWORKER and rvtests can generate these summary statistics files.
This wiki page explains the formats of these files.
Input score test statistics file
Format
Header lines begin with '#' or 'CHROM'. After the header, each row contains the following columns:
- CHROM Chromosome
- POS Position
- REF Reference Allele
- ALT Alternative Allele
- N_INFORMATIVE Count of individuals with genotype and phenotype
- FOUNDER_AF (RAREMETALWORKER only) Allele frequency among founders
- ALL_AF (RAREMETALWORKER only) Allele frequency across the entire sample
- AF (rvtests only) Allele frequency (for related samples, this is the adjusted allele frequency)
- INFORMATIVE_ALT_AC Copies of the rare allele among samples with genotype and phenotype
- CALL_RATE Fraction of called genotypes
- HWE_PVALUE Exact Hardy-Weinberg equilibrium p-value
- N_REF Count of reference homozygotes
- N_HET Count of heterozygotes
- N_ALT Count of alternative allele homozygotes
- U_STAT Score statistic numerator
- SQRT_V_STAT Score statistic denominator
- ALT_EFFSIZE Estimated effect size
- PVALUE P-value
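As a rough illustration of the layout described above, the following Python sketch reads a score statistics file: it collects the '##' metadata lines, takes the column names from the first remaining line (starting with '#CHROM' or 'CHROM'), and maps NA to None. The function name `read_score_file` is hypothetical, and splitting fields on any whitespace is an assumption made for this example; neither is part of RAREMETAL or rvtests.

```python
import io

def read_score_file(handle):
    """Return (metadata, columns, rows) parsed from a score statistics file."""
    metadata, columns, rows = [], None, []
    for raw in handle:
        line = raw.strip()
        if not line:
            continue
        if line.startswith("##"):
            # Metadata line, e.g. "##Samples=2659"
            metadata.append(line[2:])
        elif columns is None:
            # Column-name line: '#CHROM ...' or 'CHROM ...'
            columns = line.lstrip("#").split()
        else:
            # Data row; values stay as strings, NA becomes None
            values = [None if v == "NA" else v for v in line.split()]
            rows.append(dict(zip(columns, values)))
    return metadata, columns, rows

# Two rows taken from the rvtests example on this page
example = io.StringIO(
    "##Samples=2659\n"
    "##InverseNormal=ON\n"
    "CHROM POS REF ALT N_INFORMATIVE AF INFORMATIVE_ALT_AC CALL_RATE HWE_PVALUE "
    "N_REF N_HET N_ALT U_STAT SQRT_V_STAT ALT_EFFSIZE PVALUE\n"
    "1 564766 T C 2659 0.23223 1235 1 0 2041 1 617 20.546 43.5254 0.0108453 0.636894\n"
    "1 564862 T C 2659 NA NA NA NA NA NA NA NA NA NA NA\n"
)
metadata, columns, rows = read_score_file(example)
print(columns[:4])
print(rows[0]["U_STAT"], rows[1]["U_STAT"])
```

Values are kept as strings so that the caller decides how to convert each column; a variant with no statistics (all NA) simply yields None for those fields.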
RAREMETALWORKER Example
RAREMETALWORKER generates prefix.traitName.singlevar.score.txt, e.g. prefix.HDL.singlevar.score.txt
##ProgramName=RareMetalWorker
##Version=0.3.4
##Samples=11619
##AnalyzedSamples=5916
##Families=1257
##AnalyzedFamilies=1117
##Founders=3611
##AnalyzedFounders=1023
##Covariates=AGE,AGE2,SEX
##CovariateSummaries min 25th median 75th max mean variance
##AGE 14 29.2 41.9 57 101.3 43.5354 311.366
##AGE2 196 852.64 1755.61 3249 10261.7 2206.65 2.77293e+06
##SEX 1 1 2 2 2 1.5764 0.244204
##InverseNormal=ON
##TraitSummaries min 25th median 75th max mean variance
##TRAIT 27.4 101.822 125.036 149.922 330.482 127.381 1264.63
##AnalyzedTrait -3.7613 -0.674224 0.000211852 0.674756 3.7613 -2.32127e-12 0.999947
##Heritability=33.953%
#CHROM POS REF ALT N_INFORMATIVE FOUNDER_AF ALL_AF INFORMATIVE_ALT_AC CALL_RATE HWE_PVALUE N_REF N_HET N_ALT U_STAT SQRT_V_STAT ALT_EFFSIZE PVALUE SE
1 762320 C T 5916 0.00635386 0.00388776 46 1 1 5870 46 0 3.42523 6.57314 0.0792763 0.602301 0.152149
1 865545 G A 5916 0.000488759 8.45166e-05 1 1 1 5915 1 0 1.32644 1.07745 1.14259 0.21829 0.928141
1 865584 G A 5916 0 0 0 1 1 5916 0 0 NA NA NA NA NA
1 865628 G A 5916 0.00782014 0.00566261 67 1 1 5849 67 0 16.4313 7.82519 0.268338 0.0357464 0.127813
Rvtests Example
Rvtests generates prefix.MetaScore.assoc.
##Samples=2659
##AnalyzedSamples=2659
##Families=2659
##AnalyzedFamilies=2659
##Founders=2659
##AnalyzedFounders=2659
##InverseNormal=ON
##TraitSummary min 25th median 75th max mean variance
##Trait -3.56023 -0.679098 -0.00304512 0.682994 3.56845 0.000710616 1.00065
##AnalyzedTrait -3.55632 -0.674786 0 0.674786 3.55632 -1.10229e-17 0.999884
##Covariates=sex,age,age2,pc1,pc2
##CovariateSummary min 25th median 75th max mean variance
##sex 1 1 1 2 2 1.34524 0.226135
##age 21 55 66 73 91 63.6491 150.565
##age2 441 3025 4356 5329 8281 4201.72 2.25458e+06
##pc1 -0.1946 -0.0041 0.0014 0.0064 0.0246 -0.00027815 0.000184625
##pc2 -0.0957 -0.0067 -0.0005 0.0062 0.0989 0.000239526 0.000188235
CHROM POS REF ALT N_INFORMATIVE AF INFORMATIVE_ALT_AC CALL_RATE HWE_PVALUE N_REF N_HET N_ALT U_STAT SQRT_V_STAT ALT_EFFSIZE PVALUE
1 564766 T C 2659 0.23223 1235 1 0 2041 1 617 20.546 43.5254 0.0108453 0.636894
1 564862 T C 2659 NA NA NA NA NA NA NA NA NA NA NA
1 565111 T C 2659 0.0161715 86 1 4.01334e-96 2616 0 43 0.457052 13.0052 0.00270229 0.971965
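The example rows appear consistent with the following relationships between the columns: the effect size equals U_STAT divided by the square of SQRT_V_STAT, and the p-value is the two-sided normal p-value of U_STAT / SQRT_V_STAT. These relationships are inferred from the example values on this page, not taken from the source code of either program, so the sketch below should be read as a sanity check under that assumption. The function name `effect_and_pvalue` is hypothetical.

```python
import math

def effect_and_pvalue(u_stat, sqrt_v_stat):
    """Derive (effect size, two-sided p-value) from U_STAT and SQRT_V_STAT.

    Assumes a normally distributed score test statistic z = U / sqrt(V).
    """
    z = u_stat / sqrt_v_stat
    effect = u_stat / sqrt_v_stat ** 2
    # Two-sided normal p-value: 2 * (1 - Phi(|z|)) == erfc(|z| / sqrt(2))
    pvalue = math.erfc(abs(z) / math.sqrt(2.0))
    return effect, pvalue

# (U_STAT, SQRT_V_STAT, reported ALT_EFFSIZE, reported PVALUE) from the examples
example_rows = [
    (20.546, 43.5254, 0.0108453, 0.636894),   # rvtests, 1:564766
    (3.42523, 6.57314, 0.0792763, 0.602301),  # RAREMETALWORKER, 1:762320
    (16.4313, 7.82519, 0.268338, 0.0357464),  # RAREMETALWORKER, 1:865628
]
for u, sqrt_v, reported_effect, reported_p in example_rows:
    effect, pvalue = effect_and_pvalue(u, sqrt_v)
    print(f"effect {effect:.6g} (reported {reported_effect}), "
          f"p {pvalue:.6g} (reported {reported_p})")
```

Small differences in the last digits are expected because the inputs in the files are already rounded.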
Input covariance test statistic file
The covariance file contains the covariances (LD matrix) between each variant and the adjacent markers within a window of predefined size. The default window size is 1 Mb. It has the following format:
Format
- CHROM Chromosome
- CURRENT_POS (RAREMETALWORKER only) Position of the first marker in the window
- VAR_POS_IN_WIND (RAREMETALWORKER only) Position for the other markers in window, separated by commas
- COV_MATRICES (RAREMETALWORKER only) Covariance matrix between test statistics
- START_POS (rvtests only) Position for the first marker in sliding window
- END_POS (rvtests only) Position for the last marker in sliding window
- NUM_MARKER (rvtests only) Number of markers in the sliding window
- MARKER_POS (rvtests only) Positions of the markers in the sliding window, separated by commas
- COV (rvtests only) Covariance matrix between test statistics
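The two layouts listed above can be read with the same small routine. The Python sketch below splits one data row into its chromosome, marker positions, and covariance values, and then pairs each covariance with the first marker of the window. That pairing (the i-th value is the covariance between the first marker and the i-th marker) is an assumption based on the column descriptions and on the fact that the examples carry one value per position. The names `parse_cov_row` and `pair_with_first` are hypothetical, as is splitting fields on whitespace.

```python
def parse_cov_row(line, layout):
    """Return (chrom, positions, covariances) from one covariance data row.

    layout: 'raremetalworker' (4 columns) or 'rvtests' (6 columns).
    """
    fields = line.split()
    if layout == "raremetalworker":
        chrom, _first_pos, pos_field, cov_field = fields
    elif layout == "rvtests":
        chrom, _start, _end, _num_marker, pos_field, cov_field = fields
    else:
        raise ValueError(f"unknown layout: {layout}")
    # RAREMETALWORKER rows can end with a trailing comma, so drop empty items.
    positions = [int(p) for p in pos_field.split(",") if p]
    covariances = [float(c) for c in cov_field.split(",") if c]
    return chrom, positions, covariances

def pair_with_first(positions, covariances):
    """Map (first marker, other marker) -> covariance (assumed pairing)."""
    first = positions[0]
    return {(first, pos): cov for pos, cov in zip(positions, covariances)}

rmw_row = ("1 762320 762320,865628,865665,878744,879381,1560000 "
           "0.0359084,-0.000242112,-0.00125797,-0.000993422,"
           "-0.000344509,-0.00017077,")
chrom, positions, covariances = parse_cov_row(rmw_row, "raremetalworker")
pairs = pair_with_first(positions, covariances)
print(chrom, len(positions), len(covariances))
print(pairs[(762320, 865628)])
```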
RAREMETALWORKER Example
RAREMETALWORKER generates prefix.traitName.singlevar.cov.txt, e.g. prefix.HDL.singlevar.cov.txt
CHR POS VAR_POS_IN_WINDOW LD_MATRIX
1 762320 762320,865628,865665,878744,879381,1560000 0.0359084,-0.000242112,-0.00125797,-0.000993422,-0.000344509,-0.00017077,
1 865628 865628,865665,878744,879381,1560000,1864659 0.419804,-0.0103663,-0.00635265,0.0594056,0.0534505,-0.00462183,
1 878744 878744,879381,1560000,1864659,1877659 0.000404537,-0.000235215,-1.4455e-05,-8.69137e-06,-3.1027e-05,
Rvtests Example
Rvtests generates prefix.MetaCov.assoc.gz, e.g. prefix.HDL.MetaCov.assoc.gz. Later versions of rvtests will also generate a tabix index file, prefix.MetaCov.assoc.gz.tbi.
CHR START_POS END_POS NUM_MARKER MARKER_POS COV
1 762320 1560000 6 762320,865628,865665,878744,879381,1560000 0.0359084,-0.000242112,-0.00125797,-0.000993422,-0.000344509,-0.00017077
1 865628 1864659 6 865628,865665,878744,879381,1560000,1864659 0.419804,-0.0103663,-0.00635265,0.0594056,0.0534505,-0.00462183
1 878744 1877659 5 878744,879381,1560000,1864659,1877659 0.000404537,-0.000235215,-1.4455e-05,-8.69137e-06,-3.1027e-05
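The rvtests columns are partly redundant, which makes a row easy to sanity-check. The sketch below assumes that START_POS and END_POS equal the first and last entries of MARKER_POS, that NUM_MARKER equals the number of positions and of covariance values, and that a window spans at most the default 1 Mb; the example rows satisfy all of these, but the checks themselves are illustrative and not an official validator. The function name `check_rvtests_cov_row` is hypothetical.

```python
WINDOW_BP = 1_000_000  # default window size (1 Mb)

def check_rvtests_cov_row(line, window_bp=WINDOW_BP):
    """Return a list of problems found in one rvtests covariance row."""
    _chrom, start, end, num_marker, pos_field, cov_field = line.split()
    start, end, num_marker = int(start), int(end), int(num_marker)
    positions = [int(p) for p in pos_field.split(",") if p]
    covariances = [float(c) for c in cov_field.split(",") if c]

    problems = []
    if len(positions) != num_marker:
        problems.append("NUM_MARKER does not match MARKER_POS")
    if len(covariances) != num_marker:
        problems.append("NUM_MARKER does not match COV")
    if positions and (positions[0] != start or positions[-1] != end):
        problems.append("START_POS/END_POS do not match MARKER_POS")
    if end - start > window_bp:
        problems.append("window is wider than expected")
    return problems

rows = [
    "1 762320 1560000 6 762320,865628,865665,878744,879381,1560000 "
    "0.0359084,-0.000242112,-0.00125797,-0.000993422,-0.000344509,-0.00017077",
    "1 878744 1877659 5 878744,879381,1560000,1864659,1877659 "
    "0.000404537,-0.000235215,-1.4455e-05,-8.69137e-06,-3.1027e-05",
]
for row in rows:
    print(check_rvtests_cov_row(row))
```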
Contact
Please contact Dajiang Liu, Xiaowei Zhan, Shuang Feng, or Goncalo Abecasis.