Summary Statistics Files Specification for RAREMETAL and rvtests
Summary Files Specification for RAREMETAL
RAREMETAL uses summary statistics files to perform meta-analysis. These include (1) a score statistics file and (2) a covariance file. Both RAREMETALWORKER and rvtests can generate these summary statistics files.
This wiki page explains the formats of these files.
Input score test statistics file
Format
Header lines begin with '#' or 'CHROM'. The columns that follow the header have these meanings:
- CHROM Chromosome
- POS Position
- REF Reference Allele
- ALT Alternative Allele
- N_INFORMATIVE Count of individuals with genotype and phenotype
- FOUNDER_AF (RAREMETALWORKER only) Allele frequency among founders
- ALL_AF (RAREMETALWORKER only) Allele frequency across the entire sample
- AF (rvtests only) Allele frequency (for related samples, this is the adjusted allele frequency)
- INFORMATIVE_ALT_AC Copies of the rare allele among samples with genotype and phenotype
- CALL_RATE Fraction of called genotypes
- HWE_PVALUE Exact Hardy-Weinberg equilibrium p-value
- N_REF Count of reference homozygotes
- N_HET Count of heterozygotes
- N_ALT Count of alternative allele homozygotes
- U_STAT Score statistic numerator
- SQRT_V_STAT Score statistic denominator
- ALT_EFFSIZE Estimated effect size
- PVALUE P-value
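The layout described above can be read with a few lines of Python. This is a minimal sketch, not the parser used by RAREMETAL: the names parse_score_file, to_float, and SAMPLE_SCORE are hypothetical, fields are assumed to be separated by whitespace, and the sample text is a shortened copy of the rvtests example on this page.

```python
import io
import math


def parse_score_file(handle):
    """Split a score statistics file into metadata lines, column names, and rows."""
    meta = []        # '##' lines, stored without the leading '##'
    columns = None   # filled in when the column header line is reached
    rows = []        # one dict per variant, values kept as strings
    for raw in handle:
        line = raw.strip()
        if not line:
            continue
        if line.startswith("##"):
            meta.append(line[2:])
        elif columns is None:
            # '#CHROM ...' (RAREMETALWORKER) or 'CHROM ...' (rvtests)
            columns = line.lstrip("#").split()
            if columns[0] != "CHROM":
                raise ValueError("expected a column header starting with CHROM")
        else:
            rows.append(dict(zip(columns, line.split())))
    return meta, columns, rows


def to_float(text):
    """Convert a field to float; 'NA' and 'nan' (any case) become NaN."""
    if text.lower() in ("na", "nan"):
        return math.nan
    return float(text)


# Shortened copy of the rvtests example (assumption: whitespace-separated fields).
SAMPLE_SCORE = """\
##Samples=2659
##InverseNormal=ON
CHROM POS REF ALT N_INFORMATIVE AF INFORMATIVE_ALT_AC CALL_RATE HWE_PVALUE N_REF N_HET N_ALT U_STAT SQRT_V_STAT ALT_EFFSIZE PVALUE
1 564766 T C 2659 0.23223 1235 1 0 2041 1 617 20.546 43.5254 0.0108453 0.636894
1 564862 T C 2659 NA NA NA NA NA NA NA NA NA NA NA
"""

meta, columns, rows = parse_score_file(io.StringIO(SAMPLE_SCORE))
for row in rows:
    print(row["CHROM"], row["POS"], to_float(row["PVALUE"]))
```

Values are kept as strings until to_float is called, because uninformative variants are written as NA by rvtests and as nan by RAREMETALWORKER in the examples on this page.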
RAREMETALWORKER Example
RAREMETALWORKER generates prefix.traitName.singlevar.score.txt, e.g. prefix.HDL.singlevar.score.txt
##ProgramName=RareMetalWorker
##Version=0.0.7
##Samples=2778
##AnalyzedSamples=2778
##Families=2778
##AnalyzedFamilies=2778
##Founders=2778
##AnalyzedFounders=2778
##Covariates=sex,age,age2,pc1,pc2
##CovariateSummaries min 25th median 75th max mean variance
##sex 1 1 1 2 2 1.33873 0.224074
##age 21 54 65 73 91 63.0734 157.295
##age2 441 2916 4225 5329 8281 4135.5 2.33363e+06
##pc1 -0.1834 -0.0039 0.0016 0.0068 0.026 0.000368719 0.000147096
##pc2 -0.0942 -0.007 -0.001 0.0053 0.0912 -0.000418035 0.000150827
##InverseNormal=ON
##TraitSummaries min 25th median 75th max mean variance
##HDL -3.5797 -0.6737 -0.0052 0.6694 3.5797 0.000389273 0.997184
##AnalyzedTrait -3.56781 -0.67449 -0.000451157 0.673357 3.56781 3.01766e-11 0.999888
##Heritability=0%
#CHROM POS REF ALT N_INFORMATIVE FOUNDER_AF ALL_AF INFORMATIVE_ALT_AC CALL_RATE HWE_PVALUE N_REF N_HET N_ALT U_STAT SQRT_V_STAT ALT_EFFSIZE PVALUE
1 564766 T C 2778 0.220122 0.220122 1223 1 0 2166 1 611 -51.7512 43.6541 -0.0271435 0.235827
1 564862 T C 2778 0 0 5556 1 1 0 0 2778 0 0 nan nan
1 565111 T C 2778 0.0170986 0.0170986 95 1 4.19613e-102 2730 1 47 24.0897 13.6258 0.129688 0.0770708
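The genotype count columns in the example rows above are consistent with the allele count and allele frequency columns, which gives a quick sanity check when reading a file. The sketch below is illustrative only: the function names are hypothetical, and the relationships are observed in the first and third example rows (unrelated samples, call rate 1), not stated by the specification as guarantees.

```python
def alt_allele_count(n_het, n_alt):
    """Alternative allele copies: one per heterozygote, two per ALT homozygote."""
    return n_het + 2 * n_alt


def alt_allele_frequency(n_ref, n_het, n_alt):
    """Alternative allele frequency among the genotyped samples."""
    n_samples = n_ref + n_het + n_alt
    return alt_allele_count(n_het, n_alt) / (2 * n_samples)


# (N_INFORMATIVE, ALL_AF, INFORMATIVE_ALT_AC, N_REF, N_HET, N_ALT)
# copied from the first and third data rows of the example.
EXAMPLE_ROWS = [
    (2778, 0.220122, 1223, 2166, 1, 611),
    (2778, 0.0170986, 95, 2730, 1, 47),
]

for n_info, all_af, alt_ac, n_ref, n_het, n_alt in EXAMPLE_ROWS:
    count_ok = alt_allele_count(n_het, n_alt) == alt_ac
    total_ok = n_ref + n_het + n_alt == n_info
    freq_ok = abs(alt_allele_frequency(n_ref, n_het, n_alt) - all_af) < 1e-6
    print(count_ok, total_ok, freq_ok)
```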
Rvtests Example
Rvtests generates prefix.MetaScore.assoc, e.g. prefix.MetaScore.assoc
##Samples=2659
##AnalyzedSamples=2659
##Families=2659
##AnalyzedFamilies=2659
##Founders=2659
##AnalyzedFounders=2659
##InverseNormal=ON
##TraitSummary min 25th median 75th max mean variance
##Trait -3.56023 -0.679098 -0.00304512 0.682994 3.56845 0.000710616 1.00065
##AnalyzedTrait -3.55632 -0.674786 0 0.674786 3.55632 -1.10229e-17 0.999884
##Covariates=sex,age,age2,pc1,pc2
##CovariateSummary min 25th median 75th max mean variance
##sex 1 1 1 2 2 1.34524 0.226135
##age 21 55 66 73 91 63.6491 150.565
##age2 441 3025 4356 5329 8281 4201.72 2.25458e+06
##pc1 -0.1946 -0.0041 0.0014 0.0064 0.0246 -0.00027815 0.000184625
##pc2 -0.0957 -0.0067 -0.0005 0.0062 0.0989 0.000239526 0.000188235
CHROM POS REF ALT N_INFORMATIVE AF INFORMATIVE_ALT_AC CALL_RATE HWE_PVALUE N_REF N_HET N_ALT U_STAT SQRT_V_STAT ALT_EFFSIZE PVALUE
1 564766 T C 2659 0.23223 1235 1 0 2041 1 617 20.546 43.5254 0.0108453 0.636894
1 564862 T C 2659 NA NA NA NA NA NA NA NA NA NA NA
1 565111 T C 2659 0.0161715 86 1 4.01334e-96 2616 0 43 0.457052 13.0052 0.00270229 0.971965
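Since U_STAT and SQRT_V_STAT are the numerator and denominator of the score statistic, the test statistic and p-value can be recomputed from them. The sketch below assumes a two-sided normal p-value and the approximation effect size = U / V; neither formula is spelled out on this page, but both agree with the example rows to the precision shown. The function name score_test is hypothetical.

```python
import math


def score_test(u_stat, sqrt_v_stat):
    """Return (z, two-sided p-value, approximate effect size) for one variant."""
    z = u_stat / sqrt_v_stat
    # Two-sided normal tail probability: 2 * (1 - Phi(|z|)) = erfc(|z| / sqrt(2)).
    p_value = math.erfc(abs(z) / math.sqrt(2.0))
    # Assumption: effect size approximated as U / V, where V = SQRT_V_STAT ** 2.
    effect = u_stat / sqrt_v_stat ** 2
    return z, p_value, effect


# First data row of the rvtests example: U_STAT=20.546, SQRT_V_STAT=43.5254
# (reported ALT_EFFSIZE=0.0108453, PVALUE=0.636894).
print(score_test(20.546, 43.5254))

# First data row of the RAREMETALWORKER example: U_STAT=-51.7512, SQRT_V_STAT=43.6541
# (reported ALT_EFFSIZE=-0.0271435, PVALUE=0.235827).
print(score_test(-51.7512, 43.6541))
```

Rows where SQRT_V_STAT is 0 (monomorphic variants, reported with nan in the RAREMETALWORKER example) would need to be skipped before calling this function.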
Input covariance test statistic file
The covariance file contains the LD matrix between a variant and the adjacent markers within a fixed-size window. The default window size is 1 Mb. It has the following format:
Format
- CHROM Chromosome
- CURRENT_POS (RAREMETALWORKER only) Position for the first marker in Window
- VAR_POS_IN_WIND (RAREMETALWORKER only) Position for the other markers in window, separated by commas
- COV_MATRICES (RAREMETALWORKER only) Covariance matrix between test statistics
- START_POS (rvtests only) Position for the first marker in sliding window
- END_POS (rvtests only) Position for the last marker in sliding window
- NUM_MARKER (rvtests only) Number of markers in the sliding window
- MARKER_POS (rvtests only) Positions of the markers in the sliding window, separated by commas
- COV (rvtests only) Covariance matrix between test statistics
RAREMETALWORKER Example
RAREMETALWORKER generates prefix.traitName.singlevar.cov.txt, e.g. prefix.HDL.singlevar.cov.txt
CHR POS VAR_POS_IN_WINDOW LD_MATRIX
1 762320 762320,865628,865665,878744,879381,1560000 0.0359084,-0.000242112,-0.00125797,-0.000993422,-0.000344509,-0.00017077,
1 865628 865628,865665,878744,879381,1560000,1864659 0.419804,-0.0103663,-0.00635265,0.0594056,0.0534505,-0.00462183,
1 878744 878744,879381,1560000,1864659,1877659 0.000404537,-0.000235215,-1.4455e-05,-8.69137e-06,-3.1027e-05,
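A row of the RAREMETALWORKER covariance file can be unpacked as sketched below. This is an illustration, not RAREMETAL's own reader: the function name parse_rmw_cov_row is hypothetical, fields are assumed to be whitespace-separated, and the i-th covariance value is assumed to pair with the i-th position in the window, as the matching counts in the example rows suggest.

```python
def parse_rmw_cov_row(line):
    """Return (chrom, pos, {other_pos: covariance}) for one data row."""
    chrom, pos, positions_field, values_field = line.split()
    # The value list ends with a trailing comma in the example, so drop empty items.
    positions = [int(p) for p in positions_field.split(",") if p]
    values = [float(v) for v in values_field.split(",") if v]
    if len(positions) != len(values):
        raise ValueError("number of positions and covariance values differ")
    return chrom, int(pos), dict(zip(positions, values))


# Third data row of the example above.
RMW_ROW = (
    "1 878744 878744,879381,1560000,1864659,1877659 "
    "0.000404537,-0.000235215,-1.4455e-05,-8.69137e-06,-3.1027e-05,"
)

chrom, pos, cov = parse_rmw_cov_row(RMW_ROW)
print(chrom, pos, cov[879381])
```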
Rvtests Example
Rvtests generates prefix.MetaCov.assoc.gz, e.g. prefix.HDL.MetaCov.assoc.gz. Later versions of rvtests also generate a tabix index file, prefix.MetaCov.assoc.gz.tbi.
CHR START_POS END_POS NUM_MARKER MARKER_POS COV
1 762320 1560000 6 762320,865628,865665,878744,879381,1560000 0.0359084,-0.000242112,-0.00125797,-0.000993422,-0.000344509,-0.00017077
1 865628 1864659 6 865628,865665,878744,879381,1560000,1864659 0.419804,-0.0103663,-0.00635265,0.0594056,0.0534505,-0.00462183
1 878744 1877659 5 878744,879381,1560000,1864659,1877659 0.000404537,-0.000235215,-1.4455e-05,-8.69137e-06,-3.1027e-05
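The rvtests row carries redundant fields (START_POS, END_POS, NUM_MARKER) that can be cross-checked against MARKER_POS while parsing. The sketch below is an illustration under assumptions: parse_rvtests_cov_row is a hypothetical name, fields are assumed to be whitespace-separated, and COV is assumed to hold one value per marker as in the example rows; files whose COV field is laid out differently would need a different pairing.

```python
def parse_rvtests_cov_row(line):
    """Return (chrom, start, end, {marker_pos: covariance}) for one data row."""
    chrom, start, end, num_marker, marker_pos, cov_field = line.split()
    start, end, num_marker = int(start), int(end), int(num_marker)
    positions = [int(p) for p in marker_pos.split(",") if p]
    values = [float(v) for v in cov_field.split(",") if v]

    # Cross-check the redundant columns against the position list.
    if len(positions) != num_marker:
        raise ValueError("NUM_MARKER does not match MARKER_POS")
    if positions[0] != start or positions[-1] != end:
        raise ValueError("START_POS/END_POS do not match MARKER_POS")
    # Assumption taken from the example rows: one covariance value per marker.
    if len(values) != num_marker:
        raise ValueError("COV does not hold one value per marker")
    return chrom, start, end, dict(zip(positions, values))


# First data row of the example above.
RVTESTS_ROW = (
    "1 762320 1560000 6 762320,865628,865665,878744,879381,1560000 "
    "0.0359084,-0.000242112,-0.00125797,-0.000993422,-0.000344509,-0.00017077"
)

chrom, start, end, cov = parse_rvtests_cov_row(RVTESTS_ROW)
print(chrom, start, end, cov[865628])
```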
Contact
Please contact Dajiang Liu ([1]), Xiaowei Zhan ([2]), Shuang Feng ([3]) or Goncalo Abecasis ([4]).