Summary Statistics Files Specification for RAREMETAL and rvtests

Latest revision as of 11:39, 29 August 2019

Summary Files Specification for RAREMETAL

RAREMETAL uses summary statistics files to perform meta-analysis. These include (1) a score statistics file and (2) a covariance file. Both RAREMETALWORKER and rvtests can generate these summary statistics files.

This wiki page explains the format of each file.

Input score test statistics file

Format

Header lines begin with '#' or 'CHROM'. After the header, each row describes one variant. The columns have the following meanings:

  1. CHROM Chromosome
  2. POS Position
  3. REF Reference allele
  4. ALT Alternative allele
  5. N_INFORMATIVE Count of individuals with genotype and phenotype
  6. FOUNDER_AF (RAREMETALWORKER only) Allele frequency among founders
  7. ALL_AF (RAREMETALWORKER only) Allele frequency across the entire sample
  8. AF (rvtests only) Allele frequency (for related samples, this is the adjusted allele frequency)
  9. INFORMATIVE_ALT_AC Copies of the alternative allele among samples with genotype and phenotype
  10. CALL_RATE Fraction of called genotypes
  11. HWE_PVALUE Exact Hardy-Weinberg equilibrium p-value
  12. N_REF Count of reference homozygotes
  13. N_HET Count of heterozygotes
  14. N_ALT Count of alternative allele homozygotes
  15. U_STAT Score statistic numerator
  16. SQRT_V_STAT Score statistic denominator
  17. ALT_EFFSIZE Estimated effect size
  18. PVALUE P-value
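
The column layout above can be read with a few lines of Python. This is a minimal sketch, not part of RAREMETAL or rvtests: the function name `read_score_file` and the choice to map `NA`/`nan` to `None` are assumptions made for illustration, and fields are split on any whitespace.

```python
import io

TEXT_COLUMNS = {"CHROM", "REF", "ALT"}  # kept as strings; everything else is numeric


def convert(name, value):
    """Convert one field according to its column name (illustrative helper)."""
    if name in TEXT_COLUMNS:
        return value
    if value.lower() in ("na", "nan"):
        return None  # missing statistic, e.g. a monomorphic site
    if name == "POS":
        return int(value)
    return float(value)


def read_score_file(handle):
    """Yield one dict per variant row, keyed by the names in the column header."""
    columns = None
    for line in handle:
        line = line.strip()
        if not line or line.startswith("##"):
            continue  # blank line or '##' metadata line
        if line.lstrip("#").startswith("CHROM"):
            columns = line.lstrip("#").split()  # '#CHROM ...' or 'CHROM ...'
            continue
        if line.startswith("#"):
            continue  # any other comment line
        if columns is None:
            raise ValueError("data row found before the column header")
        fields = line.split()
        if len(fields) != len(columns):
            raise ValueError("row has %d fields, header has %d" % (len(fields), len(columns)))
        yield {name: convert(name, value) for name, value in zip(columns, fields)}


# Two rows in the rvtests layout, taken from the example on this page.
demo = """\
##Samples=2659
CHROM POS REF ALT N_INFORMATIVE AF INFORMATIVE_ALT_AC CALL_RATE HWE_PVALUE N_REF N_HET N_ALT U_STAT SQRT_V_STAT ALT_EFFSIZE PVALUE
1 564766 T C 2659 0.23223 1235 1 0 2041 1 617 20.546 43.5254 0.0108453 0.636894
1 564862 T C 2659 NA NA NA NA NA NA NA NA NA NA NA
"""
rows = list(read_score_file(io.StringIO(demo)))
print(rows[0]["POS"], rows[0]["U_STAT"], rows[1]["PVALUE"])
```

Because rows are keyed by the header names, the same reader works for both layouts, whichever of FOUNDER_AF/ALL_AF or AF is present.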

RAREMETALWORKER Example

RAREMETALWORKER generates prefix.traitName.singlevar.score.txt, e.g. prefix.HDL.singlevar.score.txt

##ProgramName=RareMetalWorker
##Version=0.3.4
##Samples=11619
##AnalyzedSamples=5916
##Families=1257
##AnalyzedFamilies=1117
##Founders=3611
##AnalyzedFounders=1023
##Covariates=AGE,AGE2,SEX
##CovariateSummaries    min     25th    median  75th    max     mean    variance
##AGE   14      29.2    41.9    57      101.3   43.5354 311.366
##AGE2  196     852.64  1755.61 3249    10261.7 2206.65 2.77293e+06
##SEX   1       1       2       2       2       1.5764  0.244204
##InverseNormal=ON
##TraitSummaries        min     25th    median  75th    max     mean    variance
##TRAIT   27.4    101.822 125.036 149.922 330.482 127.381 1264.63
##AnalyzedTrait -3.7613 -0.674224       0.000211852     0.674756        3.7613  -2.32127e-12    0.999947
##Heritability=33.953%
#CHROM  POS     REF     ALT     N_INFORMATIVE   FOUNDER_AF      ALL_AF  INFORMATIVE_ALT_AC      CALL_RATE       HWE_PVALUE      N_REF   N_HET   N_ALT   U_STAT  SQRT_V_STAT     ALT_EFFSIZE     PVALUE  SE
1       762320  C       T       5916    0.00635386      0.00388776      46      1       1       5870    46      0       3.42523 6.57314 0.0792763       0.602301        0.152149
1       865545  G       A       5916    0.000488759     8.45166e-05     1       1       1       5915    1       0       1.32644 1.07745 1.14259 0.21829 0.928141
1       865584  G       A       5916    0       0       0       1       1       5916    0       0       NA      NA      NA      NA      NA
1       865628  G       A       5916    0.00782014      0.00566261      67      1       1       5849    67      0       16.4313 7.82519 0.268338        0.0357464       0.127813
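
The rows above are numerically consistent with the usual score-test relationships: Z = U_STAT / SQRT_V_STAT, a two-sided normal p-value, and an effect size of roughly U_STAT / SQRT_V_STAT^2. The sketch below checks the first row under that assumption. It is a sanity check on the example values only, not a description of how either program computes its output, and the function name `score_test` is hypothetical.

```python
import math


def score_test(u_stat, sqrt_v_stat):
    """Recompute Z, effect size and two-sided p-value from U_STAT and SQRT_V_STAT."""
    z = u_stat / sqrt_v_stat
    effect = u_stat / (sqrt_v_stat ** 2)          # U / V
    pvalue = math.erfc(abs(z) / math.sqrt(2.0))   # two-sided normal tail probability
    return {"z": z, "effect": effect, "pvalue": pvalue}


# First data row of the RAREMETALWORKER example (chr1:762320).
result = score_test(3.42523, 6.57314)
print(result)
```

The recomputed effect size and p-value can be compared with the ALT_EFFSIZE and PVALUE columns of the same row (0.0792763 and 0.602301).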

Rvtests Example

Rvtests generates prefix.MetaScore.assoc

 ##Samples=2659
 ##AnalyzedSamples=2659
 ##Families=2659
 ##AnalyzedFamilies=2659
 ##Founders=2659
 ##AnalyzedFounders=2659
 ##InverseNormal=ON
 ##TraitSummary  min     25th    median  75th    max     mean    variance
 ##Trait -3.56023        -0.679098       -0.00304512     0.682994        3.56845 0.000710616     1.00065
 ##AnalyzedTrait -3.55632        -0.674786       0       0.674786        3.55632 -1.10229e-17    0.999884
 ##Covariates=sex,age,age2,pc1,pc2
 ##CovariateSummary      min     25th    median  75th    max     mean    variance
 ##sex   1       1       1       2       2       1.34524 0.226135
 ##age   21      55      66      73      91      63.6491 150.565
 ##age2  441     3025    4356    5329    8281    4201.72 2.25458e+06
 ##pc1   -0.1946 -0.0041 0.0014  0.0064  0.0246  -0.00027815     0.000184625
 ##pc2   -0.0957 -0.0067 -0.0005 0.0062  0.0989  0.000239526     0.000188235
 CHROM   POS     REF     ALT     N_INFORMATIVE   AF      INFORMATIVE_ALT_AC      CALL_RATE       HWE_PVALUE      N_REF   N_HET   N_ALT   U_STAT  SQRT_V_STAT     ALT_EFFSIZE     PVALUE
 1       564766  T       C       2659    0.23223 1235    1       0       2041    1       617     20.546  43.5254 0.0108453       0.636894
 1       564862  T       C       2659    NA      NA      NA      NA      NA      NA      NA      NA      NA      NA      NA
 1       565111  T       C       2659    0.0161715       86      1       4.01334e-96     2616    0       43      0.457052        13.0052 0.00270229      0.971965
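
The `##` lines at the top of both example files carry run metadata. A minimal sketch for collecting the `key=value` entries follows; the function name `parse_metadata` is an assumption for illustration, and the whitespace-separated summary-table lines (such as the `##sex ...` rows) are deliberately skipped.

```python
def parse_metadata(lines):
    """Collect '##key=value' header entries into a dict of strings."""
    meta = {}
    for line in lines:
        line = line.strip()
        if not line.startswith("##"):
            continue  # column header or data row
        body = line[2:]
        if "=" in body:
            key, value = body.split("=", 1)
            meta[key] = value
        # lines without '=' are summary-table rows and are ignored here
    return meta


# A few header lines copied from the rvtests example above.
header = [
    "##Samples=2659",
    "##AnalyzedSamples=2659",
    "##InverseNormal=ON",
    "##Covariates=sex,age,age2,pc1,pc2",
    "##sex   1       1       1       2       2       1.34524 0.226135",
]
meta = parse_metadata(header)
covariates = meta["Covariates"].split(",")
print(meta["Samples"], covariates)
```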

Input covariance test statistic file

The covariance file contains the LD (covariance) matrix between a variant and the adjacent markers within a fixed-size window. The default window size is 1 Mb. It has the following format:

Format

  1. CHROM Chromosome
  2. CURRENT_POS (RAREMETALWORKER only) Position of the first marker in the window
  3. VAR_POS_IN_WIND (RAREMETALWORKER only) Positions of the markers in the window, separated by commas
  4. COV_MATRICES (RAREMETALWORKER only) Covariance matrix between test statistics
  5. START_POS (rvtests only) Position of the first marker in the sliding window
  6. END_POS (rvtests only) Position of the last marker in the sliding window
  7. NUM_MARKER (rvtests only) Number of markers in the sliding window
  8. MARKER_POS (rvtests only) Positions of the markers in the sliding window, separated by commas
  9. COV (rvtests only) Covariance matrix between test statistics

RAREMETALWORKER Example

RAREMETALWORKER generates prefix.traitName.singlevar.cov.txt, e.g. prefix.HDL.singlevar.cov.txt

 CHR    POS        VAR_POS_IN_WINDOW                             LD_MATRIX
 1   762320     762320,865628,865665,878744,879381,1560000    0.0359084,-0.000242112,-0.00125797,-0.000993422,-0.000344509,-0.00017077,
 1   865628     865628,865665,878744,879381,1560000,1864659   0.419804,-0.0103663,-0.00635265,0.0594056,0.0534505,-0.00462183,
 1   878744     878744,879381,1560000,1864659,1877659         0.000404537,-0.000235215,-1.4455e-05,-8.69137e-06,-3.1027e-05,
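
A row in this layout can be split into its positions and covariance values as sketched below. The sketch assumes the four whitespace-separated columns shown in the example, including the trailing comma at the end of the last column; the function name `parse_rmw_cov_line` is hypothetical.

```python
def parse_rmw_cov_line(line):
    """Parse one 'CHR POS VAR_POS_IN_WINDOW LD_MATRIX' row of the example layout."""
    chrom, pos, var_pos, ld_matrix = line.split()
    # 'if p' / 'if v' drops the empty string left by a trailing comma
    positions = [int(p) for p in var_pos.split(",") if p]
    values = [float(v) for v in ld_matrix.split(",") if v]
    if len(positions) != len(values):
        raise ValueError("number of positions and covariance values differ")
    return chrom, int(pos), dict(zip(positions, values))


# Third row of the example above.
line = ("1   878744     878744,879381,1560000,1864659,1877659         "
        "0.000404537,-0.000235215,-1.4455e-05,-8.69137e-06,-3.1027e-05,")
chrom, pos, cov = parse_rmw_cov_line(line)
print(chrom, pos, cov[879381])
```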

Rvtests Example

Rvtests generates prefix.MetaCov.assoc.gz, e.g. prefix.HDL.MetaCov.assoc.gz. Later versions of rvtests will also generate a tabix index file, prefix.MetaCov.assoc.gz.tbi

 CHR  START_POS       END_POS  NUM_MARKER        MARKER_POS                             COV
 1   762320         1560000         6     762320,865628,865665,878744,879381,1560000    0.0359084,-0.000242112,-0.00125797,-0.000993422,-0.000344509,-0.00017077
 1   865628         1864659         6     865628,865665,878744,879381,1560000,1864659   0.419804,-0.0103663,-0.00635265,0.0594056,0.0534505,-0.00462183
 1   878744         1877659         5     878744,879381,1560000,1864659,1877659         0.000404537,-0.000235215,-1.4455e-05,-8.69137e-06,-3.1027e-05
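
The rvtests rows carry enough information to check themselves: NUM_MARKER should match the length of both comma-separated lists, and START_POS and END_POS should match the first and last marker positions. The sketch below performs those checks and then builds a symmetric pair lookup. It assumes that the i-th COV value is the covariance between the first marker of the window and the i-th marker, which matches the equal list lengths in the example but is an interpretation, not a statement from the rvtests documentation. The function names are hypothetical.

```python
def parse_rvtests_cov_line(line):
    """Parse one 'CHR START_POS END_POS NUM_MARKER MARKER_POS COV' row and validate it."""
    chrom, start, end, num_marker, marker_pos, cov = line.split()
    positions = [int(p) for p in marker_pos.split(",") if p]
    values = [float(v) for v in cov.split(",") if v]
    if not (len(positions) == len(values) == int(num_marker)):
        raise ValueError("NUM_MARKER does not match the list lengths")
    if positions[0] != int(start) or positions[-1] != int(end):
        raise ValueError("START_POS/END_POS do not match MARKER_POS")
    return chrom, positions, values


def build_pair_lookup(lines):
    """Map (chrom, pos_a, pos_b) to a covariance, stored in both orders."""
    pairs = {}
    for line in lines:
        chrom, positions, values = parse_rvtests_cov_line(line)
        first = positions[0]
        for other, value in zip(positions, values):
            pairs[(chrom, first, other)] = value
            pairs[(chrom, other, first)] = value
    return pairs


# The three rows of the example above.
lines = [
    "1 762320 1560000 6 762320,865628,865665,878744,879381,1560000 "
    "0.0359084,-0.000242112,-0.00125797,-0.000993422,-0.000344509,-0.00017077",
    "1 865628 1864659 6 865628,865665,878744,879381,1560000,1864659 "
    "0.419804,-0.0103663,-0.00635265,0.0594056,0.0534505,-0.00462183",
    "1 878744 1877659 5 878744,879381,1560000,1864659,1877659 "
    "0.000404537,-0.000235215,-1.4455e-05,-8.69137e-06,-3.1027e-05",
]
pairs = build_pair_lookup(lines)
print(pairs[("1", 865628, 762320)])
```

Storing each pair in both orders keeps lookups simple at the cost of memory; for genome-wide files, keeping only one order and sorting the key would be the more economical choice.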

Contact

Please contact Dajiang Liu (dajiang@umich.edu), Xiaowei Zhan (zhanxw@umich.edu), Shuang Feng (sfengsph@umich.edu) or Goncalo Abecasis (goncalo@umich.edu).