Changes

From Genome Analysis Wiki
There are a number of useful commands related to the analysis that are typically set early in the script.  For example, the user can choose to weight studies in the meta-analysis using the inverse of the standard error or the square root of the sample size.  For a given trait these two weights are roughly proportional, since the standard error shrinks with the square root of the sample size.  When weighting by standard error, make sure that the beta and standard error are in the same units for all studies (i.e. same trait and same transformation applied to the trait).  The default weighting scheme is SAMPLESIZE.
 
SCHEME STDERR
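As a rough illustration of how the two schemes differ, the Python sketch below combines per-study results using the usual sample-size weighted Z-score and inverse-variance formulas. Weighting effect sizes by 1/SE^2 is equivalent to weighting Z-scores by the inverse of the standard error. The function names and input values are illustrative assumptions and are not part of METAL.

```python
import math

def samplesize_meta(zscores, sample_sizes):
    """Combine per-study Z-scores using weights w_i = sqrt(N_i)."""
    weights = [math.sqrt(n) for n in sample_sizes]
    numerator = sum(w * z for w, z in zip(weights, zscores))
    return numerator / math.sqrt(sum(w * w for w in weights))

def stderr_meta(betas, stderrs):
    """Combine per-study effects using inverse-variance weights w_i = 1 / SE_i^2."""
    weights = [1.0 / (se * se) for se in stderrs]
    beta = sum(w * b for w, b in zip(weights, betas)) / sum(weights)
    se = math.sqrt(1.0 / sum(weights))
    return beta, se

# Hypothetical two-study example.
z = samplesize_meta([2.0, 1.0], [400, 100])
beta, se = stderr_meta([0.10, 0.20], [0.05, 0.10])
```

The second function only gives interpretable results when every study reports the effect and standard error on the same scale, which is why the caution above matters.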
 
METAL can apply genomic control correction to all input files.  METAL estimates the inflation of the test statistic by comparing the median test statistic to that expected by chance, and then applies the genomic control correction to the p-values (for SAMPLESIZE weighted meta-analysis) or to the standard errors (for STDERR weighted meta-analysis).  This should only be applied to files with whole genome data (i.e. it should not be used for cohorts that only genotyped replication SNPs).  Genomic control can be turned on and off for different input files.  We recommend applying genomic control correction to all input files, and also to the final output, by loading the initial results file back into METAL and applying genomic control correction to the final results.
 
GENOMICCONTROL ON
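The correction described above can be sketched in Python as follows. This is a minimal illustration of the standard genomic control calculation, not METAL's own code: the function names are hypothetical, p-values are assumed to lie strictly between 0 and 1 and to come from 1 degree of freedom tests, and the sketch assumes no correction is applied when the estimated inflation factor is below 1.

```python
from statistics import NormalDist, median

_NORMAL = NormalDist()
# Median of a chi-square distribution with 1 degree of freedom (about 0.455).
CHI2_1DF_MEDIAN = _NORMAL.inv_cdf(0.75) ** 2

def pvalue_to_chi2(p):
    """Convert a two-sided p-value to a 1 df chi-square statistic."""
    z = _NORMAL.inv_cdf(1.0 - p / 2.0)
    return z * z

def chi2_to_pvalue(chi2):
    """Convert a 1 df chi-square statistic back to a two-sided p-value."""
    return 2.0 * (1.0 - _NORMAL.cdf(chi2 ** 0.5))

def genomic_control_lambda(pvalues):
    """Inflation factor: observed median statistic over the expected median."""
    return median(pvalue_to_chi2(p) for p in pvalues) / CHI2_1DF_MEDIAN

def correct_pvalues(pvalues, lam):
    """Deflate test statistics by lambda (sample-size weighted analysis)."""
    lam = max(lam, 1.0)  # assumption: never inflate the statistics
    return [chi2_to_pvalue(pvalue_to_chi2(p) / lam) for p in pvalues]

def correct_stderrs(stderrs, lam):
    """Inflate standard errors by sqrt(lambda) (standard-error weighted analysis)."""
    lam = max(lam, 1.0)
    return [se * lam ** 0.5 for se in stderrs]

# Hypothetical p-values from one input file.
lam = genomic_control_lambda([0.01, 0.2, 0.5, 0.8])
```

Because the estimate relies on the median over many markers, it is only meaningful for files with genome-wide data.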
    
METAL will optionally keep track of the effect allele frequency across all files and provide the mean, minimum and maximum.  This can be quite useful to determine whether the frequencies are similar across different cohorts after METAL performs all strand alignment.  METAL requires all input files to have an allele frequency column when this feature is turned on.
 
AVERAGEFREQ ON
MINMAXFREQ ON
    
Then, for each individual file, the following command will be used;
 
FREQLABEL EffectAlleleFrequencyColumnHeading
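To illustrate why the frequencies are summarized after allele alignment, the Python sketch below flips the frequency for studies whose effect allele had to be swapped and then reports the mean, minimum and maximum. The function name and the unweighted mean are assumptions made for illustration; METAL's own averaging may differ.

```python
def summarize_frequencies(freqs_and_flips):
    """freqs_and_flips: list of (effect_allele_frequency, was_flipped) per study."""
    # A study whose effect allele was swapped during alignment reports the
    # frequency of the other allele, so use 1 - frequency for that study.
    aligned = [1.0 - f if flipped else f for f, flipped in freqs_and_flips]
    return sum(aligned) / len(aligned), min(aligned), max(aligned)

# Hypothetical marker seen in two studies; the second was coded on the other allele.
mean_freq, min_freq, max_freq = summarize_frequencies([(0.2, False), (0.7, True)])
```

A wide gap between the minimum and maximum for a marker is a hint that one study may have mislabelled alleles or strand.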
 
We allow users to keep cumulative counts of custom variables across input files.  An example of this might be to keep track of the sample size when performing standard-error weighted meta-analysis.  The name of the custom variable should be defined once, before input files are loaded.  The name of the heading in each file can be specified using the command LABEL for each file.  
 
CUSTOMVARIABLE TotalSampleSize
 
For each individual input file;
 
LABEL TotalSampleSize as N
      
We allow flexible input formats, including a method for providing SNPs on different strands.  Input files can contain a column which indicates the strand the alleles are coded on (given as +/-).  This feature can be turned on and off for different files in the same analysis.  If USESTRAND is off, the strand is assumed to be “+” for all SNPs, although obvious strand problems for unambiguous SNPs are identified by METAL and appropriately handled (e.g. one study provides A/G alleles and a different study provides C/T alleles).
 
USESTRAND ON
 
For each individual file;
 
STRAND StrandColumnHeading
      
METAL allows for complete output of individual summary statistics for all SNPs in all input files.  This can create a very large file and should be used with caution.  Users should create custom variables to restrict analyses to significant SNPs or specific SNPs of interest before using this option.  However, this option can be useful for comparing direction of effect across many studies since METAL takes care of all the strand flipping and provides the direction of effect relative to the same allele.  This is also a way to double-check that the expected data are being used appropriately by METAL.
 
VERBOSE ON
 
Another option controls whether METAL checks that each input file has the expected number of columns (STRICT) or tolerates cases where there are not enough columns (LENIENT).  The default is STRICT column counting.
 
COLUMNCOUNTING LENIENT
 
Tables must have column headers that specify where the mandatory input can be found.  The default name for the Marker column is ‘MARKER’, but can be changed to match the relevant input file column with the following command;
 
MARKER SNP
 
Similarly, the reference allele column, P-value column and effect column can be changed to match the input file;
 
ALLELE RefAlleleColumnHeading NonRefAlleleColumnHeading
 
PVALUE PvalueColumnHeading
 
EFFECT EffectColumnHeading
 
We strongly recommend that both allele labels, corresponding to the effect allele and non-effect allele, respectively, are given for all SNPs.  Alleles can be numeric (1,2,3,4) or alphabetical (A,C,G,T,a,c,g,t) and can be on either strand if the SNP is not an A/T or C/G SNP.  For A/T or C/G SNPs, METAL requires SNPs to be on a consistent strand in different input files for the results to be interpretable.  For A/C, A/G, C/T, and G/T SNPs, METAL will flip the strand the alleles are on if it is not consistent between input files, and METAL will output results with respect to the lowest numeric reference allele (see Examples 1, 2, and 3, below).  If all files are consistent (for example, using the HapMap allele naming conventions), the strand of the alleles is left alone.  As long as both allele columns are given for each input file, METAL appropriately accounts for situations when different input files use different reference alleles.
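The alignment logic described above can be sketched in Python. This is a simplified illustration rather than METAL's implementation: the function names are hypothetical, only alphabetical allele codes are handled, and the reference allele pair is supplied by the caller instead of being chosen automatically.

```python
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}

def is_strand_ambiguous(a1, a2):
    """A/T and C/G SNPs read the same on both strands."""
    return COMPLEMENT[a1] == a2

def align_effect(ref_pair, effect_allele, other_allele, effect):
    """Return the effect relative to ref_pair[0], or None if the alleles do not match."""
    a1, a2 = effect_allele.upper(), other_allele.upper()
    matches = (a1, a2) == ref_pair or (a2, a1) == ref_pair
    if not matches and not is_strand_ambiguous(a1, a2):
        # Unambiguous SNP reported on the other strand: take the complement.
        a1, a2 = COMPLEMENT[a1], COMPLEMENT[a2]
    if (a1, a2) == ref_pair:
        return effect        # same effect allele as the reference
    if (a2, a1) == ref_pair:
        return -effect       # alleles swapped: reverse the direction of effect
    return None              # alleles are inconsistent with the reference

# Hypothetical study reporting C/T on the reverse strand for an A/G SNP.
aligned = align_effect(("A", "G"), "C", "T", 0.3)
```

For A/T and C/G SNPs the sketch never attempts a strand flip, which mirrors the requirement that such SNPs be supplied on a consistent strand.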
 
To perform odds-ratio based meta-analysis, select SCHEME STDERR at the beginning of the script.  Then, for each file, provide the natural log of the odds ratio as the EFFECT column;
 
EFFECT logOddsRatioColumnHeading
 
Or, METAL can compute the log of the odds ratio for you;
 
EFFECT log(OddsRatioColumnHeading)
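For reference, the transformation is just the natural logarithm, as in the short Python sketch below (the function name is hypothetical). The log scale makes protective and risk effects symmetric around zero, so they can be combined like any other effect size. The standard error supplied alongside should likewise be on the log odds scale.

```python
import math

def log_odds_ratio(odds_ratio):
    """Natural log of an odds ratio; odds ratios must be positive."""
    if odds_ratio <= 0:
        raise ValueError("odds ratio must be positive")
    return math.log(odds_ratio)

# An odds ratio of 2 and its reciprocal give effects of equal size and opposite sign.
risk = log_odds_ratio(2.0)
protective = log_odds_ratio(0.5)
```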
 
The weight for each MARKER can be assigned using a column;
 
WEIGHTLABEL SampleSizeColumnHeading
 
Or;
 
WEIGHT SampleSizeColumnHeading
 
Or a default weight for the entire file can be specified.  For example, if the sample size is 2000 for all markers in an input file;
 
DEFAULTWEIGHT 2000
 
The default delimiter in METAL is WHITESPACE (a space or tab is considered a delimiter), but it can be changed to comma, tab or space.
 
SEPARATOR commas
 
Custom-designed filters can be used to select SNPs for inclusion in the meta-analysis.  This can be used to select SNPs above or below a certain value (> or < ) from any column in the table, which can be useful for including SNPs with a minor allele frequency above a certain threshold.
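The effect of such a filter can be sketched in Python as follows. This only illustrates the idea of keeping rows whose column value is above or below a threshold; the function name, column names and data are hypothetical and do not reflect METAL's filter syntax.

```python
import operator

OPERATORS = {">": operator.gt, "<": operator.lt}

def passes_filters(row, filters):
    """row: dict of column name -> value; filters: list of (column, op, threshold)."""
    return all(
        OPERATORS[op](float(row[column]), threshold)
        for column, op, threshold in filters
    )

# Hypothetical input rows and a filter keeping SNPs with MAF above 0.01.
rows = [{"SNP": "rs1", "MAF": "0.20"}, {"SNP": "rs2", "MAF": "0.005"}]
filters = [("MAF", ">", 0.01)]
kept = [row["SNP"] for row in rows if passes_filters(row, filters)]
```

Several filters can be listed together, in which case a row is kept only if it satisfies all of them.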
 
To remove filters so that they no longer apply to files processed later, use;
 
REMOVEFILTERS
 
METAL does not require that all input files have a p-value result for a marker in order to calculate a meta-analysis p-value.  Any available data is used.  To restrict the output to markers that have at least a specific weight (number of individuals), use;
 
MINWEIGHT 10000
 
For example, this restricts the output to markers with at least 10,000 individuals.
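A minimal Python sketch of this restriction is shown below. It assumes that the weight of a marker is the sum of the per-study weights (sample sizes) from the studies that report it; the function name and data are hypothetical.

```python
def markers_meeting_min_weight(study_weights, min_weight):
    """study_weights: one dict per study, mapping marker name -> weight."""
    totals = {}
    for weights in study_weights:
        for marker, weight in weights.items():
            totals[marker] = totals.get(marker, 0.0) + weight
    # Keep only markers whose accumulated weight reaches the threshold.
    return {marker: total for marker, total in totals.items() if total >= min_weight}

# Hypothetical example: rs2 is missing from the second study.
studies = [{"rs1": 6000, "rs2": 6000}, {"rs1": 5000}]
kept_markers = markers_meeting_min_weight(studies, 10000)
```

Here rs2 is dropped because only one study contributes to it, even though that study on its own would still yield a p-value.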
 