Difference between revisions of "RareSimu"
Latest revision as of 11:11, 2 February 2017
Genetic Model-based Simulator (GMS) is an efficient C++ program for simulating case-control data sets based on genetic models. The input is a pool of haplotypes and a text file that specifies the model. The output is a set of simulated data sets in Merlin PED format.
Basic Usage Example
In a typical command line, a few options need to be specified together with the input files. Here is an example of how GMS works:
./GMS --hapfile test.hap --snplist test.lst --model model.heter.txt --f0 0.01 --nrep 100 --ncase 250 --nctrl 250 --causal --prefix tmp
Command Line Options
Basic Options
--hapfile a pool of simulated or real haplotypes, one chromosome per row
--snplist SNP names in the order of the haplotypes in hapfile, one SNP per row
--model a model file specifying genetic models, see below for details
--nrep the number of replicates
--seed seed for the random number generator
--ncase the number of cases in each replicate
--nctrl the number of controls in each replicate
--f0 overall baseline prevalence
--prefix prefix of output files (e.g. prefix.rep1.ped, prefix.rep2.ped)
--causal only generate causal SNPs in the output pedigree file
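The options above can be assembled programmatically when many runs are scripted. The following is a minimal Python sketch, not part of GMS: the helper names build_gms_command and expected_outputs are hypothetical, and the assumption that output files are numbered from rep1 up to rep<nrep> is inferred only from the two example names given for --prefix.

```python
import shlex

def build_gms_command(hapfile, snplist, model, f0, nrep, ncase, nctrl,
                      prefix, causal=False, seed=None, exe="./GMS"):
    """Assemble a GMS argument list from the documented options."""
    argv = [exe,
            "--hapfile", hapfile,
            "--snplist", snplist,
            "--model", model,
            "--f0", str(f0),
            "--nrep", str(nrep),
            "--ncase", str(ncase),
            "--nctrl", str(nctrl)]
    if seed is not None:
        argv += ["--seed", str(seed)]
    if causal:
        # In the documented example, --causal is a flag without a value.
        argv.append("--causal")
    argv += ["--prefix", prefix]
    return argv

def expected_outputs(prefix, nrep):
    """Assumed naming: prefix.rep1.ped, prefix.rep2.ped, ... up to nrep."""
    return [f"{prefix}.rep{i}.ped" for i in range(1, nrep + 1)]

argv = build_gms_command("test.hap", "test.lst", "model.heter.txt",
                         f0=0.01, nrep=100, ncase=250, nctrl=250,
                         prefix="tmp", causal=True)
print(shlex.join(argv))            # the command line as a single string
print(expected_outputs("tmp", 3))  # first three assumed output names
```

The returned list can be passed to subprocess.run once the GMS binary is available; keeping it as a list avoids shell quoting problems with unusual file names.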
Model File Annotation
The model file consists of one header line followed by multiple rows. Each row corresponds to a set of SNPs with a desired frequency range and relative risk (RR) or odds ratio (OR).
1. Heterogeneity Model
a) COUNT FREQ_MIN FREQ_MAX RR1 RR2
b) FRACTION FREQ_MIN FREQ_MAX RR1 RR2
2. Logistic Model
a) COUNT FREQ_MIN FREQ_MAX OR1 OR2
b) FRACTION FREQ_MIN FREQ_MAX OR1 OR2
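To make the layout concrete, here is a hedged Python sketch that reads a model file of this shape. It assumes the fields are whitespace-separated and that the header line carries the column names listed above; neither detail is confirmed here. The function name parse_model and the numeric values in EXAMPLE are made up for illustration.

```python
def parse_model(text):
    """Parse a model file: one header line, then one row per SNP set."""
    lines = [ln.split() for ln in text.splitlines() if ln.strip()]
    header, body = lines[0], lines[1:]
    if header[0] not in ("COUNT", "FRACTION"):
        raise ValueError(f"unexpected first column: {header[0]}")
    rows = []
    for values in body:
        if len(values) != len(header):
            raise ValueError(f"expected {len(header)} fields, got {len(values)}")
        # Map each column name to its numeric value for this row.
        row = dict(zip(header, map(float, values)))
        if not 0.0 <= row["FREQ_MIN"] <= row["FREQ_MAX"] <= 1.0:
            raise ValueError("need 0 <= FREQ_MIN <= FREQ_MAX <= 1")
        rows.append(row)
    return header, rows

# Hypothetical heterogeneity model; the numbers are illustrative only.
EXAMPLE = """\
COUNT FREQ_MIN FREQ_MAX RR1 RR2
5 0.001 0.01 2.0 4.0
10 0.01 0.05 1.5 2.25
"""

header, rows = parse_model(EXAMPLE)
for row in rows:
    print(row)
```

A logistic model file would be read the same way, with OR1 and OR2 in place of RR1 and RR2 in the header.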
How It Works
There are two underlying models. Disease status follows a Bernoulli distribution with P
1. Heterogeneity Model
2. Logistic Model
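The Bernoulli step can be illustrated independently of how P is computed. The Python sketch below takes each individual's disease probability P as given, since the formulas for the two models are not reproduced on this page, and keeps drawing individuals until the requested numbers of cases and controls are reached. This collection loop is an assumed, generic case-control sampling scheme and not necessarily what GMS does; the names draw_status and sample_case_control are hypothetical.

```python
import random

def draw_status(p, rng):
    """Bernoulli draw: 1 (affected) with probability p, otherwise 0."""
    return 1 if rng.random() < p else 0

def sample_case_control(probs, ncase, nctrl, rng):
    """Collect indices of ncase cases and nctrl controls.

    probs[i] is the disease probability P of individual i. Individuals are
    picked uniformly at random with replacement; draws beyond a filled quota
    are discarded. Note: the loop cannot terminate if every P is 0 while
    ncase > 0, or every P is 1 while nctrl > 0.
    """
    cases, ctrls = [], []
    while len(cases) < ncase or len(ctrls) < nctrl:
        i = rng.randrange(len(probs))
        if draw_status(probs[i], rng):
            if len(cases) < ncase:
                cases.append(i)
        elif len(ctrls) < nctrl:
            ctrls.append(i)
    return cases, ctrls

rng = random.Random(2017)          # fixed seed for a reproducible run
probs = [0.01, 0.05, 0.20]         # illustrative per-individual P values
cases, ctrls = sample_case_control(probs, ncase=5, nctrl=5, rng=rng)
print(len(cases), len(ctrls))
```

With a low baseline prevalence such as --f0 0.01, most draws are unaffected, so filling the case quota dominates the running time of a scheme like this one.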
Download
The current version is available for download from http://csg.sph.umich.edu//weich/GMS.tar.gz
TODO
1. Support quantitative traits.
2. Support family structures.
3. Support more "reasonable" models.