Changes

From Genome Analysis Wiki
118 bytes added, 21:06, 26 January 2015
Line 142:
This section briefly summarizes the steps required to run an imputation experiment on typical GWAS samples. Before pre-phasing and imputation, users must ensure that their data have passed quality control. Standard quality control filters include excluding markers with a high missingness rate, strong deviation from Hardy-Weinberg equilibrium, high discordance rates (if duplicate copies are available), or excess Mendelian inconsistencies, and removing samples with a high missingness rate, unusual heterozygosity, a high inbreeding coefficient, clear evidence of being genetic ancestry outliers, or evidence of relatedness. All of these steps can be carried out using [http://pngu.mgh.harvard.edu/~purcell/plink/plink2.shtml PLINK]. With older genotyping platforms, low-frequency SNPs are also often excluded because they are hard to genotype accurately. With more modern genotyping arrays, the accuracy of genotype calls for low-frequency SNPs is less of a concern.
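To make two of the marker-level filters concrete, here is a minimal Python sketch of a missingness check and a Hardy-Weinberg check for a single marker. It is an illustration only, not how PLINK implements these filters. The genotype coding (allele counts 0/1/2, with `None` for a missing call), the function names, and both thresholds are assumptions chosen for the example; a chi-square cutoff of 23.93 on one degree of freedom corresponds roughly to p < 1e-6.

```python
# Illustrative marker-level QC for one marker.
# Assumptions (not from the wiki): genotypes are allele counts 0/1/2,
# None marks a missing call, and the thresholds are example values.

def call_missing_rate(genotypes):
    """Fraction of samples with no genotype call at this marker."""
    return sum(g is None for g in genotypes) / len(genotypes)


def hwe_chi_square(genotypes):
    """1-df chi-square statistic for deviation from Hardy-Weinberg proportions."""
    called = [g for g in genotypes if g is not None]
    n = len(called)
    if n == 0:
        return 0.0
    n_aa, n_ab, n_bb = called.count(0), called.count(1), called.count(2)
    p = (2 * n_aa + n_ab) / (2 * n)  # frequency of the allele counted as 0
    q = 1.0 - p
    observed = (n_aa, n_ab, n_bb)
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    # Skip genotype classes with zero expected count (monomorphic markers).
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)


def passes_marker_qc(genotypes, max_missing=0.05, max_hwe_chi2=23.93):
    """Keep a marker only if it passes both example filters."""
    if call_missing_rate(genotypes) > max_missing:
        return False
    return hwe_chi_square(genotypes) <= max_hwe_chi2


if __name__ == "__main__":
    in_equilibrium = [0] * 25 + [1] * 50 + [2] * 25
    all_heterozygous = [1] * 100
    print(passes_marker_qc(in_equilibrium))
    print(passes_marker_qc(all_heterozygous))
```

In practice these filters would be applied genome-wide with a dedicated tool such as PLINK; the sketch only shows what each filter measures.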
− Once a quality controlled dataset is available
+ Once a quality controlled dataset is available we need to pre-phase the data followed by imputation. The steps are explained below.
+
+ == Pre-Phasing the GWAS data ==
    
= Chromosome X Imputation =
 