Changes

6 bytes added ,  11:55, 6 July 2016
Line 92: Line 92:     
=== Future Directions ===
- * Sample Filtering
+ * '''Sample Filtering'''
 
** We did not filter samples (based on dupRate, genome coverage, mapping rate, proper paired, mean depth, or any other QPLOT stats) prior to SNP and Indel calling, so we want to do this filtering now. 3,188 of the 3,839 samples have genome chip data from a few years ago. For these, we could compute the non-reference concordance between the chip genotypes and the sequencing genotypes and declare samples 'bad' if they fall below a certain threshold, such as 98% non-ref concordance. However, the remaining 651 samples do not have chip data, so this is not an option for them. Therefore, we decided on the following strategy instead:
 
**# Calculate non-reference concordance for the 3,188 samples that have chip data.  
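
The concordance calculation in the first step could be sketched roughly as follows. This is only an illustration, not the pipeline actually used: it assumes genotypes are coded as alternate-allele counts (0, 1, 2, or None for missing), and it assumes one common definition of non-reference concordance, namely the fraction of matching genotypes among sites where at least one of the two calls is non-reference. The function names and the 0.98 default (taken from the 98% example above) are hypothetical.

```python
from typing import Dict, List, Optional, Sequence


def non_ref_concordance(
    chip: Sequence[Optional[int]],
    seq: Sequence[Optional[int]],
) -> Optional[float]:
    """Non-reference concordance between chip and sequencing genotypes.

    Assumed definition: among sites where both genotypes are called and
    at least one of them is non-reference, the fraction that agree.
    Returns None when no site qualifies.
    """
    if len(chip) != len(seq):
        raise ValueError("chip and seq must cover the same sites")

    compared = 0
    matched = 0
    for c, s in zip(chip, seq):
        if c is None or s is None:
            continue  # skip sites with a missing call
        if c == 0 and s == 0:
            continue  # skip sites where both calls are homozygous reference
        compared += 1
        if c == s:
            matched += 1
    return matched / compared if compared else None


def flag_bad_samples(
    concordance_by_sample: Dict[str, Optional[float]],
    threshold: float = 0.98,
) -> List[str]:
    """Return the sample IDs whose concordance falls below the threshold."""
    return sorted(
        sample
        for sample, conc in concordance_by_sample.items()
        if conc is not None and conc < threshold
    )
```

Samples without chip data have no concordance value and are left unflagged by this sketch; they would need to be handled by the later steps of the strategy.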