Changes

From Genome Analysis Wiki
50 bytes added ,  11:07, 15 February 2011
Line 52:
* The genotype file is assumed to be in SNP-major format (most binary PLINK formats are in SNP-major format by default).  
 
* IMPORTANT: For targeted sequencing data, subselect the markers so that the genotype file includes only on-target markers. Off-target markers are unlikely to have multiple non-duplicated reads at the marker position, and including them may create artifacts in the analysis.
+ * Currently, only autosomal chromosomes are supported.
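
The marker restrictions above (on-target markers only, autosomes only) can be sketched as a small pre-filtering step. This is a minimal illustration, not part of verifyBamID: the function names are hypothetical, the target regions are assumed to be given as 0-based half-open intervals (BED style), and human autosomes are assumed to be chromosomes 1 to 22. The marker lines are assumed to follow the standard six-column PLINK .bim layout (chromosome, marker ID, genetic distance, base-pair position, allele 1, allele 2).

```python
from bisect import bisect_right

# Assumption: human data, so the autosomes are chromosomes 1-22.
AUTOSOMES = {str(c) for c in range(1, 23)}


def normalize_chrom(chrom):
    """Strip an optional 'chr' prefix so '1' and 'chr1' compare equal."""
    return chrom[3:] if chrom.lower().startswith("chr") else chrom


def build_target_index(targets):
    """Build per-chromosome sorted, merged interval lists.

    targets: iterable of (chrom, start, end), 0-based half-open (assumed).
    Returns {chrom: (starts, ends)} with non-overlapping intervals.
    """
    by_chrom = {}
    for chrom, start, end in targets:
        by_chrom.setdefault(normalize_chrom(chrom), []).append((start, end))
    index = {}
    for chrom, intervals in by_chrom.items():
        intervals.sort()
        merged = []
        for start, end in intervals:
            if merged and start <= merged[-1][1]:
                # Overlapping or adjacent interval: extend the previous one.
                merged[-1][1] = max(merged[-1][1], end)
            else:
                merged.append([start, end])
        index[chrom] = ([s for s, _ in merged], [e for _, e in merged])
    return index


def on_target(index, chrom, pos):
    """Return True if the 1-based position pos lies inside a target interval."""
    if chrom not in index:
        return False
    starts, ends = index[chrom]
    zero_based = pos - 1
    i = bisect_right(starts, zero_based) - 1
    return i >= 0 and zero_based < ends[i]


def select_markers(bim_lines, targets):
    """Return IDs of markers that are autosomal and on target.

    bim_lines: lines in PLINK .bim layout (six whitespace-separated columns).
    """
    index = build_target_index(targets)
    keep = []
    for line in bim_lines:
        fields = line.split()
        if len(fields) < 6:
            continue  # skip blank or malformed lines
        chrom = normalize_chrom(fields[0])
        marker_id = fields[1]
        bp = int(fields[3])
        if chrom in AUTOSOMES and on_target(index, chrom, bp):
            keep.append(marker_id)
    return keep
```

The resulting list of marker IDs could then be written to a file and passed to PLINK's `--extract` option to produce the subsetted genotype files; filtering the .bim file alone would leave it out of sync with the binary genotype data.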
    
When genotype data for the sequenced individual are not available, sample contamination can still potentially be detected using population minor allele frequencies. If the sequence reads are better explained as a mixture of multiple individuals than as a single individual, the sample can be flagged for suspected contamination. For allele-frequency-only data, verifyBamID accepts an alternative format, which is the same as PLINK's .bim format with one additional column, as follows
