From Genome Analysis Wiki
4 bytes added, 10:47, 22 June 2010
Line 1: |
Line 1: |
− | '''genotypeIdCheck''' is a program that verifies whether the reads in a particular file match previously known genotypes for an individual (or group of individuals). | + | '''bamGenotypeCheck''' is a program that verifies whether the reads in a particular file match previously known genotypes for an individual (or group of individuals). |
| | | |
| == Usage == | | == Usage == |
Line 5: |
Line 5: |
| A key step in any genetic analysis is to verify whether data being generated matches expectations. This program checks whether reads in a BAM file match previous genotypes for a specific sample. | | A key step in any genetic analysis is to verify whether data being generated matches expectations. This program checks whether reads in a BAM file match previous genotypes for a specific sample. |
| | | |
− | Using a mathematical model that relates observed sequence reads to a hypothetical true genotype, genotypeIdCheck tries to decide whether sequence reads match a particular individual or are more likely to be contaminated (including a small proportion of foreign DNA), derived from a closely related individual, or derived from a completely different individual. | + | Using a mathematical model that relates observed sequence reads to a hypothetical true genotype, bamGenotypeCheck tries to decide whether sequence reads match a particular individual or are more likely to be contaminated (including a small proportion of foreign DNA), derived from a closely related individual, or derived from a completely different individual. |
| | | |
| == Basic Usage Example == | | == Basic Usage Example == |
Line 11: |
Line 11: |
| Here is a typical command line: | | Here is a typical command line: |
| | | |
− | genotypeIDcheck -r /data/local/ref/karma.ref/human.g1k.v37.fa \ | + | bamGenotypeCheck -r /data/local/ref/karma.ref/human.g1k.v37.fa \ |
| -k BAMfiles.txt -p test.ped -d test.dat -m test.map | | -k BAMfiles.txt -p test.ped -d test.dat -m test.map |
| | | |
Line 48: |
Line 48: |
| For each aligned base that overlaps a known genotype, we calculate the probability that it was derived from a particular known genotype. This comparison considers only bases that overlap previously known genotypes and that meet the base quality and mapping quality thresholds. | | For each aligned base that overlaps a known genotype, we calculate the probability that it was derived from a particular known genotype. This comparison considers only bases that overlap previously known genotypes and that meet the base quality and mapping quality thresholds. |
| | | |
− | Each individual in a pedigree has a different combination of genotypes, and genotypeIdCheck will systematically search for the individual whose genotypes best match the observed read data. | + | Each individual in a pedigree has a different combination of genotypes, and bamGenotypeCheck will systematically search for the individual whose genotypes best match the observed read data. |
| | | |
| For more about the technical details, see the page [[Verifying Sample Identities - Implementation]] | | For more about the technical details, see the page [[Verifying Sample Identities - Implementation]] |
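The per-base comparison and the search over individuals described above can be illustrated with a minimal Python sketch. It is not the program's actual implementation: the symmetric error model (a sequencing error is equally likely to produce any of the three other bases), the Phred-scaled base quality, and the names `base_likelihood`, `log_likelihood` and `best_match` are all assumptions made for illustration. Contamination and relatedness are not modelled here.

```python
import math

def base_likelihood(base, genotype, base_quality):
    """P(observed base | genotype) under an assumed symmetric error model."""
    # Convert an (assumed) Phred-scaled base quality to an error probability.
    error = 10 ** (-base_quality / 10)

    def per_allele(allele):
        # A correct read returns the allele; an error returns one of 3 other bases.
        return 1 - error if base == allele else error / 3

    a1, a2 = genotype
    # Each allele of a diploid genotype is assumed equally likely to be sampled.
    return 0.5 * per_allele(a1) + 0.5 * per_allele(a2)

def log_likelihood(observations, genotypes):
    """Sum log-likelihoods over bases that overlap a known genotype.

    observations: list of (site, base, base_quality) tuples
    genotypes:    dict mapping site -> (allele1, allele2)
    """
    total = 0.0
    for site, base, quality in observations:
        if site in genotypes:  # skip bases with no previously known genotype
            total += math.log(base_likelihood(base, genotypes[site], quality))
    return total

def best_match(observations, known):
    """Return the individual whose known genotypes best explain the reads."""
    return max(known, key=lambda name: log_likelihood(observations, known[name]))

# Hypothetical data: three observed bases and two candidate individuals.
observations = [("site1", "A", 30), ("site2", "C", 30), ("site3", "G", 30)]
known = {
    "ind1": {"site1": ("A", "A"), "site2": ("C", "T"), "site3": ("G", "G")},
    "ind2": {"site1": ("G", "G"), "site2": ("T", "T"), "site3": ("G", "G")},
}
print(best_match(observations, known))
```

In this toy data the reads agree with every genotype of `ind1` but conflict with `ind2` at two sites, so `ind1` has the higher log-likelihood.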
| | | |
| == TODO == | | == TODO == |