Changes

Verifying Sample Identities - Implementation (view source)

Revision as of 15:52, 13 April 2010

233 bytes added, 15:52, 13 April 2010

Line 7: Line 7:

For each sample, we would like to calculate the likelihood of a set of reads assuming that we sequenced the correct sample, assuming we sequenced a sample related to the correct sample, or assuming we sequenced an incorrect sample. We would then like to flag samples where it appears likely that the wrong sample has been sequenced.

−

If we have a list of bases that overlap a known genotype, we can ~~calculate~~ the probability of a ~~match or mismatch at each~~ base as:

+

If we have a list of bases that overlap a known genotype, we will

+

describe the probability of a matching or mismatching base using the

+

following notation:

{| width="100%" cellspacing="1" cellpadding="1" border="1" summary="Summary of Variables Used Below"

Line 15: Line 17:

| Definition

|-

−

| A/A

+

| <span class="texhtml">''A/A''</span>

| Previously known genotype; we only consider homozygous sites.

|-

Line 27: Line 29:

| Estimated error rate for the current base in the sequence data.

|}

+

Then, the probabilities of interest are:

+

<math>

+

P(match) = P_{ibd} (1 - \epsilon) + (1 - P_{ibd}) \epsilon

+

P(no match) = P_{ibd} \epsilon + (1 - P_{ibd}) (1 - \epsilon)

+

</math>
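As a rough illustration of how these per-base probabilities could be combined into a likelihood for each scenario, the sketch below sums log-probabilities over a list of observed bases. It takes the mismatch probability as the complement of P(match) and treats bases as independent. The function names, the toy data, and the three <span class="texhtml">''P<sub>ibd</sub>''</span> values (1.0 for the correct sample, 0.5 for a related sample, 0.0 for an incorrect sample) are assumptions made for this example and are not taken from the page.

```python
import math


def match_probability(p_ibd, epsilon):
    """P(match) = P_ibd * (1 - epsilon) + (1 - P_ibd) * epsilon."""
    return p_ibd * (1.0 - epsilon) + (1.0 - p_ibd) * epsilon


def log_likelihood(observations, p_ibd):
    """Sum log-probabilities over (is_match, epsilon) pairs.

    Assumes bases are independent and that every epsilon lies strictly
    between 0 and 1, so that no log(0) is evaluated.
    """
    total = 0.0
    for is_match, epsilon in observations:
        p_match = match_probability(p_ibd, epsilon)
        # The mismatch probability is taken as the complement of P(match).
        total += math.log(p_match if is_match else 1.0 - p_match)
    return total


# Hypothetical P_ibd values for the three scenarios (illustrative assumptions).
HYPOTHESES = {"correct": 1.0, "related": 0.5, "incorrect": 0.0}


def rank_hypotheses(observations):
    """Return the best-scoring scenario name and all log-likelihoods."""
    scores = {name: log_likelihood(observations, p_ibd)
              for name, p_ibd in HYPOTHESES.items()}
    best = max(scores, key=scores.get)
    return best, scores


# Toy data: nine matching bases and one mismatch, each with a 1% error rate.
observations = [(True, 0.01)] * 9 + [(False, 0.01)]
best, scores = rank_hypotheses(observations)
print(best, scores)
```

A sample could then be flagged when a scenario other than "correct" scores highest; the threshold for flagging is a design choice this sketch leaves open.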

Pha

75

edits
