Changes

Analyses of Indels (view source)

Revision as of 16:40, 20 February 2014

1,495 bytes added , 16:40, 20 February 2014

→‎Insertion/Deletion ratios, Coding Regions and Overlap analysis

Line 196: Line 196:

==Insertion/Deletion ratios, Coding Regions and Overlap analysis==

−

~~The proportion~~ of ~~frameshift Indels amongst~~ coding region indels ~~is a potential indicator of quality.~~

+

You can obtain measures of insertion/deletion ratios, coding region Indels and overlap (sensitivity) analysis by using the profile_indels analysis.

−

~~You can obtain it~~ by using the profile_indels analysis.

vt profile_indels -g indel.reference.txt -r ~/ref/vt/grch37/hs37d5.fa mills.normalized.sites.bcf

Line 231: Line 229:

Precision 0.9%

Sensitivity 0.2%

+

Ins/Del ratios: Reference alignment based methods tend to be biased towards the detection of deletions. The insertion/deletion ratio is therefore a useful measure for showing the varying degree of this bias across discovery Indel sets.
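As a rough illustration of the ratio described above, the sketch below classifies each variant by comparing REF and ALT allele lengths and divides the insertion count by the deletion count. The function names and the example alleles are hypothetical, and this is not necessarily how profile_indels computes its figure.

```python
def indel_type(ref: str, alt: str) -> str:
    """Classify a biallelic variant by comparing REF and ALT allele lengths."""
    if len(alt) > len(ref):
        return "insertion"
    if len(alt) < len(ref):
        return "deletion"
    return "other"  # same length: not a simple Indel


def ins_del_ratio(variants) -> float:
    """Return (#insertions / #deletions) for an iterable of (ref, alt) pairs."""
    variants = list(variants)
    n_ins = sum(1 for ref, alt in variants if indel_type(ref, alt) == "insertion")
    n_del = sum(1 for ref, alt in variants if indel_type(ref, alt) == "deletion")
    if n_del == 0:
        raise ValueError("no deletions found; the ratio is undefined")
    return n_ins / n_del


# Made-up alleles, for illustration only.
example = [("A", "AT"), ("AT", "A"), ("ACG", "A")]
print(ins_del_ratio(example))
```

A ratio well below 1 in a discovery set would be consistent with the deletion bias mentioned above.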

+

Coding region analysis: Coding region Indels may be categorised as frameshift Indels and non-frameshift Indels. A lower proportion of frameshift Indels may indicate a better quality data set, but this also depends on the individuals sequenced.
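The frameshift distinction can be sketched with the usual rule that an Indel shifts the reading frame when its length is not a multiple of 3. The code below assumes its input has already been restricted to Indels inside coding regions; the function names and example alleles are hypothetical.

```python
def is_frameshift(ref: str, alt: str) -> bool:
    """An Indel is a frameshift if the length change is not a multiple of 3."""
    return abs(len(alt) - len(ref)) % 3 != 0


def frameshift_proportion(coding_indels) -> float:
    """Fraction of coding region Indels, given as (ref, alt) pairs, that are frameshifts."""
    coding_indels = list(coding_indels)
    if not coding_indels:
        raise ValueError("no coding region Indels supplied")
    n_fs = sum(1 for ref, alt in coding_indels if is_frameshift(ref, alt))
    return n_fs / len(coding_indels)


# Made-up coding region Indels: two change length by 1 bp, two by 3 bp.
example = [("A", "AT"), ("ACGT", "A"), ("AT", "A"), ("A", "ATCG")]
print(frameshift_proportion(example))
```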

+

Overlap analysis: Overlap with other data sets is an indicator of sensitivity.

+

dbsnp: Contains Indels submitted from everywhere; I am not sure what this represents exactly.

+

Mills: Contains double-hit common Indels from the Mills et al. paper and is a relatively good measure of sensitivity for common variants. Because not all Indels in this set are expected to be present in your sample, this actually gives you an underestimate of sensitivity.

+

Mills chip: This is a subset of the Mills data set. It includes genotypes, which are useful for selecting the polymorphic variants present in samples shared with your data set; this can potentially provide a better estimate of sensitivity. In general it is not very useful unless you happen to be working on 1000 Genomes data or any data set whose individuals are commonly studied.

+

Affy Exome Chip: This contains somewhat rare variants in exonic regions and is useful for exome chip analysis. You should subset your exome data to exome region Indels before comparing against this data set.
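To make the overlap analysis concrete, here is a minimal sketch of comparing a call set A against one of the reference data sets B. It assumes that variants in both sets have been normalized so that identical Indels share the same (chrom, pos, ref, alt) key, that precision is the shared fraction of A, and that sensitivity is the shared fraction of B. These definitions, the function name and the example variants are assumptions for illustration, not taken from the vt source.

```python
def overlap_stats(calls, reference):
    """Compare two collections of (chrom, pos, ref, alt) keys.

    calls     -- set A, the Indels in your data set
    reference -- set B, e.g. one of the data sets described above
    """
    a, b = set(calls), set(reference)
    shared = a & b
    return {
        "A-B": len(a - b),   # only in your data set
        "A&B": len(shared),  # present in both
        "B-A": len(b - a),   # only in the reference data set
        "precision": len(shared) / len(a) if a else float("nan"),
        "sensitivity": len(shared) / len(b) if b else float("nan"),
    }


# Made-up variants, for illustration only.
calls = {("1", 100, "A", "AT"), ("1", 200, "AT", "A"), ("2", 50, "G", "GC")}
reference = {("1", 100, "A", "AT"), ("2", 75, "CT", "C")}
print(overlap_stats(calls, reference))
```

Under these assumptions, reference variants that are simply absent from your samples still count in B, which is why sensitivity against a set such as Mills comes out as an underestimate.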

==STR ==

Atks

1,102

edits
