From Genome Analysis Wiki
1,101 bytes removed, 10:22, 26 October 2016
Line 92: |
Line 92: |
| | | |
| http://genome.sph.umich.edu/wiki/Triodenovo | | http://genome.sph.umich.edu/wiki/Triodenovo |
− |
| |
− | 3. Further thoughts about filtering SNVs without BAM files (step 2 requires BAM files). There is no consensus on filtering, so this can be very flexible.
| |
− | * If you have a multi-sample VCF, it may be helpful to select mutation candidates that appear only once in the VCF (AC=1, for example). These can be the top tier to consider. Relaxing AC to 2 or 3 can recover more real mutations but also increases false positives.
| |
− | * If filtering out all known sites is too stringent, it may be helpful to select candidates that have low (e.g. <0.002) 1000G or ESP allele frequencies. Some mutations can occur at known variant sites, but mutations with high population frequencies may not be of great interest, even if they are real.
| |
− | * Candidates in segmental duplications, low complexity regions or other copy number regions may be flagged for further analysis.
| |
− | * Candidates for which the parents are not both hom-ref, or for which the offspring is a double mutant, are more likely to be artifacts, so interpreting them may require additional QC if they appear interesting to the investigators.
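The allele-count and population-frequency suggestions in the list above can be sketched in a few lines of Python. This is only an illustration, not part of Triodenovo: it assumes tab-separated VCF data lines with the standard `AC` key in the INFO column, and the population-frequency key `AF_POP` is a hypothetical placeholder for whatever 1000G or ESP annotation your VCF actually carries. Treating a missing frequency as 0 (site absent from the reference panel) is likewise an assumption.

```python
def parse_info(info_field):
    """Turn a VCF INFO string such as 'AC=1;AF_POP=0.001;DB' into a dict."""
    info = {}
    for item in info_field.split(";"):
        if not item or item == ".":
            continue
        key, sep, value = item.partition("=")
        info[key] = value if sep else True  # flag entries carry no value
    return info


def tier_candidate(vcf_line, max_ac=1, max_pop_af=0.002, pop_af_key="AF_POP"):
    """Classify one VCF data line as 'top', 'lower', or 'unknown'.

    pop_af_key is a hypothetical INFO key; substitute your own annotation name.
    """
    fields = vcf_line.rstrip("\n").split("\t")
    info = parse_info(fields[7])  # INFO is the 8th VCF column

    raw_ac = info.get("AC")
    if not isinstance(raw_ac, str):
        return "unknown"  # no allele count, so the AC rule cannot be applied
    # At multi-allelic sites AC is comma-separated; only the first ALT is used here.
    ac = int(raw_ac.split(",")[0])

    raw_af = info.get(pop_af_key)
    if isinstance(raw_af, str) and raw_af.split(",")[0] != ".":
        pop_af = float(raw_af.split(",")[0])
    else:
        pop_af = 0.0  # assumption: missing annotation means not seen in the panel

    if ac <= max_ac and pop_af < max_pop_af:
        return "top"
    return "lower"
```

Calling `tier_candidate(line, max_ac=3)` relaxes the allele-count rule as described above, trading more recovered mutations for more false positives. Flags for segmental duplications, low-complexity regions, or parental genotypes are not covered by this sketch.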
| |
| | | |
| == Download == | | == Download == |