From Genome Analysis Wiki
255 bytes added, 01:02, 15 June 2014
Line 40: |
Line 40: |
| # Reference genome FASTA file | | # Reference genome FASTA file |
| #* Contains the reference base for each position of each chromosome | | #* Contains the reference base for each position of each chromosome |
| + | #** Used to compare bases in sequence reads to the reference positions they mapped to |
| + | #** Identify SNPs |
| #* Additional information on the FASTA format: http://en.wikipedia.org/wiki/FASTA_format | | #* Additional information on the FASTA format: http://en.wikipedia.org/wiki/FASTA_format |
| # VCF (variant call format) files with chromosomes/positions | | # VCF (variant call format) files with chromosomes/positions |
− | #* dbsnp - used to skip known variants when recalibrating | + | #* indel - contains known insertions & deletions to help with filtering |
− | #* hapmap - used for sample contamination/sample swap validation | + | #* omni - used as likely true positives for SVM filtering |
− | #* | + | #* hapmap - used as likely true positives for SVM filtering and for generating summary statistics |
− | | + | #* dbsnp - used for generating summary statistics |
| | | |
| === GotCloud Configuration File === | | === GotCloud Configuration File === |