From Genome Analysis Wiki
1,532 bytes added
, 11:44, 21 February 2013
Line 47: |
Line 47: |
| Commands finished in nn secs with no errors reported | | Commands finished in nn secs with no errors reported |
| | | |
− | The final BAM files produced by the mapping pipeline can be found in the files: | + | The final BAM files produced by the mapping pipeline are: |
| ls mappingResults/alignment.recal/*.recal.bam | | ls mappingResults/alignment.recal/*.recal.bam |
| | | |
− | Index files (.bai) for these BAMs are also in that directory. | + | Index files (.bai) for these BAMs are also in that directory. |
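As a quick sanity check before moving on, the sketch below confirms that every recalibrated BAM has an index next to it. The helper name `check_bam_indexes` is hypothetical and not part of the pipeline. Because the exact index naming is not stated here, it accepts either `sample.recal.bam.bai` or `sample.recal.bai`; that is an assumption.

```shell
# Hypothetical helper (not part of the pipeline): report BAMs in a directory
# that have no .bai index beside them.
# Assumption: the index is named either <name>.bam.bai or <name>.bai.
check_bam_indexes() {
    dir=$1
    missing=0
    for bam in "$dir"/*.recal.bam; do
        # Skip the unexpanded glob when the directory holds no matching BAMs
        [ -e "$bam" ] || continue
        if [ ! -e "$bam.bai" ] && [ ! -e "${bam%.bam}.bai" ]; then
            echo "missing index: $bam"
            missing=1
        fi
    done
    # Exit status 0 if every BAM is indexed, 1 otherwise
    return "$missing"
}

# Usage with the tutorial output directory (assumed to exist after mapping):
# check_bam_indexes mappingResults/alignment.recal
```

The function prints one line per BAM without an index and returns a non-zero status if any are missing, so it can be chained in a script with `&&`.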
| | | |
| The QC files for verifyBamID are: | | The QC files for verifyBamID are: |
Line 62: |
Line 62: |
| [[Understanding QPLOT output]] | | [[Understanding QPLOT output]] |
| | | |
| + | ==Generating Variant Calls== |
| + | The next step is to analyze BAM files by calling SNPs and generating a VCF file containing the results. |
| + | |
| + | The variant calling pipeline runs several built-in steps on the BAMs to generate a VCF: |
| + | # Filter out reads with low mapping quality |
| + | # Per Base Alignment Quality Adjustment (BAQ) |
| + | # Resolve overlapping paired end reads |
| + | # Generate genotype likelihood files |
| + | # Perform variant calling |
| + | # Extract features from variant sites |
| + | # Perform variant filtering |
| + | |
| + | This processing results in a single set of variant sites for all samples. |
| + | |
| + | Run the variant calling pipeline: |
| + | umake.pl --conf [[GBR60vc.conf]] --outdir vcResults --snpcall --numjobs 2 |
| + | |
| + | TBD - maybe merge both mapping & umake into a single script and have them as options. |
| + | |
| + | TBD - add link explaining the contents of the .conf & .index files. |
| + | |
| + | Upon successful completion of the variant calling pipeline, you will see the following message: |
| + | TBD |
| + | |
| + | The final VCF produced by the variant calling pipeline, containing only the variants that passed all filters, is: |
| + | ls vcResults/split/chr20/chr20.filtered.PASS.vcf.gz |
| + | |
| + | The VCF that also includes the filtered sites, with the failed filters marked in the FILTER field (or "PASS" if the site was not filtered), is: |
| + | ls vcResults/vcf/chr20/chr20.filtered.vcf.gz |
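To see how many sites passed versus failed filtering, one option is to tally the FILTER column of the VCF that keeps the filtered sites. This is a minimal sketch rather than part of the pipeline: the helper name `count_filters` is hypothetical, and it relies only on the standard VCF layout, in which FILTER is the seventh tab-separated column.

```shell
# Hypothetical helper (not part of the pipeline): count sites per FILTER value
# in a gzipped VCF. Header lines (starting with '#') are skipped.
count_filters() {
    gzip -dc "$1" |
        awk -F'\t' '!/^#/ { n[$7]++ } END { for (f in n) print f "\t" n[f] }' |
        sort
}

# Usage with the tutorial output (assumed to exist after a successful run):
# count_filters vcResults/vcf/chr20/chr20.filtered.vcf.gz
```

The count reported for `PASS` should match the number of records in the `chr20.filtered.PASS.vcf.gz` file, which gives a simple consistency check between the two outputs.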
| + | |
| + | TBD what sort of post analysis is necessary??? |
| + | |
| + | ==Linkage Disequilibrium-Aware Genotype Refinement== |
| + | |
| + | Instructions for running Beagle/Thunder. |
| + | |
| + | Maybe these should be built into |
| + | |
| + | |
| + | |
| + | = Modifying the Tutorial Inputs to Run Your Own Data = |
| + | |
| + | == Mapping Pipeline == |
| + | The inputs to the mapping pipeline are: |
| | | |
| ===Index file=== | | ===Index file=== |