From Genome Analysis Wiki
473 bytes added, 12:52, 26 May 2015
Line 25: |
Line 25: |
| | | |
| [[File:Gotcloud.puzzles.v2.png|500px]] | | [[File:Gotcloud.puzzles.v2.png|500px]] |
| + | |
| | | |
| === Getting Help with GotCloud === | | === Getting Help with GotCloud === |
Line 43: |
Line 44: |
| The fastq files are processed using the [[GotCloud: Alignment Pipeline|alignment pipeline]] which finds the most likely genomic location for each read and stores that information in a [[BAM|BAM (Binary Sequence Alignment/Map format) file]]. In addition to the sequence and base quality information contained in FASTQ files, a BAM file also contains the genomic location and some additional information about the mapping. As part of the [[GotCloud: Alignment Pipeline|alignment pipeline]], the base qualities are adjusted to more accurately reflect the likelihood that the base is correct. | | The fastq files are processed using the [[GotCloud: Alignment Pipeline|alignment pipeline]] which finds the most likely genomic location for each read and stores that information in a [[BAM|BAM (Binary Sequence Alignment/Map format) file]]. In addition to the sequence and base quality information contained in FASTQ files, a BAM file also contains the genomic location and some additional information about the mapping. As part of the [[GotCloud: Alignment Pipeline|alignment pipeline]], the base qualities are adjusted to more accurately reflect the likelihood that the base is correct. |
| | | |
− | The [[GotCloud: Alignment Pipeline|alignment pipeline]] can be skipped if you already have Deduped and Recalibrated BAM files. | + | The [[GotCloud: Alignment Pipeline|alignment pipeline]] can be skipped if you already have Deduped and Recalibrated BAM files. If you have BAMs that still need to be deduped and recalibrated, you can use our [[GotCloud:_Alignment_Sub-Pipelines#recabQC_2|recabQC pipeline]]. |
| | | |
| The [[GotCloud: Variant Calling Pipeline|variant calling pipeline]] processes the deduped and recalibrated BAM files produced by the alignment pipeline or that you provide it, generating an initial list of polymorphic sites and genotypes stored in a [http://www.1000genomes.org/wiki/Analysis/Variant%20Call%20Format/vcf-variant-call-format-version-41 VCF (Variant Call Format) file]. The [[GotCloud: Variant Calling Pipeline|variant calling pipeline]] then filters the variants using both hard filters and a [[SVM Filtering|Support Vector Machine (SVM)]]. It then uses haplotype information to refine these genotypes in an updated VCF file. | | The [[GotCloud: Variant Calling Pipeline|variant calling pipeline]] processes the deduped and recalibrated BAM files produced by the alignment pipeline or that you provide it, generating an initial list of polymorphic sites and genotypes stored in a [http://www.1000genomes.org/wiki/Analysis/Variant%20Call%20Format/vcf-variant-call-format-version-41 VCF (Variant Call Format) file]. The [[GotCloud: Variant Calling Pipeline|variant calling pipeline]] then filters the variants using both hard filters and a [[SVM Filtering|Support Vector Machine (SVM)]]. It then uses haplotype information to refine these genotypes in an updated VCF file. |
Line 50: |
Line 51: |
| | | |
| [[File:GotCloudDiagram.jpg|500px]] | | [[File:GotCloudDiagram.jpg|500px]] |
| + | |
| + | |
| + | == Publication == |
| + | If you use GotCloud, please cite our publication: |
| + | [http://genome.cshlp.org/content/early/2015/04/14/gr.176552.114.abstract Jun, Goo, et al. "An efficient and scalable analysis framework for variant extraction and refinement from population scale DNA sequence data." Genome Research (2015): gr-176552.] |
| | | |
| == GotCloud Setup == | | == GotCloud Setup == |