From Genome Analysis Wiki


473 bytes added, 12:52, 26 May 2015
=== Getting Help with GotCloud ===
The FASTQ files are processed using the [[GotCloud: Alignment Pipeline|alignment pipeline]], which finds the most likely genomic location for each read and stores that information in a [[BAM|BAM (Binary Sequence Alignment/Map format) file]]. In addition to the sequence and base quality information contained in FASTQ files, a BAM file also contains the genomic location and some additional information about the mapping. As part of the [[GotCloud: Alignment Pipeline|alignment pipeline]], the base qualities are adjusted to more accurately reflect the likelihood that the base is correct.
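To make the link between a base quality and "the likelihood that the base is correct" concrete, here is a minimal sketch of the standard Phred convention, in which a quality score Q corresponds to an error probability of 10^(-Q/10). It assumes the common Phred+33 ASCII encoding for FASTQ quality strings; the function names are illustrative and are not part of GotCloud.

```python
def decode_fastq_qualities(qual_string, offset=33):
    """Convert a FASTQ quality string to a list of Phred scores.

    Assumes Phred+33 encoding (offset=33), which is an assumption
    about the input files, not something GotCloud requires here.
    """
    return [ord(ch) - offset for ch in qual_string]


def phred_to_error_prob(q):
    """Probability that a base call with Phred score q is wrong."""
    return 10 ** (-q / 10)


if __name__ == "__main__":
    for q in decode_fastq_qualities("I5!"):
        print(q, phred_to_error_prob(q))
```

Recalibration changes the scores themselves, so that the error probability implied by each score better matches the error rate actually observed.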
The [[GotCloud: Alignment Pipeline|alignment pipeline]] can be skipped if you already have deduped and recalibrated BAM files. If you have BAMs that still need to be deduped and recalibrated, you can use our [[GotCloud:_Alignment_Sub-Pipelines#recabQC_2|recabQC pipeline]].
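If you are unsure whether existing BAMs carry duplicate marks, one rough check is to look at the SAM FLAG field, where bit 0x400 means "PCR or optical duplicate". The sketch below counts flagged records in SAM-formatted text lines (for example, the text form of a BAM). The helper names are hypothetical. A count of zero does not prove that deduplication was run, since a tool may remove duplicates rather than mark them, and recalibration cannot be detected from flags at all.

```python
DUPLICATE_FLAG = 0x400  # SAM FLAG bit: PCR or optical duplicate


def is_duplicate(flag):
    """True if the SAM FLAG value has the duplicate bit set."""
    return bool(flag & DUPLICATE_FLAG)


def count_duplicates(sam_lines):
    """Return (total_records, duplicate_records) for SAM text lines.

    Header lines (starting with '@') and blank lines are skipped.
    """
    total = 0
    dups = 0
    for line in sam_lines:
        if not line.strip() or line.startswith("@"):
            continue
        flag = int(line.split("\t")[1])  # FLAG is the second column
        total += 1
        dups += is_duplicate(flag)
    return total, dups
```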
The [[GotCloud: Variant Calling Pipeline|variant calling pipeline]] processes deduped and recalibrated BAM files, either produced by the alignment pipeline or supplied by you, and generates an initial list of polymorphic sites and genotypes stored in a VCF (Variant Call Format) file. It then filters the variants using both hard filters and a [[SVM Filtering|Support Vector Machine (SVM)]], and finally uses haplotype information to refine these genotypes in an updated VCF file.
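As an illustration of what a hard filter is, the sketch below applies fixed thresholds to a single VCF data line. It is not GotCloud's filter: the thresholds (`min_qual`, `min_depth`), the filter labels, and the function names are all hypothetical, chosen only to show the idea of a fixed cutoff, in contrast to the learned decision boundary of an SVM. It relies only on the standard VCF column order and the reserved `DP` INFO key.

```python
def parse_info(info):
    """Parse a VCF INFO column into a dict; bare flags map to True."""
    fields = {}
    if info == ".":
        return fields
    for item in info.split(";"):
        key, _, value = item.partition("=")
        fields[key] = value if value else True
    return fields


def hard_filter(vcf_line, min_qual=30.0, min_depth=10):
    """Return 'PASS' or a ';'-joined list of failed (hypothetical) filters."""
    cols = vcf_line.rstrip("\n").split("\t")
    qual = cols[5]              # QUAL is the 6th VCF column
    info = parse_info(cols[7])  # INFO is the 8th VCF column
    failed = []
    if qual == "." or float(qual) < min_qual:
        failed.append("LowQual")
    if "DP" in info and int(info["DP"]) < min_depth:
        failed.append("LowDepth")
    return ";".join(failed) if failed else "PASS"
```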
== Publication ==
If you use GotCloud, please cite our publication:
Jun, Goo, et al. "An efficient and scalable analysis framework for variant extraction and refinement from population scale DNA sequence data." Genome Research (2015): gr-176552.
== GotCloud Setup ==
