GBR60vc.conf
This is the configuration file for the variant calling pipeline in the GotCloud Tutorial.
The configuration file contains KEY = VALUE settings that override defaults and set specific values for the given run.
CHRS = 20
BAM_INDEX = GBR60bam.index
############
# References
REF_ROOT = chr20Ref
#
REF = $(REF_ROOT)/human_g1k_v37_chr20.fa
INDEL_PREFIX = $(REF_ROOT)/1kg.pilot_release.merged.indels.sites.hg19
DBSNP_VCF = $(REF_ROOT)/dbsnp135_chr20.vcf.gz
HM3_VCF = $(REF_ROOT)/hapmap_3.3.b37.sites.chr20.vcf.gz
This configuration file sets:
- CHRS - specifies which chromosomes to process
  - Leave this out of your configuration file if you want to process all chromosomes (1-22, X, Y)
- BAM_INDEX - file containing the samples and BAMs to be processed
- Reference information:
  - AS - assembly value to put in the BAM
  - REF - the reference file (.fa extension); the additional files should be in the same location:
    - human_g1k_v37_chr20-bs.umfa
    - human_g1k_v37_chr20.fa
    - human_g1k_v37_chr20.fa.fai
  - INDEL_PREFIX - indel information
  - DBSNP_VCF - a VCF containing the dbSNP positions
  - HM3_VCF - HapMap VCF
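To illustrate the KEY = VALUE format and the $(NAME) references used in the settings above, here is a minimal Python sketch that reads such settings and expands the references. It is not GotCloud's own parser: the function names `parse_conf` and `expand` are hypothetical, and the sketch assumes that a line starting with `#` is a comment and that a reference points to a key defined in the same text.

```python
import re

def parse_conf(text):
    """Parse KEY = VALUE lines, skipping blank lines and # comment lines."""
    settings = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition("=")
        if not sep:
            continue  # not a KEY = VALUE line; ignored in this sketch
        settings[key.strip()] = value.strip()
    return settings

_REFERENCE = re.compile(r"\$\((\w+)\)")

def expand(settings, max_depth=10):
    """Replace each $(NAME) with the value of NAME; unknown names are left as-is."""
    expanded = dict(settings)
    for _ in range(max_depth):  # repeat so nested references also resolve
        changed = False
        for key, value in expanded.items():
            new = _REFERENCE.sub(
                lambda m: expanded.get(m.group(1), m.group(0)), value
            )
            if new != value:
                expanded[key] = new
                changed = True
        if not changed:
            break
    return expanded

# A subset of the tutorial settings, used only to exercise the sketch.
conf_text = """\
CHRS = 20
BAM_INDEX = GBR60bam.index
# References
REF_ROOT = chr20Ref
REF = $(REF_ROOT)/human_g1k_v37_chr20.fa
DBSNP_VCF = $(REF_ROOT)/dbsnp135_chr20.vcf.gz
"""

settings = expand(parse_conf(conf_text))
print(settings["REF"])  # chr20Ref/human_g1k_v37_chr20.fa
```

A check like this can be a quick way to confirm that every reference path resolves to the location you expect before starting a run.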
To run your own test, update BAM_INDEX to point to your index file and the reference values to point to your references.
This example uses reference files that are chr20 only in order to speed up processing of the tutorial data. If you are using the default references, you may just need to update REF_DIR to the directory where they are installed. Full reference files can be downloaded from GotCloudReference.
It is recommended that you use absolute paths. (This example uses relative paths so that it works wherever the data is installed, but relative paths require the pipeline to be run from the correct directory.)
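One way to follow the absolute-path recommendation is to resolve the relative values against a known base directory before writing your own configuration. The Python sketch below shows the idea; the helper name `absolutize` and the base directory `/data/gotcloudExample` are hypothetical and stand in for wherever your data is installed.

```python
import os

def absolutize(settings, keys, base_dir):
    """Return a copy of settings with the named path values made absolute."""
    result = dict(settings)
    for key in keys:
        if key in result and not os.path.isabs(result[key]):
            # Join the relative value onto base_dir and tidy the result.
            result[key] = os.path.normpath(os.path.join(base_dir, result[key]))
    return result

# Values taken from the tutorial configuration; the base directory is made up.
settings = {"CHRS": "20", "BAM_INDEX": "GBR60bam.index", "REF_ROOT": "chr20Ref"}
fixed = absolutize(settings, ["BAM_INDEX", "REF_ROOT"], "/data/gotcloudExample")
print(fixed["BAM_INDEX"])
print(fixed["REF_ROOT"])
```

Only the keys you name are treated as paths, so non-path settings such as CHRS are left unchanged.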
GBR60bam.index
The index file contains at least 3 tab-separated columns:
- Sample name
- Population
  - can be a comma-separated list of populations
  - specify ALL if you don't know the population or if you aren't interested in population-specific information
- BAM file name
If you have more than one BAM file for each sample, separate them by tabs.
HG00096	GBR	bams/HG00096.bam
HG00100	GBR	bams/HG00100.bam
HG00103	GBR	bams/HG00103.bam
HG00106	GBR	bams/HG00106.bam
HG00108	GBR	bams/HG00108.bam
HG00111	GBR	bams/HG00111.bam
HG00112	GBR	bams/HG00112.bam
HG00114	GBR	bams/HG00114.bam
HG00115	GBR	bams/HG00115.bam
HG00116	GBR	bams/HG00116.bam
HG00117	GBR	bams/HG00117.bam
HG00118	GBR	bams/HG00118.bam
HG00119	GBR	bams/HG00119.bam
HG00120	GBR	bams/HG00120.bam
HG00122	GBR	bams/HG00122.bam
HG00123	GBR	bams/HG00123.bam
HG00124	GBR	bams/HG00124.bam
HG00125	GBR	bams/HG00125.bam
HG00126	GBR	bams/HG00126.bam
HG00127	GBR	bams/HG00127.bam
HG00131	GBR	bams/HG00131.bam
HG00133	GBR	bams/HG00133.bam
HG00136	GBR	bams/HG00136.bam
HG00137	GBR	bams/HG00137.bam
HG00138	GBR	bams/HG00138.bam
HG00139	GBR	bams/HG00139.bam
HG00140	GBR	bams/HG00140.bam
HG00141	GBR	bams/HG00141.bam
HG00142	GBR	bams/HG00142.bam
HG00143	GBR	bams/HG00143.bam
HG00145	GBR	bams/HG00145.bam
HG00146	GBR	bams/HG00146.bam
HG00148	GBR	bams/HG00148.bam
HG00149	GBR	bams/HG00149.bam
HG00150	GBR	bams/HG00150.bam
HG00151	GBR	bams/HG00151.bam
HG00152	GBR	bams/HG00152.bam
HG00154	GBR	bams/HG00154.bam
HG00155	GBR	bams/HG00155.bam
HG00156	GBR	bams/HG00156.bam
HG00157	GBR	bams/HG00157.bam
HG00158	GBR	bams/HG00158.bam
HG00159	GBR	bams/HG00159.bam
HG00160	GBR	bams/HG00160.bam
HG00231	GBR	bams/HG00231.bam
HG00232	GBR	bams/HG00232.bam
HG00233	GBR	bams/HG00233.bam
HG00239	GBR	bams/HG00239.bam
HG00242	GBR	bams/HG00242.bam
HG00243	GBR	bams/HG00243.bam
HG00244	GBR	bams/HG00244.bam
HG00245	GBR	bams/HG00245.bam
HG00246	GBR	bams/HG00246.bam
HG00247	GBR	bams/HG00247.bam
HG00249	GBR	bams/HG00249.bam
HG00250	GBR	bams/HG00250.bam
HG00251	GBR	bams/HG00251.bam
HG00252	GBR	bams/HG00252.bam
HG00253	GBR	bams/HG00253.bam
HG00254	GBR	bams/HG00254.bam
This example uses relative paths, but for greatest flexibility, absolute paths are recommended.
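The index format described above (sample name, comma-separated populations, then one or more BAM files, all separated by tabs) can be read with a few lines of Python. This is a minimal sketch for checking your own index file, not the pipeline's own reader; `IndexEntry` and `parse_index` are hypothetical names, and the second row of the sample input (SAMPLE2 with POP1, POP2 and two BAMs) is made up to show the multi-value cases.

```python
from collections import namedtuple

IndexEntry = namedtuple("IndexEntry", ["sample", "populations", "bams"])

def parse_index(text):
    """Parse tab-separated index lines: sample, populations, one or more BAMs."""
    entries = []
    for number, line in enumerate(text.splitlines(), start=1):
        if not line.strip():
            continue  # skip blank lines
        fields = line.split("\t")
        if len(fields) < 3:
            raise ValueError(
                f"line {number}: expected at least 3 tab-separated columns, "
                f"got {len(fields)}"
            )
        sample, populations, *bams = fields
        entries.append(IndexEntry(sample, populations.split(","), bams))
    return entries

# First row is from the tutorial index; second row is a made-up illustration.
index_text = (
    "HG00096\tGBR\tbams/HG00096.bam\n"
    "SAMPLE2\tPOP1,POP2\tbams/SAMPLE2.a.bam\tbams/SAMPLE2.b.bam\n"
)

entries = parse_index(index_text)
for entry in entries:
    print(entry.sample, entry.populations, len(entry.bams))
```

Raising an error on rows with fewer than three columns catches the common mistake of separating fields with spaces instead of tabs.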