GotCloud: Versions
For information on installing the releases, see: Install GotCloud Software
For information on issues/resolutions for specific versions, see: FAQ: Version Problems
Version 1.17 (Full Release on 5/14/2015)
Source can be downloaded from: https://github.com/statgen/gotcloud/releases/tag/gotcloud.1.17
General
- Add ability to run custom pipelines
- Fix bug in libVcf/VcfFile.cpp
- Fix some compatibility issues for CentOS5
Aligner
- Add pipelines to run just recab & QC, and just QC.
- VerifyBamID
- Exclude ChrX & Y
SnpCall
Genotype Refinement
Indel
GenomeSTRiP
Version 1.16 (Full Release on 2/25/2015)
Source can be downloaded from: https://github.com/statgen/gotcloud/releases/tag/gotcloud.1.16
General
- Update the default REF to hs37d5.fa (build 37 with decoy) and the default DBSNP_VCF to dbsnp version 142.
- You can download an updated reference at: hs37d5-db142 (ftp://anonymous@share.sph.umich.edu/gotcloud/ref/hs37d5-db142-v1.tgz)
- Upgrade perl scripts to use /usr/bin/env perl instead of /usr/bin/perl to make it compatible with more users
- Upgrade to latest versions of libStatGen and bamUtil (versions 1.0.13)
- Fixes bug in calculating the MD5s for the fasta in polishBam
Aligner
- Update default aligner to bwa mem
- you can still use bwa aln (the previous default) by adding the following setting to your configuration file: MAP_TYPE = BWA
- Upgrade to bwa version 0.7.12
- No longer call verifyBamID with the --verbose option
SnpCall
Genotype Refinement
Indel
- Cleanup pipeline.pl to reduce errors in some versions of perl
GenomeSTRiP
Version 1.15 (Full Release on 12/16/2014)
General
- Rename BAM_INDEX to BAM_LIST
- Change default REF_DIR
- Add ref_dir and list as command-line options to all pipelines
- Add bed-diff script to compare VCFs
Aligner
- By default, create BAM_LIST
- Use SAMPLE instead of MERGE_NAME if MERGE_NAME is not specified in FASTQ_LIST
- No longer require fastqs to end in 'fastq.gz' or 'fastq'
- Rename INDEX_FILE to FASTQ_LIST and infer all fields except FASTQ1, FASTQ2, and either SAMPLE or MERGE_NAME
- Change --numcs to --numjobs and what was --numjobs to --threads
- Update to latest BWA
- Update aligner to pass \t instead of tabs for the RG field to the new version of BWA
- By default, no longer store OQ
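The FASTQ_LIST rules above (FASTQ1 and FASTQ2 are required, plus either SAMPLE or MERGE_NAME, with SAMPLE standing in for a missing MERGE_NAME) can be illustrated with a minimal sketch. This is not GotCloud's own parser (the pipeline scripts are written in perl). The tab-delimited layout with a header row and the function name read_fastq_list are assumptions made for illustration, and the inference of the remaining fields is left out because these notes do not describe it.

```python
import csv
import io


def read_fastq_list(text):
    """Parse a FASTQ_LIST, assumed here to be tab-delimited with a header row."""
    rows = []
    for row in csv.DictReader(io.StringIO(text), delimiter="\t"):
        # FASTQ1 and FASTQ2 are never inferred, so they must be present.
        for required in ("FASTQ1", "FASTQ2"):
            if not row.get(required):
                raise ValueError(f"missing required field {required}")
        sample = row.get("SAMPLE")
        merge_name = row.get("MERGE_NAME")
        if not sample and not merge_name:
            raise ValueError("need either SAMPLE or MERGE_NAME")
        # Use SAMPLE in place of MERGE_NAME when MERGE_NAME is not specified.
        if not merge_name:
            row["MERGE_NAME"] = sample
        rows.append(row)
    return rows


# Hypothetical two-fastq entry with no MERGE_NAME column.
example = "SAMPLE\tFASTQ1\tFASTQ2\nNA12878\ta_1.fq.gz\ta_2.fq.gz\n"
rows = read_fastq_list(example)
```

Here rows[0]["MERGE_NAME"] ends up equal to the SAMPLE value, which mirrors the fallback described in the list above.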
SnpCall
- Add validation that:
- Each BAM has only 1 sample
- BAM's sampleID matches id in BAM_LIST
- Use --ignoreSMcheck to disable this validation
- Updated Exome/Targeted settings
- Set TARGET_DIR and OFFSET_OFF_TARGET (0) in defaults
- Remove WRITE_TARGET_LOCI and base it on whether or not UNIFORM_TARGET_BED/MULTIPLE_TARGET_MAP are set and either the loci file doesn't exist, is older than the bed, or was created by a different bed
- Add validation that tabix calls in perl scripts succeed
- Fix some bugs in glfFlex & add region option
- Cleanup logs so they no longer spew to the screen
- Add ext-filt option for single sample filtering
- Add .OK file after vcflist file successfully created
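The BAM sample validation listed above (one sample per BAM, and that sample matching the ID in BAM_LIST) might look roughly like the following sketch. It works on the text of a SAM/BAM header and reads the SM tag of each @RG line, which is standard SAM header layout. The function names are hypothetical, and treating a header with no SM tag as a failure is an assumption, not something these notes state.

```python
def samples_in_header(sam_header_text):
    """Collect the distinct SM (sample) values from the @RG header lines."""
    samples = set()
    for line in sam_header_text.splitlines():
        if not line.startswith("@RG"):
            continue
        # Header fields are tab-separated TAG:VALUE pairs.
        for field in line.split("\t")[1:]:
            if field.startswith("SM:"):
                samples.add(field[3:])
    return samples


def check_bam_sample(sam_header_text, expected_id):
    """Return None if the header passes, otherwise a short error message."""
    samples = samples_in_header(sam_header_text)
    if len(samples) != 1:
        # Assumption: zero samples is rejected the same way as several.
        return f"expected exactly 1 sample, found {len(samples)}"
    (found,) = samples
    if found != expected_id:
        return f"sample '{found}' does not match BAM_LIST id '{expected_id}'"
    return None


header = "@HD\tVN:1.4\n@RG\tID:rg1\tSM:NA12878\n@RG\tID:rg2\tSM:NA12878\n"
ok = check_bam_sample(header, "NA12878")
mismatch = check_bam_sample(header, "NA12891")
```

Two read groups sharing one SM value count as a single sample, so ok is None while mismatch holds an error message.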
Genotype Refinement
- Add validation that tabix calls in perl scripts succeed
- Add .OK file after vcflist file successfully created
Indel
- Update default region settings
- Move output directories to an "indel" folder
GenomeSTRiP
- Add a GenomeSTRiP pipeline
Version 1.14 (Full Release on 8/29/2014)
General
- Add initial beagle4 support (as a new pipeline)
- Improve input validation
- Add chromosome name consistency checks to all tools
- Upgrade version of bgzf
- Upgrade libStatGen to fix mergeBam issue.
Aligner
- Cleanup reading of fastq index/info file
- ignore empty lines (generates a warning)
- compress extra tabs/trim white space
- Validate that BWA_QUAL and BWA_THREADS settings are properly formatted
SnpCall
- Replace glfMultiples with glfFlex
- Validate format of BAM_INDEX file
- Add INDEL_VCF as an alternate for INDEL_PREFIX for input indel vcfs that aren't split by chromosome.
Genotype Refinement
- Only run beagle/thunder with more than 1 sample
Indel
- mergeBams for a single sample as its own step (didn't work before)
- Fix bug that it would fail if the list of files was too long
- Add input validation
- Validate format of BAM_INDEX file
Version 1.13 (Full Release on 7/15/2014)
General
- Cleanup runcluster
- Upgrade to bamUtil v1.0.12a
- Upgrade to libStatGen v1.0.12
- Update README to add build instructions & wiki references
Aligner
- Increment to latest VerifyBamID
Variant Calling
- Update glfMultiples to handle when first glf is empty
- Add check for the output file before creating the .OK file
- VcfPileup - improve return codes
- Write jobfiles into a sub-directory
- Added a snpcall monitoring utility
- VcfSplit - update to only append .gz in the vcflist if there was at least one file
- Write start/stop timestamps into a logfile (generated by runcluster)
Genotype Refinement
- Update beagle2Vcf.pl to use 255 for missing PL/PL3 values
- Update vcf2Beagle and beagle2Vcf to handle biallelic indels
- Still doesn't handle any multiallelic variants
- Added a ldrefine test.
Indel Calling
- Initial version of Indel Caller
- Still in testing phases, if you use, please provide feedback.
Version 1.12 (Full Release on 1/17/2014)
General
- GotCloud now works when installed in a bin/ directory.
- Add tabix source and build & bgzip build
- Add some Copyright information
- Fix printing of a failed run's return code
- Upgrade to latest libStatGen & bamUtil. See links for version details.
- Slightly newer than 1.0.10 for both - versions on 1/17/2014.
- dedup & recab now ignore Secondary reads
- mergeBam ignores PI header field when merging
- Add PhoneHome - gotCloud applies a PhoneHome thinning (BAMUTIL_THINNING) defaulted to 10 (10% of the time bamUtil does PhoneHome)
- Upgrade QPLOT to ignore secondary reads
- samtools
- Update samtools index to return an error code if it fails to build the index
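As a rough illustration of what a PhoneHome thinning value of 10 means (PhoneHome runs about 10% of the time), here is a minimal sketch. It is not bamUtil's implementation. The function name and the use of a uniform random draw are assumptions made only to show the intended frequency.

```python
import random


def should_phone_home(thinning_percent=10, rng=random):
    """Return True roughly thinning_percent% of the time (illustrative only)."""
    # rng.random() is uniform on [0, 1), so 0 never fires and 100 always does.
    return rng.random() * 100 < thinning_percent


# Count how often a thinning of 10 fires over many simulated runs.
seeded = random.Random(0)
hits = sum(should_phone_home(10, seeded) for _ in range(10000))
```

With the default of 10, hits should land near 1,000 out of 10,000 simulated runs.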
Aligner
- Upgrade BWA
- BWA_MEM is now an option
- Write timestamps to Makefile log as steps start & complete
- Remove tmp files as gotCloud goes, rather than at the end.
- Deprecate RUN_QPLOT & RUN_VERIFY_BAM_ID
- Now the steps to run are specified in configuration.
- Mosaik
- Upgrade to version from Oct 29, 2013
- Add premo for pre-Mosaik processing
Variant Calling
- Update to properly handle empty VCFs
- Run make with -k option to run as much as possible after a failure.
- Update to allow steps to be dependent on BAMs (BAM_DEPEND) so they will rerun if a BAM has a newer timestamp.
- Input Validation
- Check that BAMs exist & are not empty prior to running steps that require BAMs.
- Check that filters min/maxDP are numbers, not fractions.
- GlfMultiples
- update to use DP instead of GD and fix PL description in format field header
- add region option
- samtools-hybrid
- fail on missing BGZF EOF indicator
Genotype Refinement
- Add a default number of states to Thunder
Version 1.11 (Full Release on 9/6/2013)
Aligner
- Remove an extra space from the Makefile for the dedup command.
- Brought in latest bwa source, but it is not yet being used.
Variant Calling
- Rename OUT_PREFIX to MAKE_BASE_NAME to specify the base filename for snpcall, ldrefine (beagle & thunder), & vc Makefiles. The typeOfRun.Makefile is appended to MAKE_BASE_NAME.
- These Makefiles all used to have the same name and would overwrite each other
- --makebasename/--make_basename/--make_base_name can be specified on the command-line
- Default value for MAKE_BASE_NAME is umake
- snpcall is now: $(MAKE_BASE_NAME).snpcall.Makefile (default umake.snpcall.Makefile)
- ldrefine beagle step is now: $(MAKE_BASE_NAME).beagle.Makefile (default umake.beagle.Makefile)
- ldrefine thunder step is now: $(MAKE_BASE_NAME).thunder.Makefile (default umake.thunder.Makefile)
- vc is now: $(MAKE_BASE_NAME).vc.Makefile (default umake.vc.Makefile)
- Added gotcloud beagle and gotcloud thunder commands so that beagle/thunder can be called independently rather than just through ldrefine.
- Add command-line options to gotcloud vc for running just certain steps rather than having to set RUN...=true in the configuration
- More than one --commandToRun can be specified at once
- New command-line options:
- --index (or RUN_INDEX = true in the configuration file)
- --pileup (or RUN_PILEUP = true in the configuration file)
- --glfMultiples (or RUN_GLFMULTIPLES = true in the configuration file)
- --vcfPileup (or RUN_VCFPILEUP = true in the configuration file)
- --filter (or RUN_FILTER = true in the configuration file)
- --svm (or RUN_SVM = true in the configuration file)
- --split (or RUN_SPLIT = true in the configuration file)
- Cleaned up the snpcall Makefile entries for pileup. It used to print targets/commands that were never executed. These unused targets have now been removed
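The MAKE_BASE_NAME naming scheme described above (the typeOfRun.Makefile suffix appended to a base name that defaults to umake) can be sketched as follows. The helper name makefile_name is hypothetical. Only the naming pattern and the default come from the notes above.

```python
def makefile_name(run_type, make_base_name="umake"):
    """Build the Makefile name for one pipeline step from MAKE_BASE_NAME."""
    return f"{make_base_name}.{run_type}.Makefile"


# Default names for the four run types listed above.
defaults = [makefile_name(t) for t in ("snpcall", "beagle", "thunder", "vc")]

# A custom base name, as would be set via --makebasename or MAKE_BASE_NAME.
custom = makefile_name("snpcall", make_base_name="myrun")
```

Because the run type is part of every name, the four Makefiles no longer overwrite each other when written to the same directory.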
Aligner & Variant Calling
- Remove trailing spaces from configuration values
- Add MAKE_OPTS configuration value that allows users to add Makefile options to the make calls that run the pipelines.
- Update gccalcstorage for better estimates and to have an option to print estimates from a starting size rather than from actual input files
Version 1.10 (Full Release on 8/22/2013)
Aligner
- Update gccalcstorage for better align estimates
Variant Calling
- Add additional comments to umake.pl
- Update vcf-summary to print the skipped counts
- Add option to specify the REF_FAI file used by the umake (gotcloud) script for determining CHRs and their lengths.
Aligner & Variant Calling
- Only print Configuration settings to a file if the file doesn't exist
Version 1.09a (Full Release on 8/08/2013)
Aligner
- Fix relative paths
- Upgrade to newest samtools (and add source)
- Update gcrunsummary.pl - summary stats for the run.
- Upgrade to newer Mosaik
Variant Calling
- Fix minNS filter for odd number of samples. It used to give a fraction and then would be ignored.
Aligner & Variant Calling
- Cleanup phonehome script
- Cleanup gotcloud script and add ability to run perf/audria for dev purposes.
Version 1.08 (Full Release on 7/31/2013)
Aligner
- No aligner-only changes
Variant Calling
- Add the ability to copy a glf to a different directory prior to running glfExtract or glfMultiples
- Remove chromosome Y from the default CHRS. Also allow CHRS to be set on the commandline via a comma separated list specified in --chrs
- Update glfMerge to skip glf files that only have a header.
- Change default FILTER_MAX_SAMPLE_DP to 1000 (from 20)
- Some SVM updates
- Added the vc option to gotcloud which uses the RUN_...settings to decide which steps to use.
Aligner & Variant Calling
- Fix bug in Conf.pm that caused a failure in some versions of perl
- Add the ability to set the GOTCLOUD_ROOT so you can test with an alternate align.pl/umake.pl script and still be able to access everything else from the standard gotcloud path.
- Cleanup the perldoc for align/snpcall
- Output all configuration settings into a file when running.
- Upgrade to most current libStatGen
- Compile as optimized
Version 1.07 (Full Release on 7/3/2013)
Aligner
- DEPRECATED configuration settings:
- 'BWA_MAX_MEM' is now 'SORT_MAX_MEM'
- 'VERIFY_BAM_ID_OPTIONS' is now 'verifyBamID_USER_PARAMS'
- ALN_TMP now defaults to $(TMP_DIR)/alignment.aln rather than $(TMP_DIR)/alignment.bwa
- Upgrade to latest QPLOT
- GC Content file has been renamed to have the extension: .winsize100.gc
- Automatically generates the bam index file if BAM_INDEX is specified
- Run DEDUP & RECAB as 1 step instead of 2
- Update dedup, recab, qplot, & verifyBamID steps to be specified via configuration
- Easier to insert steps between/before/after these
- Use PER_MERGE_STEPS to disable any of these steps (see gotcloudDefaults.conf for its default setting)
- RUN_QPLOT and RUN_VERIFY_BAM_ID are only used for validating executable/reference existence and will be deprecated completely soon
- Fixed bug where the merge failed if there was only 1 fastq pair
- Improve informational messages
- Update to BWA version 0.6.1-r104
- Bring in mergeBam updates from latest bamUtil
- ignore PG lines with duplicate ids
- General code cleanup
- Add some Mosaik support
- Added support to align.pl and a way to enable it, but the code doesn't compile
- Calculate approximate storage needed for GotCloud so user can have an idea what is coming
- Makefile now uses bash and pipefail to catch errors that occur within piped commands
- Removed the md5sum calculation
Variant Calling
- Update to always require REF
- this fixes bug that ldrefine was not checking REF or adding the optional prefix to it.
- SVM - fix bug on qual check in run_libsvm.pl
- Update defaults for filtering
- Fixed bug in libVcf/VcfFile that had FamID instead of FatID
- Fixed bug in samtools-hybrid that caused it to fail when checking for BAI files if bam was elsewhere in the filename
- Fix vcfPileup to accept .bam.bai or .bai in bam index filenames.
- Fix the split logic to work if a VCF file had no PASS records
Aligner & Variant Calling
- Add checks for required executables prior to running
- Limit the number of jobs that can run locally (there is a flag to override this)
- Extract configuration routines from the 2 .pl's to a common Conf.pm
- Add FLUX support
- 1st attempt at checking for new versions
- Doesn't quite always work yet, but shouldn't cause a problem
Version 1.06 (Full Release on 4/17/2013)
Variant Calling
- Update to always require REF
- this fixes bug that ldrefine was not checking REF or adding the optional prefix to it.
Version 1.05 (Full Release on 4/17/2013)
Aligner & Variant Calling
- Cleanup handling of BASE_PREFIX & added REF_PREFIX.
- Allows user to specify --base_prefix or --baseprefix on command-line
- Now used for index files & reference files in addition to fastqs (aligner) and bams (variant calling)
Version 1.04 (Full Release on 4/16/2013)
Aligner & Variant Calling
- Update relative paths to be relative to the current working directory
- Aligner effects:
- INDEX_FILE as specified in the aligner configuration
- fastq paths specified in the INDEX_FILE
- Variant Calling effects:
- BAM_INDEX as specified in the configuration
- bam paths specified in the BAM_INDEX
- Add getAbsPath() method for determining the absolute path with the additional capability of prepending an optional PREFIX (as specified in configuration) to the directory:
- BASE_PREFIX
- FASTQ_PREFIX (for aligner reading the fastq index file)
- renamed from FASTQ/FASTQ_REF
- BAM_PREFIX (for variant calling reading bam index file)
- Improve Error detection
- With --test option, check that the testdir exists before running the test
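The path handling described above (relative paths resolved against the current working directory, with an optional PREFIX such as BASE_PREFIX, FASTQ_PREFIX, or BAM_PREFIX prepended) might behave along the lines of this sketch. It is a Python stand-in, not the perl getAbsPath() itself. The argument names, the choice to leave already-absolute paths untouched, and the POSIX-style paths are all assumptions.

```python
import posixpath


def get_abs_path(path, prefix=None, cwd="/"):
    """Resolve path to an absolute path, optionally prepending a prefix."""
    # Assumption: an absolute path is returned unchanged.
    if posixpath.isabs(path):
        return posixpath.normpath(path)
    # Prepend the optional prefix (e.g. the value of BAM_PREFIX) if given.
    if prefix:
        path = posixpath.join(prefix, path)
    # Anything still relative is taken relative to the working directory.
    return posixpath.normpath(posixpath.join(cwd, path))


with_prefix = get_abs_path("bams/s1.bam", prefix="/data/project")
relative = get_abs_path("bams/s1.bam", cwd="/home/user/run")
absolute = get_abs_path("/archive/s1.bam", prefix="/data/project")
```

In a real run the cwd argument would come from the process working directory. It is passed explicitly here so the example stays deterministic.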
Cluster Support
- Update the mosix option to run mosbatch instead of mosrun
- Only attempt to "fix" the CWD for mosix/mosbatch
- Remove the warning if this "fix" fails
- This "fix" is specific for running at UM, but should not cause a failure when running elsewhere
Includes all updates from previous Internal Only Releases.
Version 1.03a6 (Internal Only Release on 4/10/2013)
- Cleanup the cluster support code
- Also add support for fixing the problem with UMich directories when using Mosix
- Update the default Reference directory to be as expected for UM
- Variant Calling changes:
- SVM
- Add option to merge all chromosome sites prior to running SVM (to better support targeted sequencing)
- Cleanup some of the Makefile dependencies to depend on files rather than phony targets (this prevents it from always rerunning those steps)
Version 1.03a5 (Internal Only Release on 4/5/2013)
- Add pre-checks for required files & reference files prior to running
- Add checks for deprecated configuration settings
- Merge aligner & variant calling default configurations into a single file (bin/gotcloudDefaults.conf)
- Aligner
- Update to put actual values into the Makefile recipes rather than using variables
- Variant Calling
- Fix vcf-summary to handle chromosomes that have string names (like X,Y)
Version 1.03a4 (Internal Only Release on 4/2/2013)
- Variant Calling:
- Update to by default run as local
- Target Loci file updates:
- When WRITE_TARGET_LOCI is set to true: only generate the .loci file if the specified bed is newer than the loci file
- When WRITE_TARGET_LOCI is set to ALWAYS, generate the .loci file regardless of the timestamps
- Only create the glf index file for a region if it does not exist or is older than the bam index file
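The WRITE_TARGET_LOCI rules above reduce to a small timestamp comparison, sketched here. The function name and the use of None for a missing .loci file are hypothetical, and regenerating when the .loci file is missing is an assumption, since the notes for this version only describe the newer-than check.

```python
def should_write_loci(setting, bed_mtime, loci_mtime):
    """Decide whether the .loci file should be (re)generated.

    setting    -- the WRITE_TARGET_LOCI value, e.g. "true" or "ALWAYS"
    bed_mtime  -- modification time of the target bed file
    loci_mtime -- modification time of the .loci file, or None if missing
    """
    mode = str(setting).strip().upper()
    if mode == "ALWAYS":
        # Regenerate regardless of timestamps.
        return True
    if mode == "TRUE":
        # Assumption: a missing .loci file is always generated.
        return loci_mtime is None or bed_mtime > loci_mtime
    return False


stale = should_write_loci("true", bed_mtime=200.0, loci_mtime=100.0)
fresh = should_write_loci("true", bed_mtime=100.0, loci_mtime=200.0)
forced = should_write_loci("ALWAYS", bed_mtime=100.0, loci_mtime=200.0)
```

In practice the two timestamps could be read with os.path.getmtime. They are passed in directly here so the decision logic can be checked without touching the file system.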
Version 1.03a3 (Internal Only Release on 3/29/2013)
- Attempted to fix a bug where batching was not running properly
- This version was not good (fixed in 1.03a4).
Version 1.03a2 (Internal Only Release on 3/27/2013)
- Add the qplot source code
Version 1.03a1 (Internal Only Release on 3/26/2013)
- Variant Calling
- Add FILTER_MIN_NS to add the option of filtering based on the number of samples
- Add FILTER_ADDITIONAL to add the option of adding additional filters.
Version 1.03a (Full Release on 3/22/2013)
- Cleanup README & INSTALL instructions
- Variant Calling
- Fix dependency bug/error in SVM
- Fix commands that run locally to check for pipe failures
- Improve file open error detection in SVM logic
- Add option to obtain the version number
Version 1.03 (Full Release on 3/15/2013)
- Add SVM Filtering
- there was a bug in this, please do not use this version.
- Version 1.03a fixes this bug.
Version 1.02 (Full Release on 3/13/2013)
- Cleanup cluster scripts
- Rename aligner to align.pl & umake to snp
- Add VerifyBamID source
- MANY Updates, please use a newer version.