Difference between revisions of "GotCloud: Versions"

From Genome Analysis Wiki
 
(21 intermediate revisions by the same user not shown)
Line 1: Line 1:
 +
__TOC__
 +
 +
For information on installing the releases, see: [[GotCloud#Install_GotCloud_Software|Install GotCloud Software]]
 +
 +
For information on issues/resolutions for specific versions, see: [[GotCloud:_FAQs#Version_Problems|FAQ: Version Problems]]
 +
 +
== Version 1.17 (Full Release on 5/14/2015) ==
 +
Source can be downloaded from: https://github.com/statgen/gotcloud/releases/tag/gotcloud.1.17
 +
 +
'''General'''
 +
* Add ability to run custom pipelines
 +
* Fix bug in libVcf/VcfFile.cpp
 +
* Fix some compatibility issues for CentOS5
 +
 +
'''Aligner'''
 +
* Add pipelines to run just recab & QC, and just QC.
 +
* VerifyBamID
 +
** Exclude ChrX & Y
 +
 +
'''SnpCall'''
 +
 +
'''Genotype Refinement'''
 +
 +
'''Indel'''
 +
 +
'''GenomeSTRiP'''
 +
 +
 +
== Version 1.16 (Full Release on 2/25/2015) ==
 +
Source can be downloaded from: https://github.com/statgen/gotcloud/releases/tag/gotcloud.1.16
 +
 +
'''General'''
 +
* Update the default REF to hs37d5.fa (build 37 with decoy) and the default DBSNP_VCF to dbsnp version 142.
 +
** You can download an updated reference at: [[GotCloud:_Genetic_Reference_and_Resource_Files#hs37d5-db142|hs37d5-db142]] (ftp://anonymous@share.sph.umich.edu/gotcloud/ref/hs37d5-db142-v1.tgz)
 +
* Upgrade perl scripts to use <code>/usr/bin/env perl</code> instead of <code>/usr/bin/perl</code> so they also work on systems where perl is installed elsewhere
 +
* Upgrade to latest versions of libStatGen and bamUtil (versions 1.0.13)
 +
** Fixes bug in calculating the MD5s for the fasta in polishBam
 +
 +
'''Aligner'''
 +
* Update default aligner to <code>bwa mem</code>
 +
** you can still use <code>bwa aln</code> (the previous default) by adding the following setting to your configuration file:
 +
**:<code>MAP_TYPE = BWA</code>
 +
* Upgrade to <code>bwa</code> version 0.7.12
 +
* No longer call <code>verifyBamID</code> with the <code>--verbose</code> option
 +
 +
 +
'''SnpCall'''
 +
 +
'''Genotype Refinement'''
 +
 +
'''Indel'''
 +
* Cleanup pipeline.pl to reduce errors in some versions of perl
 +
 +
'''GenomeSTRiP'''
 +
 +
== Version 1.15 (Full Release on 12/16/2014) ==
 +
 +
'''General'''
 +
* Rename BAM_INDEX to BAM_LIST
 +
* Change default REF_DIR
 +
* Add ref_dir and list as command-line options to all pipelines
 +
* Add bed-diff script to compare VCFs
 +
 +
'''Aligner'''
 +
* By default, create BAM_LIST
 +
* Use SAMPLE instead of MERGE_NAME if MERGE_NAME is not specified in FASTQ_LIST
 +
* No longer require fastqs to end in 'fastq.gz' or 'fastq'
 +
* Rename INDEX_FILE to FASTQ_LIST and infer all fields except FASTQ1, FASTQ2, and either SAMPLE or MERGE_NAME
 +
* Change --numcs to --numjobs and what was --numjobs to --threads
 +
* Update to latest BWA
 +
** Update aligner to pass \t instead of tabs for the RG field to the new version of BWA
 +
* By default, no longer store OQ
 +
 +
'''SnpCall'''
 +
* Add validation that:
 +
** Each BAM has only 1 sample
 +
** BAM's sampleID matches id in BAM_LIST
 +
*** Use --ignoreSMcheck to disable this validation
 +
* Updated Exome/Targeted settings
 +
** Set TARGET_DIR and OFFSET_OFF_TARGET (0) in defaults
 +
** Remove WRITE_TARGET_LOCI and base it on whether or not UNIFORM_TARGET_BED/MULTIPLE_TARGET_MAP are set and either the loci file doesn't exist, is older than the bed, or was created by a different bed
 +
* Add validation that tabix calls in perl scripts succeed
 +
* Fix some bugs in glfFlex & add region option
 +
* Cleanup logs so they no longer spew to the screen
 +
* Add ext-filt option for single sample filtering
 +
* Add .OK file after vcflist file successfully created
 +
 +
'''Genotype Refinement'''
 +
* Add validation that tabix calls in perl scripts succeed
 +
* Add .OK file after vcflist file successfully created
 +
 +
'''Indel'''
 +
* Update default region settings
 +
* Move output directories to an "indel" folder
 +
 +
'''GenomeSTRiP'''
 +
* Add a GenomeSTRiP pipeline
 +
 +
 +
== Version 1.14 (Full Release on 8/29/2014) ==
 +
 +
'''General'''
 +
* Add initial beagle4 support (as a new pipeline)
 +
* Improve input validation
 +
** Add chromosome name consistency checks to all tools
 +
* Upgrade version of bgzf
 +
* Upgrade libStatGen to fix mergeBam issue.
 +
 +
'''Aligner'''
 +
* Cleanup reading of fastq index/info file
 +
** ignore empty lines (generates a warning)
 +
** compress extra tabs/trim white space
 +
* Validate that BWA_QUAL and BWA_THREADS settings are properly formatted
 +
 +
'''SnpCall'''
 +
* Replace glfMultiples with glfFlex
 +
* Validate format of BAM_INDEX file
 +
* Add INDEL_VCF as an alternate for INDEL_PREFIX for input indel vcfs that aren't split by chromosome.
 +
 +
'''Genotype Refinement'''
 +
* Only run beagle/thunder with more than 1 sample
 +
 +
'''Indel'''
 +
* mergeBams for a single sample as its own step (didn't work before)
 +
* Fix bug where it would fail if the list of files was too long
 +
* Add input validation
 +
* Validate format of BAM_INDEX file
 +
 +
 +
== Version 1.13 (Full Release on 7/15/2014) ==
 +
''' General '''
 +
* Cleanup runcluster
 +
* Upgrade to bamUtil v1.0.12a
 +
* Upgrade to libStatGen v1.0.12
 +
* Update README to add build instructions & wiki references
 +
 +
'''Aligner'''
 +
* Increment to latest VerifyBamID
 +
 +
'''Variant Calling'''
 +
* Update glfMultiples to handle when first glf is empty
 +
* Add check for the output file before creating the .OK file
 +
* VcfPileup - improve return codes
 +
* Write jobfiles into a sub-directory
 +
* Added a snpcall monitoring utility
 +
* VcfSplit - update to only append .gz in the vcflist if there was at least one file
 +
* Write start/stop timestamps into a logfile (generated by runcluster)
 +
 +
'''Genotype Refinement'''
 +
* Update beagle2Vcf.pl to use 255 for missing PL/PL3 values
 +
* Update vcf2Beagle and beagle2Vcf to handle biallelic indels
 +
** Still doesn't handle any multiallelic variants
 +
* Added a ldrefine test.
 +
 +
'''Indel Calling'''
 +
* Initial version of Indel Caller
 +
** Still in the testing phase; if you use it, please provide feedback.
 +
 +
== Version 1.12 (Full Release on 1/17/2014) ==
 +
''' General '''
 +
* GotCloud now works when installed in a bin/ directory.
 +
* Add tabix source and build & bgzip build
 +
* Add some Copyright information
 +
* Fix printing of a failed run's return code
 +
* Upgrade to latest [[LibStatGen_Download#Official Releases|libStatGen]] & [[BamUtil#Release_of_just_BamUtil_.28does_not_include_libStatGen.29|bamUtil]].  See links for version details.
 +
** Slightly newer than 1.0.10 for both - versions on 1/17/2014.
 +
** dedup & recab now ignore Secondary reads
 +
** mergeBam ignores PI header field when merging
 +
** Add PhoneHome - gotCloud applies a PhoneHome thinning (BAMUTIL_THINNING) defaulted to 10 (10% of the time bamUtil does PhoneHome)
 +
* Upgrade QPLOT to ignore secondary reads
 +
* samtools
 +
** Update samtools index to return an error code if it fails to build the index
 +
 +
'''Aligner'''
 +
* Upgrade BWA
 +
** BWA_MEM is now an option
 +
* Write timestamps to Makefile log as steps start & complete
 +
* Remove tmp files as gotCloud goes, rather than at the end.
 +
* Deprecate RUN_QPLOT & RUN_VERIFY_BAM_ID
 +
** Now the steps to run are specified in configuration.
 +
* Mosaik
 +
** Upgrade to version from Oct 29, 2013
 +
** Add premo for pre-Mosaik processing
 +
 +
'''Variant Calling'''
 +
* Update to properly handle empty VCFs
 +
* Run make with -k option to run as much as possible after a failure.
 +
* Update to allow steps to be dependent on BAMs (BAM_DEPEND) so they will rerun if a BAM has a newer timestamp.
 +
* Input Validation
 +
** Check that BAMs exist & are not empty prior to running steps that require BAMs.
 +
** Check that filters min/maxDP are numbers, not fractions.
 +
* GlfMultiples
 +
** update to use DP instead of GD and fix PL description in format field header
 +
** add region option
 +
* samtools-hybrid
 +
** fail on missing BGZF EOF indicator
 +
 +
'''Genotype Refinement'''
 +
* Add a default number of states to Thunder
 +
 +
== Version 1.11 (Full Release on 9/6/2013) ==
 +
'''Aligner'''
 +
* Remove an extra space from the Makefile for the dedup command.
 +
* Brought in latest bwa source, but it is not yet being used.
 +
 +
'''Variant Calling'''
 +
* Rename OUT_PREFIX to MAKE_BASE_NAME to specify the base filename for snpcall, ldrefine (beagle & thunder), & vc Makefiles.  The typeOfRun.Makefile is appended to MAKE_BASE_NAME.
 +
** These Makefiles all used to have the same name and would overwrite each other
 +
** --makebasename/--make_basename/--make_base_name can be specified on the command-line
 +
** Default value for MAKE_BASE_NAME is umake
 +
*** snpcall is now: $(MAKE_BASE_NAME).snpcall.Makefile (default umake.snpcall.Makefile)
 +
*** ldrefine beagle step is now: $(MAKE_BASE_NAME).beagle.Makefile (default umake.beagle.Makefile)
 +
*** ldrefine thunder step is now: $(MAKE_BASE_NAME).thunder.Makefile (default umake.thunder.Makefile)
 +
*** vc is now: $(MAKE_BASE_NAME).vc.Makefile (default umake.vc.Makefile)
 +
* Added <code>gotcloud beagle</code> and <code>gotcloud thunder</code> commands so that beagle/thunder can be called independently rather than just through <code>ldrefine</code>.
 +
* Add command-line options to <code>gotcloud vc</code> for running just certain steps rather than having to set RUN...=true in the configuration
 +
** More than one --commandToRun can be specified at once
 +
** New command-line options:
 +
*** <code>--index</code>  (or <code>RUN_INDEX = true</code> in the configuration file)
 +
*** <code>--pileup</code>  (or <code>RUN_PILEUP = true</code> in the configuration file)
 +
*** <code>--glfMultiples</code>  (or <code>RUN_GLFMULTIPLES = true</code> in the configuration file)
 +
*** <code>--vcfPileup</code>  (or <code>RUN_VCFPILEUP = true</code> in the configuration file)
 +
*** <code>--filter</code>  (or <code>RUN_FILTER = true</code> in the configuration file)
 +
*** <code>--svm</code>  (or <code>RUN_SVM = true</code> in the configuration file)
 +
*** <code>--split</code>  (or <code>RUN_SPLIT = true</code> in the configuration file)
 +
* Cleaned up the snpcall Makefile entries for pileup.  It used to print targets/commands that were never executed.  These unused targets have now been removed
 +
 +
'''Aligner & Variant Calling'''
 +
* Remove trailing spaces from configuration values
 +
* Add MAKE_OPTS configuration value that allows users to add Makefile options to the make calls that run the pipelines.
 +
* Update gccalcstorage for better estimates and add an option to print estimates from a starting size rather than from the actual input files
 +
 +
== Version 1.10 (Full Release on 8/22/2013) ==
 +
'''Aligner'''
 +
* Update gccalcstorage for better align estimates
 +
 +
'''Variant Calling'''
 +
* Add additional comments to umake.pl
 +
* Update vcf-summary to print the skipped counts
 +
* Add option to specify the REF_FAI file used by the umake (gotcloud) script for determining CHRs and their lengths.
 +
 +
'''Aligner & Variant Calling'''
 +
* Only print Configuration settings to a file if the file doesn't exist
 +
 +
== Version 1.09a (Full Release on 8/08/2013) ==
 +
'''Aligner'''
 +
* Fix relative paths
 +
* Upgrade to newest samtools (and add source)
 +
* Update gcrunsummary.pl - summary stats for the run.
 +
* Upgrade to newer Mosaik
 +
 +
'''Variant Calling'''
 +
* Fix minNS filter for odd number of samples.  It used to give a fraction and then would be ignored.
 +
 +
'''Aligner & Variant Calling'''
 +
* Cleanup phonehome script
 +
* Cleanup gotcloud script and add ability to run perf/audria for dev purposes.
 +
 +
== Version 1.08 (Full Release on 7/31/2013) ==
 +
 +
'''Aligner'''
 +
* No aligner-only changes
 +
 +
'''Variant Calling'''
 +
* Add the ability to copy a glf to a different directory prior to running glfExtract or glfMultiples
 +
* Remove chromosome Y from the default CHRS.  Also allow CHRS to be set on the command line via a comma-separated list specified in --chrs
 +
* Update glfMerge to skip glf files that only have a header.
 +
* Change default FILTER_MAX_SAMPLE_DP to 1000 (from 20)
 +
* Some SVM updates
 +
* Added the vc option to gotcloud which uses the RUN_...settings to decide which steps to use.
 +
 +
'''Aligner & Variant Calling'''
 +
* Fix bug in Conf.pm that caused a failure in some versions of perl
 +
* Add the ability to set the GOTCLOUD_ROOT so you can test with an alternate align.pl/umake.pl script and still be able to access everything else from the standard gotcloud path.
 +
* Cleanup the perldoc for align/snpcall
 +
* Output all configuration settings into a file when running.
 +
* Upgrade to most current libStatGen
 +
* Compile as optimized
 +
 +
== Version 1.07 (Full Release on 7/3/2013) ==
 +
 +
'''Aligner'''
 +
* DEPRECATED configuration settings:
 +
** 'BWA_MAX_MEM' is now 'SORT_MAX_MEM'
 +
** 'VERIFY_BAM_ID_OPTIONS' is now 'verifyBamID_USER_PARAMS'
 +
* ALN_TMP now defaults to $(TMP_DIR)/alignment.aln rather than $(TMP_DIR)/alignment.bwa
 +
* Upgrade to latest QPLOT
 +
** GC Content file has been renamed to have the extension: .winsize100.gc
 +
* Automatically generates the bam index file if BAM_INDEX is specified
 +
* Run DEDUP & RECAB as 1 step instead of 2
 +
* Update dedup, recab, qplot, & verifyBamID steps to be specified via configuration
 +
** Easier to insert steps between/before/after these
 +
** Use PER_MERGE_STEPS to disable any of these steps (see gotcloudDefaults.conf for its default setting)
 +
*** RUN_QPLOT and RUN_VERIFY_BAM_ID are only used for validating executable/reference existence and will be deprecated completely soon
 +
* Fixed bug where the merge failed if there was only 1 fastq pair
 +
* Improve informational messages
 +
* Update to BWA version 0.6.1-r104
 +
* Bring in mergeBam updates from latest bamUtil
 +
** ignore PG lines with duplicate ids
 +
* General code cleanup
 +
* Add some Mosaik support
 +
** Added support to align.pl and a way to enable it, but the code doesn't compile
 +
* Calculate approximate storage needed for GotCloud so user can have an idea what is coming
 +
* Makefile now uses bash and pipefail to catch errors that occur within piped commands
 +
* Removed the md5sum calculation
 +
 +
'''Variant Calling'''
 +
* Update to always require REF
 +
** this fixes bug that ''ldrefine'' was not checking REF or adding the optional prefix to it.
 +
* SVM - fix bug on qual check in run_libsvm.pl
 +
* Update defaults for filtering
 +
* Fixed bug in libVcf/VcfFile that had FamID instead of FatID
 +
* Fixed bug in samtools-hybrid that caused it to fail when checking for BAI files if bam was elsewhere in the filename
 +
* Fix vcfPileup to accept .bam.bai or .bai in bam index filenames.
 +
* Fix the split logic to work if a VCF file had no PASS records
 +
 +
'''Aligner & Variant Calling'''
 +
* Add checks for required executables prior to running
 +
* Limit the number of jobs that can run locally (there is a flag to override this)
 +
* Extract configuration routines from the 2 .pl's to a common Conf.pm
 +
* Add FLUX support
 +
* 1st attempt at checking for new versions
 +
** Doesn't quite always work yet, but shouldn't cause a problem
 +
 +
== Version 1.06 (Full Release on 4/17/2013) ==
 +
 +
'''Variant Calling'''
 +
* Update to always require REF
 +
** this fixes bug that ''ldrefine'' was not checking REF or adding the optional prefix to it.
 +
 +
 
== Version 1.05 (Full Release on 4/17/2013) ==
''Aligner & SnpCalling'''
+
'''Aligner & Variant Calling'''
* Cleanup handling of BASE_PREFIX & added REF_PREFIX.
** Allows user to specify --base_prefix or --baseprefix on command-line
** Now used for index files & reference files in addition to fastqs (aligner) and bams (snpcalling)
+
** Now used for index files & reference files in addition to fastqs (aligner) and bams (variant calling)

== Version 1.04 (Full Release on 4/16/2013) ==
'''Aligner & UMAKE/snpcalling'''
+
'''Aligner & Variant Calling'''
* Update relative paths to be relative to the current working directory
** Aligner effects:
*** INDEX_FILE as specified in the aligner configuration
*** fastq paths specified in the INDEX_FILE
** UMAKE/snpcalling effects:
+
** Variant Calling effects:
** BAM_INDEX as specified in the configuration
** bam paths specified in the BAM_INDEX
Line 21: Line 352:
 
** FASTQ_PREFIX (for aligner reading the fastq index file)
*** renamed from FASTQ/FASTQ_REF
** BAM_PREFIX (for umake/snpcalling reading bam index file)
+
** BAM_PREFIX (for variant calling reading bam index file)
* Improve Error detection
** With --test option, check that the testdir exists before running the test
Line 37: Line 368:
 
** Also add support for fixing the problem with UMich directories when using Mosix
* Update the default Reference directory to be as expected for UM
* snpcalling changes:
+
* Variant Calling changes:
** SVM
*** Add option to merge all chromosome sites prior to running SVM (to better support targeted sequencing)
Line 45: Line 376:
 
* Add pre-checks for required files & reference files prior to running
* Add checks for deprecated configuration settings
* Merge aligner & snpcalling default configurations into a single file (bin/gotcloudDefaults.conf)
+
* Merge aligner & variant calling default configurations into a single file (bin/gotcloudDefaults.conf)
* Aligner
** Update to put actual values into the Makefile recipes rather than using variables
* SnpCalling
+
* Variant Calling
** Fix vcf-summary to handle chromosomes that have string names (like X,Y)

== Version 1.03a4 (Internal Only Release on 4/2/2013) ==
* snpCalling:
+
* Variant Calling:
** Update to by default run as local
** Target Loci file updates:
Line 67: Line 398:
  
 
== Version 1.03a1 (Internal Only Release on 3/26/2013) ==
* SnpCall
+
* Variant Calling
** Add FILTER_MIN_NS to add the option of filtering based on the number of samples
** Add FILTER_ADDITIONAL to add the option of adding additional filters.

== Version 1.03a (Full Release on 3/22/2913) ==
+
== Version 1.03a (Full Release on 3/22/2013) ==
* Cleanup README & INSTALL instructions
* SnpCall
+
* Variant Calling
** Fix dependency bug/error in SVM
** Fix commands that run locally to check for pipe failures

Latest revision as of 12:17, 14 May 2015

For information on installing the releases, see: Install GotCloud Software

For information on issues/resolutions for specific versions, see: FAQ: Version Problems

Version 1.17 (Full Release on 5/14/2015)

Source can be downloaded from: https://github.com/statgen/gotcloud/releases/tag/gotcloud.1.17

General

  • Add ability to run custom pipelines
  • Fix bug in libVcf/VcfFile.cpp
  • Fix some compatibility issues for CentOS5

Aligner

  • Add pipelines to run just recab & QC, and just QC.
  • VerifyBamID
    • Exclude ChrX & Y

SnpCall

Genotype Refinement

Indel

GenomeSTRiP


Version 1.16 (Full Release on 2/25/2015)

Source can be downloaded from: https://github.com/statgen/gotcloud/releases/tag/gotcloud.1.16

General

  • Update the default REF to hs37d5.fa (build 37 with decoy) and the default DBSNP_VCF to dbsnp version 142.
  • Upgrade perl scripts to use /usr/bin/env perl instead of /usr/bin/perl so they also work on systems where perl is installed elsewhere
  • Upgrade to latest versions of libStatGen and bamUtil (versions 1.0.13)
    • Fixes bug in calculating the MD5s for the fasta in polishBam

Aligner

  • Update default aligner to bwa mem
    • you can still use bwa aln (the previous default) by adding the following setting to your configuration file:
      MAP_TYPE = BWA
  • Upgrade to bwa version 0.7.12
  • No longer call verifyBamID with the --verbose option
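The MAP_TYPE override above is an ordinary "KEY = value" configuration line. As a rough illustration only, the sketch below reads such lines into a dictionary. It is not GotCloud's own configuration code (which is written in Perl); `parse_conf` is a hypothetical helper, and the rule of skipping any line without an "=" is an assumption made for the example.

```python
def parse_conf(text):
    """Parse 'KEY = value' lines into a dict (hypothetical helper, not GotCloud code)."""
    settings = {}
    for line in text.splitlines():
        line = line.strip()
        # Assumption: blank lines and lines without '=' carry no setting.
        if not line or "=" not in line:
            continue
        key, value = line.split("=", 1)
        # Strip surrounding whitespace, including trailing spaces after the value.
        settings[key.strip()] = value.strip()
    return settings


user_conf = "MAP_TYPE = BWA  \n"
print(parse_conf(user_conf))  # {'MAP_TYPE': 'BWA'}
```

Stripping the value also discards trailing spaces, which is the behaviour the 1.11 notes describe for configuration values.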


SnpCall

Genotype Refinement

Indel

  • Cleanup pipeline.pl to reduce errors in some versions of perl

GenomeSTRiP

Version 1.15 (Full Release on 12/16/2014)

General

  • Rename BAM_INDEX to BAM_LIST
  • Change default REF_DIR
  • Add ref_dir and list as command-line options to all pipelines
  • Add bed-diff script to compare VCFs
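Several configuration settings have been renamed across releases, so an older configuration file may still use the previous names. The sketch below collects the renames stated on this page (versions 1.07 and 1.15) and reports any old names it finds. `find_renamed_settings` is a hypothetical helper for illustration and is not part of GotCloud.

```python
# Old name -> new name, taken from the release notes on this page.
RENAMED_SETTINGS = {
    "BAM_INDEX": "BAM_LIST",                              # version 1.15
    "INDEX_FILE": "FASTQ_LIST",                           # version 1.15
    "BWA_MAX_MEM": "SORT_MAX_MEM",                        # version 1.07
    "VERIFY_BAM_ID_OPTIONS": "verifyBamID_USER_PARAMS",   # version 1.07
}


def find_renamed_settings(conf_keys):
    """Return (old, new) pairs for every renamed key present in conf_keys."""
    return [(key, RENAMED_SETTINGS[key]) for key in conf_keys if key in RENAMED_SETTINGS]


for old, new in find_renamed_settings(["REF_DIR", "BAM_INDEX"]):
    print(f"{old} has been renamed to {new}")
```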

Aligner

  • By default, create BAM_LIST
  • Use SAMPLE instead of MERGE_NAME if MERGE_NAME is not specified in FASTQ_LIST
  • No longer require fastqs to end in 'fastq.gz' or 'fastq'
  • Rename INDEX_FILE to FASTQ_LIST and infer all fields except FASTQ1, FASTQ2, and either SAMPLE or MERGE_NAME
  • Change --numcs to --numjobs and what was --numjobs to --threads
  • Update to latest BWA
    • Update aligner to pass \t instead of tabs for the RG field to the new version of BWA
  • By default, no longer store OQ
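Because --numjobs keeps its name but changes meaning in this release, scripts written for earlier versions need both options renamed in a single pass. The sketch below shows one way to do that. `migrate_args` is a hypothetical helper, and the example assumes each option is followed by a separate numeric argument, which this page does not confirm.

```python
# Old option name -> new option name (version 1.15 release notes).
OPTION_RENAMES = {"--numcs": "--numjobs", "--numjobs": "--threads"}


def migrate_args(args):
    """Rename each option exactly once.

    A single lookup per argument ensures that an old --numcs, once renamed
    to --numjobs, is not renamed again to --threads.
    """
    return [OPTION_RENAMES.get(arg, arg) for arg in args]


print(migrate_args(["--numcs", "2", "--numjobs", "4"]))
# ['--numjobs', '2', '--threads', '4']
```

Applying the two renames one after the other (for example with two search-and-replace passes) would turn every option into --threads, which is why the mapping is applied per argument.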

SnpCall

  • Add validation that:
    • Each BAM has only 1 sample
    • BAM's sampleID matches id in BAM_LIST
      • Use --ignoreSMcheck to disable this validation
  • Updated Exome/Targeted settings
    • Set TARGET_DIR and OFFSET_OFF_TARGET (0) in defaults
    • Remove WRITE_TARGET_LOCI and base it on whether or not UNIFORM_TARGET_BED/MULTIPLE_TARGET_MAP are set and either the loci file doesn't exist, is older than the bed, or was created by a different bed
  • Add validation that tabix calls in perl scripts succeed
  • Fix some bugs in glfFlex & add region option
  • Cleanup logs so they no longer spew to the screen
  • Add ext-filt option for single sample filtering
  • Add .OK file after vcflist file successfully created

Genotype Refinement

  • Add validation that tabix calls in perl scripts succeed
  • Add .OK file after vcflist file successfully created

Indel

  • Update default region settings
  • Move output directories to an "indel" folder

GenomeSTRiP

  • Add a GenomeSTRiP pipeline


Version 1.14 (Full Release on 8/29/2014)

General

  • Add initial beagle4 support (as a new pipeline)
  • Improve input validation
    • Add chromosome name consistency checks to all tools
  • Upgrade version of bgzf
  • Upgrade libStatGen to fix mergeBam issue.

Aligner

  • Cleanup reading of fastq index/info file
    • ignore empty lines (generates a warning)
    • compress extra tabs/trim white space
  • Validate that BWA_QUAL and BWA_THREADS settings are properly formatted

SnpCall

  • Replace glfMultiples with glfFlex
  • Validate format of BAM_INDEX file
  • Add INDEL_VCF as an alternate for INDEL_PREFIX for input indel vcfs that aren't split by chromosome.

Genotype Refinement

  • Only run beagle/thunder with more than 1 sample

Indel

  • mergeBams for a single sample as its own step (didn't work before)
  • Fix bug where it would fail if the list of files was too long
  • Add input validation
  • Validate format of BAM_INDEX file


Version 1.13 (Full Release on 7/15/2014)

General

  • Cleanup runcluster
  • Upgrade to bamUtil v1.0.12a
  • Upgrade to libStatGen v1.0.12
  • Update README to add build instructions & wiki references

Aligner

  • Increment to latest VerifyBamID

Variant Calling

  • Update glfMultiples to handle when first glf is empty
  • Add check for the output file before creating the .OK file
  • VcfPileup - improve return codes
  • Write jobfiles into a sub-directory
  • Added a snpcall monitoring utility
  • VcfSplit - update to only append .gz in the vcflist if there was at least one file
  • Write start/stop timestamps into a logfile (generated by runcluster)

Genotype Refinement

  • Update beagle2Vcf.pl to use 255 for missing PL/PL3 values
  • Update vcf2Beagle and beagle2Vcf to handle biallelic indels
    • Still doesn't handle any multiallelic variants
  • Added a ldrefine test.

Indel Calling

  • Initial version of Indel Caller
    • Still in the testing phase; if you use it, please provide feedback.

Version 1.12 (Full Release on 1/17/2014)

General

  • GotCloud now works when installed in a bin/ directory.
  • Add tabix source and build & bgzip build
  • Add some Copyright information
  • Fix printing of a failed run's return code
  • Upgrade to latest libStatGen & bamUtil. See links for version details.
    • Slightly newer than 1.0.10 for both - versions on 1/17/2014.
    • dedup & recab now ignore Secondary reads
    • mergeBam ignores PI header field when merging
    • Add PhoneHome - gotCloud applies a PhoneHome thinning (BAMUTIL_THINNING) defaulted to 10 (10% of the time bamUtil does PhoneHome)
  • Upgrade QPLOT to ignore secondary reads
  • samtools
    • Update samtools index to return an error code if it fails to build the index

Aligner

  • Upgrade BWA
    • BWA_MEM is now an option
  • Write timestamps to Makefile log as steps start & complete
  • Remove tmp files as gotCloud goes, rather than at the end.
  • Deprecate RUN_QPLOT & RUN_VERIFY_BAM_ID
    • Now the steps to run are specified in configuration.
  • Mosaik
    • Upgrade to version from Oct 29, 2013
    • Add premo for pre-Mosaik processing

Variant Calling

  • Update to properly handle empty VCFs
  • Run make with -k option to run as much as possible after a failure.
  • Update to allow steps to be dependent on BAMs (BAM_DEPEND) so they will rerun if a BAM has a newer timestamp.
  • Input Validation
    • Check that BAMs exist & are not empty prior to running steps that require BAMs.
    • Check that filters min/maxDP are numbers, not fractions.
  • GlfMultiples
    • update to use DP instead of GD and fix PL description in format field header
    • add region option
  • samtools-hybrid
    • fail on missing BGZF EOF indicator
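The BAM_DEPEND behaviour described above relies on the usual make rule that a target is rebuilt when one of its dependencies has a newer timestamp. The sketch below expresses that rule in Python for illustration. `needs_rerun` is a hypothetical function, not GotCloud code, and it models only the timestamp comparison, not how GotCloud writes its Makefiles.

```python
import os


def needs_rerun(target, dependencies):
    """Return True if target is missing or any dependency is newer than it."""
    if not os.path.exists(target):
        return True
    target_mtime = os.path.getmtime(target)
    # Same comparison make performs: a newer dependency invalidates the target.
    return any(os.path.getmtime(dep) > target_mtime for dep in dependencies)
```

With a step's output file as `target` and its input BAMs as `dependencies`, the function returns True once a BAM is regenerated after the output was written.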

Genotype Refinement

  • Add a default number of states to Thunder

Version 1.11 (Full Release on 9/6/2013)

Aligner

  • Remove an extra space from the Makefile for the dedup command.
  • Brought in latest bwa source, but it is not yet being used.

Variant Calling

  • Rename OUT_PREFIX to MAKE_BASE_NAME to specify the base filename for snpcall, ldrefine (beagle & thunder), & vc Makefiles. The typeOfRun.Makefile is appended to MAKE_BASE_NAME.
    • These Makefiles all used to have the same name and would overwrite each other
    • --makebasename/--make_basename/--make_base_name can be specified on the command-line
    • Default value for MAKE_BASE_NAME is umake
      • snpcall is now: $(MAKE_BASE_NAME).snpcall.Makefile (default umake.snpcall.Makefile)
      • ldrefine beagle step is now: $(MAKE_BASE_NAME).beagle.Makefile (default umake.beagle.Makefile)
      • ldrefine thunder step is now: $(MAKE_BASE_NAME).thunder.Makefile (default umake.thunder.Makefile)
      • vc is now: $(MAKE_BASE_NAME).vc.Makefile (default umake.vc.Makefile)
  • Added gotcloud beagle and gotcloud thunder commands so that beagle/thunder can be called independently rather than just through ldrefine.
  • Add command-line options to gotcloud vc for running just certain steps rather than having to set RUN...=true in the configuration
    • More than one --commandToRun can be specified at once
    • New command-line options:
      • --index (or RUN_INDEX = true in the configuration file)
      • --pileup (or RUN_PILEUP = true in the configuration file)
      • --glfMultiples (or RUN_GLFMULTIPLES = true in the configuration file)
      • --vcfPileup (or RUN_VCFPILEUP = true in the configuration file)
      • --filter (or RUN_FILTER = true in the configuration file)
      • --svm (or RUN_SVM = true in the configuration file)
      • --split (or RUN_SPLIT = true in the configuration file)
  • Cleaned up the snpcall Makefile entries for pileup. It used to print targets/commands that were never executed. These unused targets have now been removed
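The naming rules in this list are regular enough to write down as code. The sketch below is only an illustration of the patterns stated above: Makefile names are built as MAKE_BASE_NAME, the run type, then "Makefile", and each step option corresponds to a RUN_ setting with the option name in upper case. `makefile_name` and `config_setting_for` are hypothetical helper names, not GotCloud functions.

```python
# Step options listed in the 1.11 release notes.
STEP_OPTIONS = ["--index", "--pileup", "--glfMultiples", "--vcfPileup",
                "--filter", "--svm", "--split"]


def makefile_name(run_type, make_base_name="umake"):
    """Build the Makefile name, e.g. 'umake.snpcall.Makefile' by default."""
    return f"{make_base_name}.{run_type}.Makefile"


def config_setting_for(option):
    """Map a step option to its configuration setting, e.g. '--svm' -> 'RUN_SVM'."""
    return "RUN_" + option.lstrip("-").upper()


for run_type in ["snpcall", "beagle", "thunder", "vc"]:
    print(makefile_name(run_type))

for option in STEP_OPTIONS:
    print(option, "is equivalent to", config_setting_for(option), "= true")
```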

Aligner & Variant Calling

  • Remove trailing spaces from configuration values
  • Add MAKE_OPTS configuration value that allows users to add Makefile options to the make calls that run the pipelines.
  • Update gccalcstorage for better estimates and add an option to print estimates from a starting size rather than from the actual input files

Version 1.10 (Full Release on 8/22/2013)

Aligner

  • Update gccalcstorage for better align estimates

Variant Calling

  • Add additional comments to umake.pl
  • Update vcf-summary to print the skipped counts
  • Add option to specify the REF_FAI file used by the umake (gotcloud) script for determining CHRs and their lengths.

Aligner & Variant Calling

  • Only print Configuration settings to a file if the file doesn't exist

Version 1.09a (Full Release on 8/08/2013)

Aligner

  • Fix relative paths
  • Upgrade to newest samtools (and add source)
  • Update gcrunsummary.pl - summary stats for the run.
  • Upgrade to newer Mosaik

Variant Calling

  • Fix minNS filter for odd number of samples. It used to give a fraction and then would be ignored.

Aligner & Variant Calling

  • Cleanup phonehome script
  • Cleanup gotcloud script and add ability to run perf/audria for dev purposes.

Version 1.08 (Full Release on 7/31/2013)

Aligner

  • No aligner-only changes

Variant Calling

  • Add the ability to copy a glf to a different directory prior to running glfExtract or glfMultiples
  • Remove chromosome Y from the default CHRS. Also allow CHRS to be set on the command line via a comma-separated list specified in --chrs
  • Update glfMerge to skip glf files that only have a header.
  • Change default FILTER_MAX_SAMPLE_DP to 1000 (from 20)
  • Some SVM updates
  • Added the vc option to gotcloud which uses the RUN_...settings to decide which steps to use.

Aligner & Variant Calling

  • Fix bug in Conf.pm that caused a failure in some versions of perl
  • Add the ability to set the GOTCLOUD_ROOT so you can test with an alternate align.pl/umake.pl script and still be able to access everything else from the standard gotcloud path.
  • Cleanup the perldoc for align/snpcall
  • Output all configuration settings into a file when running.
  • Upgrade to most current libStatGen
  • Compile as optimized

Version 1.07 (Full Release on 7/3/2013)

Aligner

  • DEPRECATED configuration settings:
    • 'BWA_MAX_MEM' is now 'SORT_MAX_MEM'
    • 'VERIFY_BAM_ID_OPTIONS' is now 'verifyBamID_USER_PARAMS'
  • ALN_TMP now defaults to $(TMP_DIR)/alignment.aln rather than $(TMP_DIR)/alignment.bwa
  • Upgrade to latest QPLOT
    • GC Content file has been renamed to have the extension: .winsize100.gc
  • Automatically generates the bam index file if BAM_INDEX is specified
  • Run DEDUP & RECAB as 1 step instead of 2
  • Update dedup, recab, qplot, & verifyBamID steps to be specified via configuration
    • Easier to insert steps between/before/after these
    • Use PER_MERGE_STEPS to disable any of these steps (see gotcloudDefaults.conf for its default setting)
      • RUN_QPLOT and RUN_VERIFY_BAM_ID are only used for validating executable/reference existence and will be deprecated completely soon
  • Fixed bug where the merge failed if there was only 1 fastq pair
  • Improve informational messages
  • Update to BWA version 0.6.1-r104
  • Bring in mergeBam updates from latest bamUtil
    • ignore PG lines with duplicate ids
  • General code cleanup
  • Add some Mosaik support
    • Added support to align.pl and a way to enable it, but the code doesn't compile
  • Calculate approximate storage needed for GotCloud so user can have an idea what is coming
  • Makefile now uses bash and pipefail to catch errors that occur within piped commands
  • Removed the md5sum calculation
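
The pipefail change above matters because, by default, bash reports only the exit status of the last command in a pipeline. The following sketch shows the difference. It assumes bash is on the PATH, and the helper name run_pipeline is hypothetical; it does not reproduce how the GotCloud Makefile sets the option.

```python
import subprocess


def run_pipeline(command, pipefail):
    """Run a shell pipeline under bash and return its exit status."""
    prefix = "set -o pipefail; " if pipefail else ""
    result = subprocess.run(["bash", "-c", prefix + command])
    return result.returncode


# 'false' fails, but 'cat' (the last command in the pipe) succeeds.
pipeline = "false | cat"

status_without = run_pipeline(pipeline, pipefail=False)
status_with = run_pipeline(pipeline, pipefail=True)
print("without pipefail:", status_without)
print("with pipefail:", status_with)
```

Without the option the failure of the first command is hidden, so make would carry on to the next step; with it, the pipeline's status is non-zero and the step is reported as failed.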

Variant Calling

  • Update to always require REF
    • This fixes a bug where ldrefine was not checking REF or adding the optional prefix to it.
  • SVM - fix bug on qual check in run_libsvm.pl
  • Update defaults for filtering
  • Fixed bug in libVcf/VcfFile that had FamID instead of FatID
  • Fixed bug in samtools-hybrid that caused it to fail when checking for BAI files if bam was elsewhere in the filename
  • Fix vcfPileup to accept .bam.bai or .bai in bam index filenames.
  • Fix the split logic to work if a VCF file had no PASS records
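
The two BAM index fixes above both come down to how an index filename is derived from a BAM filename. A minimal sketch of one safe way to do it is shown here; the function names are hypothetical and the code is not taken from GotCloud. The point is to rewrite only the trailing ".bam" extension, so that "bam" elsewhere in the path is left alone, and to accept both naming styles.

```python
import os


def bai_candidates(bam_path):
    """Return the index filenames to look for next to a BAM file.

    Only the trailing ".bam" extension is rewritten, so a "bam" elsewhere
    in the path (for example a directory named "bams") is left untouched.
    """
    candidates = [bam_path + ".bai"]          # sample.bam.bai
    root, ext = os.path.splitext(bam_path)
    if ext == ".bam":
        candidates.append(root + ".bai")      # sample.bai
    return candidates


def find_bai(bam_path, exists=os.path.exists):
    """Return the first existing index file, or None if there is none."""
    for candidate in bai_candidates(bam_path):
        if exists(candidate):
            return candidate
    return None


print(bai_candidates("bams/sample.bam"))
```

The exists parameter is only there so the lookup can be exercised without touching the filesystem.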

Aligner & Variant Calling

  • Add checks for required executables prior to running
  • Limit the number of jobs that can run locally (there is a flag to override this)
  • Extract configuration routines from the 2 .pl's to a common Conf.pm
  • Add FLUX support
  • 1st attempt at checking for new versions
    • Doesn't quite always work yet, but shouldn't cause a problem

Version 1.06 (Full Release on 4/17/2013)

Variant Calling

  • Update to always require REF
    • This fixes a bug where ldrefine was not checking REF or adding the optional prefix to it.


Version 1.05 (Full Release on 4/17/2013)

Aligner & Variant Calling

  • Cleanup handling of BASE_PREFIX & added REF_PREFIX.
    • Allows user to specify --base_prefix or --baseprefix on command-line
    • Now used for index files & reference files in addition to fastqs (aligner) and bams (variant calling)


Version 1.04 (Full Release on 4/16/2013)

Aligner & Variant Calling

  • Update relative paths to be relative to the current working directory
    • Aligner effects:
      • INDEX_FILE as specified in the aligner configuration
      • fastq paths specified in the INDEX_FILE
    • Variant Calling effects:
      • BAM_INDEX as specified in the configuration
      • bam paths specified in the BAM_INDEX
  • Add getAbsPath() method for determining the absolute path with the additional capability of prepending an optional PREFIX (as specified in configuration) to the directory:
    • BASE_PREFIX
    • FASTQ_PREFIX (for aligner reading the fastq index file)
      • renamed from FASTQ/FASTQ_REF
    • BAM_PREFIX (for variant calling reading bam index file)
  • Improve error detection
    • With --test option, check that the testdir exists before running the test
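
The path handling described above (relative paths resolved against the current working directory, with an optional prefix) can be sketched as follows. This is an illustration, not the getAbsPath() implementation: the Python name get_abs_path, the cwd parameter, and the exact precedence rules are assumptions, and the example paths are POSIX-style.

```python
import os


def get_abs_path(path, prefix="", cwd=None):
    """Resolve path to an absolute path.

    Assumed rules for this sketch:
      * an already absolute path is returned unchanged;
      * otherwise the optional prefix is prepended;
      * anything still relative is taken relative to the working directory.
    """
    if os.path.isabs(path):
        return os.path.normpath(path)
    if prefix:
        path = os.path.join(prefix, path)
    base = cwd if cwd is not None else os.getcwd()
    return os.path.normpath(os.path.join(base, path))


# Hypothetical values standing in for FASTQ_PREFIX and an index-file entry.
print(get_abs_path("fastq/a_1.fastq.gz", prefix="/data/run1"))
print(get_abs_path("fastq/a_1.fastq.gz", cwd="/work"))
```

In this sketch a prefix such as BASE_PREFIX, FASTQ_PREFIX, or BAM_PREFIX would be passed as the prefix argument by the caller.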

Cluster Support

  • Update the mosix option to run mosbatch instead of mosrun
  • Only attempt to "fix" the CWD for mosix/mosbatch
    • Remove the warning if this "fix" fails
    • This "fix" is specific for running at UM, but should not cause a failure when running elsewhere

Includes all updates from previous Internal Only Releases.

Version 1.03a6 (Internal Only Release on 4/10/2013)

  • Cleanup the cluster support code
    • Also add support for fixing the problem with UMich directories when using Mosix
  • Update the default Reference directory to be as expected for UM
  • Variant Calling changes:
    • SVM
      • Add option to merge all chromosome sites prior to running SVM (to better support targeted sequencing)
    • Cleanup some of the Makefile dependencies to depend on files rather than phony targets (this prevents it from always rerunning those steps)

Version 1.03a5 (Internal Only Release on 4/5/2013)

  • Add pre-checks for required files & reference files prior to running
  • Add checks for deprecated configuration settings
  • Merge aligner & variant calling default configurations into a single file (bin/gotcloudDefaults.conf)
  • Aligner
    • Update to put actual values into the Makefile recipes rather than using variables
  • Variant Calling
    • Fix vcf-summary to handle chromosomes that have string names (like X,Y)
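
A check for deprecated configuration settings, as mentioned above, might look roughly like the sketch below. It is not GotCloud's code. The two old/new pairs are the renames listed under Version 1.07 and serve only as sample data; the function names, the '#' comment syntax, and the example configuration are assumptions.

```python
# Old name -> new name. These two pairs are the renames listed under
# Version 1.07; they are used here only as sample data.
DEPRECATED = {
    "BWA_MAX_MEM": "SORT_MAX_MEM",
    "VERIFY_BAM_ID_OPTIONS": "verifyBamID_USER_PARAMS",
}


def parse_conf(text):
    """Parse 'KEY = value' lines; '#' comments are an assumed syntax."""
    settings = {}
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()
        if "=" not in line:
            continue
        key, value = line.split("=", 1)
        settings[key.strip()] = value.strip()
    return settings


def find_deprecated(settings, deprecated=DEPRECATED):
    """Return one message per deprecated key found, in sorted key order."""
    return [
        f"'{old}' is deprecated; use '{new}' instead"
        for old, new in sorted(deprecated.items())
        if old in settings
    ]


example_conf = """
# hypothetical user configuration
BWA_MAX_MEM = 2000000000
OUT_DIR = out
"""

messages = find_deprecated(parse_conf(example_conf))
for message in messages:
    print(message)
```

Running such a check before any alignment or variant calling step starts lets the user fix the configuration instead of having an old setting silently ignored.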

Version 1.03a4 (Internal Only Release on 4/2/2013)

  • Variant Calling:
    • Update to run locally by default
    • Target Loci file updates:
      • When WRITE_TARGET_LOCI is set to true: only generate the .loci file if the specified bed is newer than the loci file
      • When WRITE_TARGET_LOCI is set to ALWAYS, generate the .loci file regardless of the timestamps
    • Only create the glf index file for a region if it does not exist or is older than the bam index file
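
The WRITE_TARGET_LOCI rules above amount to a timestamp comparison, sketched below. The function name and the handling of a missing loci file are assumptions; the "true" and "ALWAYS" behaviours follow the notes.

```python
import os
import tempfile


def should_write_loci(write_target_loci, bed_path, loci_path):
    """Decide whether the .loci file should be (re)generated.

    write_target_loci mimics the WRITE_TARGET_LOCI setting:
    "ALWAYS" regenerates unconditionally; "true" regenerates only when
    the BED file is newer than the existing loci file.
    """
    mode = str(write_target_loci).strip().lower()
    if mode == "always":
        return True
    if mode != "true":
        return False
    if not os.path.exists(loci_path):
        # Assumption: a missing loci file is always generated.
        return True
    return os.path.getmtime(bed_path) > os.path.getmtime(loci_path)


# Small demonstration with two empty files and explicit timestamps.
workdir = tempfile.mkdtemp()
bed = os.path.join(workdir, "target.bed")
loci = os.path.join(workdir, "target.loci")
for name in (bed, loci):
    open(name, "w").close()
os.utime(bed, (100, 100))    # BED file is older ...
os.utime(loci, (200, 200))   # ... than the loci file
print(should_write_loci("true", bed, loci))
print(should_write_loci("ALWAYS", bed, loci))
```

The glf index rule in the last bullet follows the same pattern, comparing the glf index against the bam index file instead.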

Version 1.03a3 (Internal Only Release on 3/29/2013)

  • Attempted to fix a bug where batching was not running properly
    • This version was not good (fixed in 1.03a4).

Version 1.03a2 (Internal Only Release on 3/27/2013)

  • Add the qplot source code

Version 1.03a1 (Internal Only Release on 3/26/2013)

  • Variant Calling
    • Add FILTER_MIN_NS to add the option of filtering based on the number of samples
    • Add FILTER_ADDITIONAL to add the option of adding additional filters.

Version 1.03a (Full Release on 3/22/2013)

  • Cleanup README & INSTALL instructions
  • Variant Calling
    • Fix dependency bug/error in SVM
    • Fix commands that run locally to check for pipe failures
    • Improve file open error detection in SVM logic
  • Add option to obtain the version number

Version 1.03 (Full Release on 3/15/2013)

  • Add SVM Filtering
    • There was a bug in this; please do not use this version.
    • Version 1.03a fixes this bug.

Version 1.02 (Full Release on 3/13/2013)

  • Cleanup cluster scripts
  • Rename aligner to align.pl & umake to snp
  • Add VerifyBamID source
  • Many updates; please use a newer version.