Line 1: |
Line 1: |
− | [[Category:Software Libraries]]
| |
| [[Category:C++]] | | [[Category:C++]] |
| [[Category:libStatGen]] | | [[Category:libStatGen]] |
| | | |
− | = DESCRIPTION = | + | = Description = |
| Open source, freely available (GPL license), easy to use C++ APIs | | Open source, freely available (GPL license), easy to use C++ APIs |
| * General Operation Classes including: | | * General Operation Classes including: |
Line 9: |
Line 8: |
| ** String processing | | ** String processing |
| ** Parameter Parsing | | ** Parameter Parsing |
− | * Statistical Genetic Specific Classes including: | + | * '''Statistical Genetic Specific Classes''' including: |
− | **Handling Common file formats – SAM/BAM, FASTQ, GLF | + | **Handling Common file formats – SAM/BAM, FASTQ, GLF, VCF (coming soon) |
| ***Accessors to get/set values | | ***Accessors to get/set values |
| ***Indexed access to BAM files | | ***Indexed access to BAM files |
Line 16: |
Line 15: |
| ***Cigar – interpretation and mapping between query and reference | | ***Cigar – interpretation and mapping between query and reference |
| ***Pileup – structured access to data by individual reference position | | ***Pileup – structured access to data by individual reference position |
| + | |
| + | The library can be used to build your own C++ programs. |
| + | |
| + | Currently the repository is recommended for Unix/Linux users with access to the GNU C++ compiler. |
| + | |
| + | |
| + | = Copyrights = |
| + | '''If you use this software, please e-mail me, Mary Kate Wing, at mktrost@umich.edu''' |
| + | |
| + | Here are links to the copyrights for our code and some of the utilities it uses: |
| + | *[https://github.com/statgen/libStatGen/blob/master/general/COPYING GNU GENERAL PUBLIC LICENSE] and [https://github.com/statgen/libStatGen/blob/master/general/LICENSE.txt Our Copyright Note] |
| + | *[https://github.com/statgen/libStatGen/blob/master/general/LICENSE.twister Copyright for MERSENNE TWISTER (used in Random.cpp)] |
| + | *[https://github.com/statgen/libStatGen/blob/master/samtools/COPYING Samtools Copyright (MIT License)] |
| + | Copies of these can be found in our library under libStatGen/copyrights/. |
| + | |
| + | = Join the libStatGen Mailing List = |
| + | |
| + | Please join the [http://groups.google.com/group/libStatGen libStatGen Google Group] to ask questions about, discuss, or comment on this library. |
| + | |
| + | |
| + | = Troubleshooting = |
| + | If you are having trouble compiling any of the versions, check [[libStatGen Troubleshooting]] for help. If that does not solve your problem, email me for support. |
| + | |
| + | |
| + | = Where to Find It = |
| + | |
| + | {{ToolGitRepo|repoName=libStatGen|libStatGen=true|libBaseName=libStatGen}} |
| + | |
| + | == Releases == |
| + | Released versions are documented at [[libStatGen Download]]. |
| + | |
| + | = What has changed = |
| + | The <code>pipeline</code> and <code>statgen</code> repositories have been deprecated, so please update to our new framework. |
| + | |
| + | <code>libStatGen</code> is the new git repository for our library code. |
| + | |
| + | There are now separate repositories for specific tools/groups of tools, allowing us to track everything separately so it is easier to follow changes that impact a specific tool or the library in general. |
| | | |
| | | |
| = Library Documentation = | | = Library Documentation = |
− | Latest Doxygen documentation: | + | Latest Doxygen documentation: |
− | [http://www.sph.umich.edu/csg/mktrost/doxygen/march22_2011/ 3/22/11 Library Documentation in Doxygen] | + | <!-- <a href="http://csg.sph.umich.edu//abecasis/GOLD/ --> |
| + | <!-- [http://www.sph.umich.edu/csg/mktrost/doxygen/current/ Current Library Documentation in Doxygen] --> |
| + | [http://csg.sph.umich.edu//mktrost/doxygen/current/ Current Library Documentation in Doxygen] |
| + | |
| + | Additional documentation: |
| + | * [[libStatGen: general]] - General classes for file processing and performing common tasks (used by most other libraries). |
| + | * [[libStatGen: BAM]] - Classes specific for reading/writing/analyzing SAM/BAM files. |
| + | * [[libStatGen: GLF]] - Classes specific for reading/writing/analyzing GLF files. |
| + | * [[libStatGen: FASTQ]] - Classes specific for reading/writing/analyzing FastQ files. |
| + | * [[libStatGen: ASP]] - Classes specific for reading/writing/analyzing ASP files. |
| + | * [[libStatGen: VCF]] - Classes specific for reading/writing/analyzing VCF files. |
| + | |
| + | = Using the Library = |
| + | == Dependencies == |
| + | * This software requires the following to be installed: |
| + | ** g++ |
| + | ** development version of zlib (zlib1g-dev on ubuntu) |
| + | * Compiles on Linux/Unix |
| + | |
| + | == Building the Library == |
| + | |
| + | Typing <code>make help</code> displays the build options: |
| + | <pre> |
| + | Makefile help |
| + | ------------- |
| + | Type... To... |
| + | make Compile opt |
| + | make help Display this help screen |
| + | make all Compile everything (opt, debug, & profile) |
| + | make opt Compile optimized |
| + | make debug Compile for debug |
| + | make profile Compile for profile |
| + | make clean Delete temporary files |
| + | make test Execute tests (if there are any) |
| + | </pre> |
| + | |
| + | Typing just <code>make</code> defaults to <code>make opt</code> (optimized). |
| + | |
| + | <code>make all</code> builds opt, debug, and profile. |
| + | |
| + | <code>opt</code> creates <code>libStatGen.a</code>, <code>debug</code> creates <code>libStatGen_debug.a</code>, and <code>profile</code> creates <code>libStatGen_profile.a</code>. |
| + | |
| + | These libraries are created in the top-level libStatGen directory. When building a tool, link against the one that matches the tool's build type (optimized, debug, or profile). |
| | | |
− | [http://www.sph.umich.edu/csg/mktrost/doxygen/version.0.1.2/ Version 0.1.2 Library Documentation in Doxygen]
| + | == Navigating the Library Subdirectories == |
| + | Under the main libStatGen repository, there are: |
| + | *bam - library code for operating on bam files. |
| + | *copyrights - copyrights for the library and any code included with it. |
| + | *fastq - library code for operating on fastq files. |
| + | *general - library code for general operations |
| + | *glf - library code for operating on glf files. |
| + | *include - after compiling, the library headers are linked here |
| + | *Makefiles - directory containing Makefiles that are used in the library and can be used for developing programs using the library |
| + | *samtools - library code used from samtools |
| | | |
− | Possibly out-of-date documentation:
| + | After Compiling: libStatGen.a, libStatGen_debug.a, libStatGen_profile.a are created at the top level. |
− | * [[StatGenLibrary: general]] - General classes for file processing and performing common tasks (used by most other libraries).
| |
− | * [[StatGenLibrary: BAM]] - Classes specific for reading/writing/analyzing SAM/BAM files.
| |
− | * [[StatGenLibrary: GLF]] - Classes specific for reading/writing/analyzing GLF files.
| |
− | * [[StatGenLibrary: FASTQ]] - Classes specific for reading/writing/analyzing FastQ files.
| |
| | | |
| + | === bam, fastq, general, glf, samtools === |
| + | Object files are placed in an obj directory under each subdirectory with debug & profile objects in obj/debug and obj/profile. |
| | | |
− | = Using the Library = | + | Most also have a test directory. Tests are executed by running <code>make test</code> |
− | To use the StatGen Library, first download and compile via [[StatGen Download | StatGen Download Instructions]] and [[StatGen Repository#Compile.2FBuild | StatGen Compile/Build Instructions]]
| + | |
| + | === Makefiles === |
| + | This directory contains base makefiles and makefile settings that are used by the library and by programs being written to use the library. |
| + | |
| + | == Using the Library in Your Own Program == |
| + | |
| + | === Starting from a Sample Program (Recommended) === |
| + | [https://github.com/statgen/SampleProgram https://github.com/statgen/SampleProgram] is a simple program demonstrating how to write a tool that uses libStatGen and can be used as a starting point for your tool. |
| + | |
| + | SampleProgram has 4 subdirectories: |
| + | * copyrights - contains the copyright information, add your own copyrights as necessary |
| + | * obj - this directory is where the object files are placed when the code is compiled (with a subdirectory for debug and profile objects) |
| + | * src - this is where your own program code goes |
| + | * test - this is where your test code goes. Test code can be set up to run with <code>make test</code> to ensure the program works properly. |
| + | |
| + | '''Using SampleProgram as a starting point for your tool:''' |
| + | # Copy SampleProgram into a directory with your program name (it is the starting point for your own program). |
| + | # Update ChangeLog, .gitignore, and README.txt as appropriate. |
| + | # Add any necessary copyrights to the copyrights directory. |
| + | #* No changes to Makefile should be necessary. |
| + | # Update Makefile.inc |
| + | ## Update the VERSION as necessary. |
| + | ## Replace all occurrences of <code>SAMPLE_PROGRAM</code> with an all caps name for your program. |
| + | ##* You can then use the <code>LIB_PATH_<your program name></code> environment variable to specify an alternate path to libStatGen specific for your program. In most cases you will not need to do this. |
| + | #* No other updates to Makefile.inc should be necessary. |
| + | # Add your program (cpp & h files) to the <code>src</code> directory. |
| + | # Update src/Makefile |
| + | ## Set EXE to your program executable (replacing sampleProgram) |
| + | ## Set TOOLBASE, SRCONLY, and HDRONLY as appropriate for specifying your program file names. |
| + | ## Set any of the other optional settings as specified in the sample makefile. |
| + | #* No other changes should be necessary to src/Makefile. |
| + | # Add your tests to the <code>test</code> directory. |
| + | # Update test/Makefile as appropriate for specifying how to compile/run your tests. |
| + | |
| + | |
| + | After compiling, a <code>bin</code> directory is created in the top-level directory, and your executable is placed there. If you build for <code>debug</code> and/or <code>profile</code>, subdirectories for those are created under <code>bin/</code> and <code>obj</code>. |
| + | |
| + | |
| + | === Working from Scratch === |
| + | When compiling your code, be sure to include the library header files found in libStatGen/include/ and link in the appropriate library (opt: libStatGen.a, debug: libStatGen_debug.a, or profile: libStatGen_profile.a). |
| + | |
| + | |
| + | === Starting from a Sample Set of Tools === |
| + | [https://github.com/statgen/SampleTools https://github.com/statgen/SampleTools] is a repository containing multiple programs within one directory structure. It demonstrates how to have subdirectories for each tool using libStatGen and can be used as a starting point for your set of tools. |
| + | |
| + | SampleTools has 3 subdirectories: |
| + | * copyrights - contains the copyright information, add your own copyrights as necessary |
| + | * SampleProgram1 - a dummy demo program to show the structure for having multiple programs |
| + | * SampleProgram2 - a second dummy demo program to show the structure for having multiple programs |
| | | |
| + | SampleProgram1 & SampleProgram2 have 2 subdirectories: |
| + | * src - this is where your own program code goes |
| + | * test - this is where your test code goes. Test code can be set up to run with <code>make test</code> to ensure the program works properly. |
| | | |
− | *Build: type make
| + | Upon compiling, an <code>obj</code> directory is created under <code>SampleProgram1</code> and <code>SampleProgram2</code> and a <code>bin</code> directory is created at the top level. If you build for <code>debug</code> and/or <code>profile</code>, subdirectories for those are created under <code>bin/</code> and <code>SampleProgram1(2)/obj</code>. |
− | **the library is built: statgen/lib/libStatGen.a
| |
− | *Use Makefile.include to build with the library.
| |
− | **See examples in statgen/src.
| |
| | | |
| | | |
− | = Creating programs =
| + | '''Using SampleTools as a starting point for your set of tools:''' |
− | If you are creating a program, you can start with Makefile.src found at statgen/src/.
| + | # Copy <code>SampleTools</code> into a directory with your toolset name (it is the starting point for your own set of tools). |
| + | # Update <code>ChangeLog</code>, <code>.gitignore</code>, and <code>README.txt</code> as appropriate. |
| + | # Add any necessary copyrights to the copyrights directory. |
| + | # Rename the <code>SampleProgram1</code> and <code>SampleProgram2</code> directories |
| + | # Create any additional directories as necessary. |
| + | #* Recursively copy the structure/Makefiles from <code>SampleProgram1</code>. |
| + | # Update <code>SUBDIRS</code> in <code>Makefile</code> as necessary. |
| + | # Update <code>Makefile.inc</code> |
| + | ## Update the <code>VERSION</code> as necessary. |
| + | ## Replace all occurrences of <code>SAMPLE_PROGRAM</code> with an all caps name for your toolset. |
| + | ##* You can then use the <code>LIB_PATH_<your toolset name></code> environment variable to specify an alternate path to libStatGen specific for your program. In most cases you will not need to do this. |
| + | #* No other updates to <code>Makefile.inc</code> should be necessary. |
| + | # For each program you want to add: |
| + | ## Move into the appropriate subdirectory. |
| + | ##* No change should be made to the program's <code>Makefile</code> |
| + | ## Add your program (cpp & h files) to the <code>src</code> subdirectory. |
| + | ## Update src/Makefile |
| + | ### Set EXE to your program executable (replacing sampleProgram) |
| + | ### Set TOOLBASE, SRCONLY, and HDRONLY as appropriate for specifying your program file names. |
| + | ### Set any of the other optional settings as specified in the sample makefile. |
| + | ##* No other changes should be necessary to src/Makefile. |
| + | ## Add your tests to the <code>test</code> directory. |
| + | ## Update test/Makefile as appropriate for specifying how to compile/run your tests. |
| | | |
− | (If you are creating a program in a directory outside of statgen/src, you may need additional modifications.)
| |
| | | |
− | '''Example: creating statgen/src/myprog/'''
| + | = How To Use the APIs = |
− | # cd statgen/src/myprog
| + | More coming soon, see: http://genome.sph.umich.edu/wiki/Sam_Library_Usage_Examples |
− | # ln -s ../Makefile.src Makefile
| |
− | # cp ../bam/Makefile.tool .
| |
− | # Update Makefile.tool for your specific program.
| |
− | #* You may need settings beyond those set in statgen/src/bam/Makefile.tool. See Makefile.src for what settings you can use.
| |
| | | |
| + | [[LibStatGen: ASP#API for Reading ASP Files| ASP APIs]] |
| | | |
− | == Recently Added Capabilities ==
| + | [[LibStatGen: VCF#API for Reading VCF Files| VCF APIs]] |
− | * [[SAM/BAM Convert Sequence|SAM/BAM support conversion between '=' and the base in a sequence]]
| |