BamUtil: trimBam

From Genome Analysis Wiki

Overview of the trimBam function of bamUtil

The trimBam option on the bamUtil executable trims the ends of reads in a SAM/BAM file, changing the trimmed bases to ‘N’ and their qualities to ‘!’.


Usage

./bam trimBam [inFile] [outFile] [num-bases-to-trim-on-each-side]

Version 1.0.6 and later:

Alternatively, the number of bases to trim from each side can be specified separately (either or both of -L/-R (--left/--right) can be given):

./bam trimBam [inFile] [outFile] -L [num-bases-to-trim-from-left] -R [num-bases-to-trim-from-right]

By default, reads on the reverse strand are reversed and then the left & right are trimmed.

This means that --left actually trims from the right of the read in the SAM/BAM for reverse reads.

Optionally --ignoreStrand/-i can be specified to ignore the strand information and treat forward/reverse the same.

trimBam will modify the sequences to 'N', and the quality string to '!'
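
To make the masking concrete, here is a minimal shell sketch of its effect on a single read. It is an illustration only, not bamUtil's implementation: the function name mask_read, its argument order, and the sample read are hypothetical. The sketch masks the whole read when the trim counts exceed the read length, which is an assumption rather than documented trimBam behavior.

```shell
# Hypothetical helper: mask the first LEFT and last RIGHT positions of one read.
# Usage: mask_read SEQ QUAL LEFT RIGHT
mask_read() {
    # Values are read from ARGV so that backslashes in QUAL are kept literally.
    awk 'BEGIN {
        seq = ARGV[1]; qual = ARGV[2]
        left = ARGV[3] + 0; right = ARGV[4] + 0
        n = length(seq)
        out_seq = ""; out_qual = ""
        for (i = 1; i <= n; i++) {
            if (i <= left || i > n - right) {
                # Trimmed position: base becomes N, quality becomes !
                out_seq = out_seq "N"
                out_qual = out_qual "!"
            } else {
                out_seq = out_seq substr(seq, i, 1)
                out_qual = out_qual substr(qual, i, 1)
            }
        }
        print out_seq, out_qual
    }' "$1" "$2" "$3" "$4"
}

# Mask 2 positions on each side of a made-up 8-base read.
mask_read ACGTACGT IIIIIIII 2 2
```

For this made-up read the sketch prints NNGTACNN !!IIII!!. The read length is unchanged; only the content of the trimmed positions is replaced.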


Parameters

    Required Parameters:
        inFile  : the SAM/BAM file to be read
        outFile : the SAM/BAM file to be written
        num-bases-to-trim-on-each-side : the number of bases/qualities to trim from each side
    Instead of num-bases-to-trim-on-each-side, -L/-R (or --left/--right) can be specified to indicate the number of bases to trim from the left/right (left/right are reversed for reverse strands)
    Optional Parameters:
        --ignoreStrand : ignore strand information - do not reverse left/right for reverse reads
	PhoneHome:
		--noPhoneHome       : disable PhoneHome (default enabled)
		--phoneHomeThinning : adjust the PhoneHome thinning parameter (default 50)

Required Parameters

Input File (1st argument)

The 1st argument is the name of the input SAM/BAM file.

The program automatically determines if your input file is SAM/BAM/uncompressed BAM without any input other than a filename from the user, unless your input file is stdin.

A - indicates that the input is read from stdin; the extension after the - determines the file type (no extension indicates SAM).

SAM/BAM/Uncompressed BAM from file : yourFileName
SAM from stdin                     : -
BAM from stdin                     : -.bam
Uncompressed BAM from stdin        : -.ubam


Note: Uncompressed BAM is compressed using compression level-0 (so it is not an entirely uncompressed file). This matches the samtools implementation so pipes between our tools and samtools are supported.

Output File (2nd argument)

The 2nd argument is the name of the output SAM/BAM file.

The file extension determines whether SAM, BAM, or uncompressed BAM is written. A - indicates stdout, with the extension after the - giving the file type (no extension means SAM).

SAM to file                 : yourFileName.sam
BAM to file                 : yourFileName.bam
Uncompressed BAM to file    : yourFileName.ubam
SAM to stdout               : -
BAM to stdout               : -.bam
Uncompressed BAM to stdout  : -.ubam


Note: Uncompressed BAM is compressed using compression level-0 (so it is not an entirely uncompressed file). This matches the samtools implementation so pipes between our tools and samtools are supported.
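
The naming rules for the output argument can be summarized with a small shell sketch. The function name output_type is hypothetical and the code only mirrors the rules listed above; it does not call bamUtil. Treating every file name that ends in neither .bam nor .ubam as SAM is an assumption, since the list only names the .sam extension explicitly.

```shell
# Hypothetical helper: report how an output argument would be interpreted.
# Usage: output_type NAME
output_type() {
    case "$1" in
        -)       echo "SAM to stdout" ;;
        -.bam)   echo "BAM to stdout" ;;
        -.ubam)  echo "Uncompressed BAM to stdout" ;;
        *.ubam)  echo "Uncompressed BAM to file" ;;
        *.bam)   echo "BAM to file" ;;
        # Assumption: anything else, including .sam, is written as SAM.
        *)       echo "SAM to file" ;;
    esac
}

output_type results/trimSam.sam
output_type -.ubam
```

The stdout patterns are tested first so that -.bam and -.ubam are not mistaken for file names.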

Optional Parameters

Number of Bases to Trim from Each End (3rd argument)

If the 3rd argument is a number (with no flag/option), it is the number of bases to trim from each end of the reads.

Trim Bases from the Left (--left or -L)

Use --left or -L followed by the number of bases to be trimmed from the left.

By default reverse strands are reversed and then the left is trimmed, meaning that --left actually trims from the right of the read in the SAM/BAM for reverse reads.

Use --ignoreStrand/-i to ignore the strand information and treat forward/reverse the same.

Trim Bases from the Right (--right or -R)

Use --right or -R followed by the number of bases to be trimmed from the right.

By default reverse strands are reversed and then the right is trimmed, meaning that --right actually trims from the left of the read in the SAM/BAM for reverse reads.

Use --ignoreStrand/-i to ignore the strand information and treat forward/reverse the same.

Ignore the Strand when Trimming (--ignoreStrand or -i)

Use --ignoreStrand or -i to ignore the strand information and treat forward/reverse the same. When --ignoreStrand or -i is set, reverse-strand reads are not reversed prior to trimming left/right.
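
The left/right swap for reverse reads can be sketched in shell as follows. This illustrates the rule described above and is not bamUtil's code: the function name effective_trim and its arguments are hypothetical, and the sketch assumes the strand is taken from the standard SAM FLAG bit 0x10 (read on the reverse strand).

```shell
# Hypothetical helper: print the trim counts as applied to the read as stored
# in the SAM/BAM record, in the order "left right".
# Usage: effective_trim FLAG LEFT RIGHT [IGNORE_STRAND]
effective_trim() {
    flag=$1
    left=$2
    right=$3
    ignore_strand=${4:-0}   # 1 mimics --ignoreStrand, 0 is the default
    if [ "$ignore_strand" -eq 0 ] && [ $(( flag & 16 )) -ne 0 ]; then
        # Reverse strand: left and right are swapped.
        echo "$right $left"
    else
        echo "$left $right"
    fi
}

effective_trim 0 1 2      # forward read
effective_trim 16 1 2     # reverse read, strand respected
effective_trim 16 1 2 1   # reverse read, strand ignored
```

With -L 1 -R 2, the three calls print 1 2, 2 1 and 1 2, which matches the per-strand counts that trimBam reports in its "Arguments in effect" output.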

Do not require BGZF EOF block (--noeof)

Use --noeof if you do not expect a trailing eof block in your bgzf file.

By default, the trailing empty block is expected and checked for.

PhoneHome Parameters

See PhoneHome for more information on how PhoneHome works and what it does.

Turn off PhoneHome (--noPhoneHome)

Use the --noPhoneHome option to completely disable PhoneHome. PhoneHome is enabled by default based on the thinning parameter.

Adjust the Frequency of PhoneHome (--phoneHomeThinning)

Use --phoneHomeThinning to modify the percentage of the time that PhoneHome will run (0-100).

  • By default, --phoneHomeThinning is set to 50, running 50% of the time.
  • PhoneHome will only occur if the run's random number modulo 100 is less than the --phoneHomeThinning value.
  • N/A if --noPhoneHome is set.
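
The thinning rule in the list above amounts to a single comparison. The shell sketch below restates it; the function name should_phone_home and the explicit random value argument are hypothetical, and how bamUtil actually draws its random number is not specified here.

```shell
# Hypothetical helper: succeed (exit status 0) when PhoneHome would run.
# Usage: should_phone_home RANDOM_VALUE [THINNING] [NO_PHONE_HOME]
should_phone_home() {
    random_value=$1
    thinning=${2:-50}        # default thinning value of 50
    no_phone_home=${3:-0}    # 1 mimics --noPhoneHome
    [ "$no_phone_home" -eq 0 ] && [ $(( random_value % 100 )) -lt "$thinning" ]
}

if should_phone_home 149 50; then
    echo "PhoneHome would run"
else
    echo "PhoneHome would be skipped"
fi
```

Under this rule a thinning value of 0 never runs PhoneHome and a value of 100 always does, unless --noPhoneHome is set.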

Return Value

Returns the SamStatus for the reads/writes. 0 on success, non-0 on failure.
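
Because the return value is 0 on success and non-0 on failure, a calling script can branch on the exit status. The wrapper below is a minimal sketch; the name run_and_report is hypothetical and it works with any command, not only bamUtil.

```shell
# Hypothetical helper: run a command and report whether it succeeded.
# Usage: run_and_report COMMAND [ARGS...]
run_and_report() {
    if "$@"; then
        echo "ok"
    else
        status=$?   # exit status of the command that just failed
        echo "failed with status $status"
        return "$status"
    fi
}

# Demonstration with a command that always succeeds.
run_and_report true
```

In a pipeline script this could be invoked as run_and_report ./bam trimBam in.bam out.bam 2, where in.bam and out.bam are placeholder file names.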

Examples

Trim the same number of bases from each side

Example Input, trimming 2 bases:

./bam trimBam testFiles/testSam.sam results/trimSam.sam 2

Example Output:

Arguments in effect: 
	Input file : testFiles/testSam.sam
	Output file : results/trimSam.sam
	#Bases to trim from each side : 2

Number of records read = 10
Number of records written = 10
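
Since trimming masks bases rather than dropping reads, the two record counts in this example are equal. A script can check that on captured log text with a sketch like the one below. The function name check_counts is hypothetical, and it assumes the log has already been captured to a file or pipe in the format shown above.

```shell
# Hypothetical helper: read trimBam log text on stdin and compare the counts.
check_counts() {
    awk -F' = ' '
        /^Number of records read/    { r = $2 }
        /^Number of records written/ { w = $2 }
        END {
            if (r != "" && r == w) {
                print "records match: " r
            } else {
                print "records mismatch"
                exit 1
            }
        }'
}

# Demonstration using the two summary lines from the example output.
printf 'Number of records read = 10\nNumber of records written = 10\n' | check_counts
```

The function exits with a non-zero status when the counts differ or when either line is missing, so it can be used directly in an if statement.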


Trim different bases from each side, but treat reverse strands the opposite

Example Input, trimming 1 base from the left and 2 bases from the right for forward strands, and the opposite for reverse strands:

./bam trimBam testFiles/testSam.sam results/trimSam.sam -L 1 -R 2

Example Output:

Arguments in effect: 
	Input file : testFiles/testSam.sam
	Output file : results/trimSam.sam
	#Bases to trim from the left of forward strands : 1
	#Bases to trim from the right of forward strands: 2
	#Bases to trim from the left of reverse strands : 2
	#Bases to trim from the right of reverse strands : 1

Number of records read = 10
Number of records written = 10


Trim different bases from each side, but treat forward & reverse the same

Example Input, trimming 1 base from the left and 2 bases from the right ignoring strand information:

./bam trimBam testFiles/testSam.sam results/trimSam.sam -L 1 -R 2 --ignoreStrand

Example Output:

Arguments in effect: 
	Input file : testFiles/testSam.sam
	Output file : results/trimSam.sam
	#Bases to trim from the left of forward strands : 1
	#Bases to trim from the right of forward strands: 2
	#Bases to trim from the left of reverse strands : 1
	#Bases to trim from the right of reverse strands : 2

Number of records read = 10
Number of records written = 10