BamUtil: writeRegion
Overview of the writeRegion function of bamUtil
The writeRegion option on the bamUtil executable uses an indexed BAM file to only write the alignments that:
- fall within the region specified
  - region as defined by refID or refName and start and/or end
  - region as defined in the bed file (overlapping, or fully within if --withinReg is specified)
- have a specific read name (if specified)
Usage
./bam writeRegion --in <inputFilename> --out <outputFilename> [--bamIndex <bamIndexFile>] [--refName <reference Name> | --refID <reference ID>] [--start <0-based start pos>] [--end <0-based end position>] [--bed <bed filename>] [--withinReg] [--readName <readName>] [--lshift] [--params] [--noeof]
Parameters
Required Parameters:
    --in : the BAM file to be read
    --out : the SAM/BAM file to write to
Optional Parameters for Specifying a Region:
    --bamIndex : the path/name of the bam index file (if not specified, uses the --in value + ".bai")
    --refName : the BAM reference Name to read. Either this or refID can be specified. Defaults to all references.
    --refID : the BAM reference ID to read. Either this or refName can be specified. Defaults to all references. Specify -1 for unmapped
    --start : inclusive 0-based start position. Defaults to -1: meaning from the start of the reference. Only applicable if refName/refID is set.
    --end : exclusive 0-based end position. Defaults to -1: meaning til the end of the reference. Only applicable if refName/refID is set.
    --bed : use the specified bed file for regions.
    --withinReg : only print reads fully enclosed within the region.
    --readName : only print reads with this read name.
Optional Parameters For Other Operations:
    --lshift : left shift indels when writing records
    --excludeFlags : Skip any records with any of the specified flags set (specify an integer representation of the flags)
    --requiredFlags : Only process records with all of the specified flags set (specify an integer representation of the flags)
    --params : print the parameter settings
    --noeof : do not expect an EOF block on a bam file.
PhoneHome:
    --noPhoneHome : disable PhoneHome (default enabled)
    --phoneHomeThinning : adjust the PhoneHome thinning parameter (default 50)
Required Parameters
Input File (--in)
Use --in followed by your file name to specify the SAM/BAM input file.
For a file on disk, the program automatically determines whether the input is SAM, BAM, or uncompressed BAM; you only supply the filename. The exception is stdin: use - to read from stdin, with an extension to indicate the file type (no extension indicates SAM).
SAM/BAM/Uncompressed BAM from file | --in yourFileName |
SAM from stdin | --in - |
BAM from stdin | --in -.bam |
Uncompressed BAM from stdin | --in -.ubam |
Note: Uncompressed BAM is compressed using compression level-0 (so it is not an entirely uncompressed file). This matches the samtools implementation so pipes between our tools and samtools are supported.
Output File (--out)
Use --out followed by your file name to specify the SAM/BAM output file.
The file extension is used to determine whether to write SAM/BAM/uncompressed BAM. A - is used to indicate stdout, with an extension to indicate the file type (no extension is SAM).
SAM to file | --out yourFileName.sam |
BAM to file | --out yourFileName.bam |
Uncompressed BAM to file | --out yourFileName.ubam |
SAM to stdout | --out - |
BAM to stdout | --out -.bam |
Uncompressed BAM to stdout | --out -.ubam |
Note: Uncompressed BAM is compressed using compression level-0 (so it is not an entirely uncompressed file). This matches the samtools implementation so pipes between our tools and samtools are supported.
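The mapping in the table above can be sketched as a small lookup. This is only an illustration of the documented naming convention, not bamUtil's own code: the function name output_format is hypothetical, and treating any unrecognized extension as SAM is an assumption of the sketch (the table only states that a bare - means SAM).

```python
def output_format(out_arg):
    """Return (destination, format) for a --out value, per the table above.

    Hypothetical helper for illustration only.
    """
    # "-" alone or "-.<ext>" means stdout; anything else is a file name.
    if out_arg == "-" or out_arg.startswith("-."):
        destination = "stdout"
    else:
        destination = "file"

    if out_arg.endswith(".ubam"):
        fmt = "uncompressed BAM"
    elif out_arg.endswith(".bam"):
        fmt = "BAM"
    else:
        # Assumption of this sketch: everything else is written as SAM.
        fmt = "SAM"
    return destination, fmt
```

For example, output_format("-.ubam") describes the value you would pass to --out when piping uncompressed BAM into another tool.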
Optional Region Specifying Parameters
Bam Index File (--bamIndex)
Use --bamIndex followed by your file name to specify the BAM index file to use for reading the BAM file.
If an index file is required but --bamIndex is not specified, the program uses the input file name + ".bai".
Read only a Specific Reference/Chromosome (--refName or --refID)
If you only want to read a specific reference (chromosome), specify either --refName followed by the reference name or --refID followed by the reference id.
If you want to read all references, don't specify either --refName or --refID.
The reference Name is the name specified in the RNAME field of the records in the SAM file or in the name fields of the reference information section of the BAM file.
The reference ID is the value specified in the refID field of the records in the BAM file.
If you want to read only unmapped reads, use --refID -1.
Read only a Specific Region of a Chromosome (--start and --end)
You can only specify a region if you also specify a reference/chromosome using --refName or --refID.
Use --start to specify the inclusive 0-based start position of the region you want to read. Specify --start -1 (the default) to start at the beginning of the specified chromosome.
Use --end to specify the exclusive 0-based end position of the region you want to read. Specify --end -1 (the default) to read to the end of the specified chromosome.
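Because --start is inclusive and --end is exclusive, both 0-based, a region written in the common 1-based inclusive style needs a small adjustment before it is passed to writeRegion. A minimal sketch of that conversion follows; the function name to_start_end is hypothetical.

```python
def to_start_end(first_base, last_base):
    """Convert a 1-based inclusive range to (--start, --end) values.

    --start is 0-based inclusive, --end is 0-based exclusive.
    """
    if first_base < 1 or last_base < first_base:
        raise ValueError("expected 1 <= first_base <= last_base")
    start = first_base - 1  # shift down by one to make the start 0-based
    end = last_base         # 1-based inclusive end equals 0-based exclusive end
    return start, end
```

With this convention, end - start is the number of bases in the region: bases 100 through 200 (101 bases) become --start 99 --end 200.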
Bed File with Regions to Write (--bed)
If --bed followed by a filename is specified, the regions specified in the bed file will be written.
It is assumed that the regions in the bed file are sorted.
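Since the regions are assumed to be sorted, it can be worth checking a bed file before passing it to --bed. The sketch below reads the first three BED columns (chromosome, 0-based start, exclusive end) and checks that start positions do not decrease within a chromosome. Both function names are hypothetical, and this definition of "sorted" is an assumption: the text above does not spell out the exact ordering writeRegion expects, and the sketch does not check chromosome order.

```python
def parse_bed(text):
    """Parse BED text into a list of (chrom, start, end) tuples."""
    regions = []
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines, comments, and track/browser header lines.
        if not line or line.startswith(("#", "track", "browser")):
            continue
        fields = line.split()
        regions.append((fields[0], int(fields[1]), int(fields[2])))
    return regions


def starts_are_sorted(regions):
    """True if consecutive regions on the same chromosome have
    non-decreasing start positions (assumed meaning of 'sorted')."""
    for (prev_chrom, prev_start, _), (chrom, start, _) in zip(regions, regions[1:]):
        if chrom == prev_chrom and start < prev_start:
            return False
    return True
```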
Only Write Reads Fully within the Specified Region (--withinReg)
By default, reads that overlap the specified region are written. If instead you only want to write reads that are fully within the specified regions, use the --withinReg option.
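The difference between the default overlap behavior and --withinReg can be sketched with half-open intervals. This is an illustration under stated assumptions, not bamUtil's implementation: the function names are hypothetical, the read is represented by a 0-based inclusive start and exclusive end on the reference, the region bounds are explicit (no -1 defaults), and how bamUtil itself computes a read's end or treats boundary cases is not specified in this document.

```python
def overlaps(read_start, read_end, region_start, region_end):
    """True if the half-open read interval shares at least one base
    with the half-open region interval."""
    return read_start < region_end and read_end > region_start


def fully_within(read_start, read_end, region_start, region_end):
    """True if every base of the read lies inside the region."""
    return read_start >= region_start and read_end <= region_end


def is_written(read_start, read_end, region_start, region_end, within_reg=False):
    """Select the test according to whether --withinReg is set."""
    test = fully_within if within_reg else overlaps
    return test(read_start, read_end, region_start, region_end)
```

A read covering positions 95 to 104 against the region --start 100 --end 200 overlaps the region, so it passes the default test, but it starts before position 100, so it fails the within-region test.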
Only Print Reads with a Specified Read Name (--readName)
If you only want to print reads with a specific read name, use the --readName option followed by the read name.
Optional Parameters For Other Operations
Left Shift Indels in the CIGAR (--lshift)
Left shift indels as far as they can go in the read.
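To illustrate what left shifting means, the sketch below moves a deletion leftwards along a plain sequence for as long as the resulting sequence stays the same. This is the generic left-normalization idea only; it is not bamUtil's implementation, which works on the record's CIGAR. The function name and the leftmost bound (standing in for the start of the read) are hypothetical.

```python
def left_shift_deletion(seq, pos, length, leftmost=0):
    """Return the leftmost equivalent start of a deletion.

    The deletion removes seq[pos:pos + length]. Moving it one base to
    the left gives the same result when the base entering the deleted
    window (seq[pos - 1]) equals the base leaving it
    (seq[pos + length - 1]).
    """
    while pos > leftmost and seq[pos - 1] == seq[pos + length - 1]:
        pos -= 1
    return pos
```

In the repeat GCACACAT, deleting the CA at index 5 or the CA at index 1 produces the same sequence GCACAT, so the deletion can be reported at index 1.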
Skip Records with any of the Specified Flags (--excludeFlags)
Use --excludeFlags followed by the flags (as one integer) to skip any records that have any of the specified flags set.
This parameter was added in version 1.0.10.
Only Process Records with All of the Specified Flags (--requiredFlags)
Use --requiredFlags followed by the flags (as one integer) to only process records with all of the specified flags set.
This parameter was added in version 1.0.10.
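Both flag options take a single integer that combines SAM flag bits. The sketch below shows how such an integer can be built and how the "any of" and "all of" tests differ. The flag values are the standard SAM bits; the function name and the way the two tests are combined are illustrative assumptions, not bamUtil's code.

```python
# Standard SAM flag bits used in this example.
PAIRED = 0x1
PROPER_PAIR = 0x2
UNMAPPED = 0x4
DUPLICATE = 0x400


def passes_flag_filters(flag, exclude_flags=0, required_flags=0):
    """Apply --excludeFlags and --requiredFlags style tests to a record flag."""
    # Skip the record if ANY excluded bit is set.
    if flag & exclude_flags:
        return False
    # Keep the record only if ALL required bits are set.
    return (flag & required_flags) == required_flags


# Integer to pass to --excludeFlags to skip unmapped and duplicate reads.
EXCLUDE = UNMAPPED | DUPLICATE      # 1028
# Integer to pass to --requiredFlags to keep only properly paired reads.
REQUIRED = PAIRED | PROPER_PAIR     # 3
```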
Print the Program Parameters (--params)
Use --params to print the parameters for your program to stderr.
Do not require BGZF EOF block (--noeof)
Use --noeof if you do not expect a trailing EOF block in your BGZF file.
By default, the trailing empty block is expected and checked for.
PhoneHome Parameters
See PhoneHome for more information on how PhoneHome works and what it does.
Turn off PhoneHome (--noPhoneHome)
Use the --noPhoneHome option to completely disable PhoneHome. PhoneHome is enabled by default based on the thinning parameter.
Adjust the Frequency of PhoneHome (--phoneHomeThinning)
Use --phoneHomeThinning to modify the percentage of the time that PhoneHome will run (0-100).
- By default, --phoneHomeThinning is set to 50, running 50% of the time.
- PhoneHome will only occur if the run's random number modulo 100 is less than the --phoneHomeThinning value.
- N/A if --noPhoneHome is set.
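The thinning rule in the list above can be sketched in a few lines. The function name is hypothetical and the random number is passed in as an argument; how bamUtil actually draws that number is not described here.

```python
def phone_home_runs(random_number, thinning=50, no_phone_home=False):
    """Decide whether PhoneHome would run for this invocation.

    Follows the stated rule: run only if the random number modulo 100
    is less than the thinning value, and never if disabled.
    """
    if no_phone_home:
        return False
    return random_number % 100 < thinning
```

With the default of 50, exactly the remainders 0 through 49 trigger PhoneHome, which is half of the 100 possible remainders; a value of 0 never triggers it and a value of 100 always does.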
Return Value
- 0: all records are successfully read and written.
- non-0: at least one record was not successfully read or written.
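When writeRegion is driven from a script, the return value is the signal to check. The sketch below assembles an argument list from the options documented on this page and treats any non-zero exit status as a failure. The helper names, the default ./bam path, and the choice to raise RuntimeError are assumptions of this example.

```python
import subprocess


def build_write_region_args(in_file, out_file, ref_name=None, start=None,
                            end=None, bed=None, within_reg=False,
                            read_name=None, bam="./bam"):
    """Assemble a writeRegion command line from documented options."""
    args = [bam, "writeRegion", "--in", in_file, "--out", out_file]
    if ref_name is not None:
        args += ["--refName", ref_name]
    if start is not None:
        args += ["--start", str(start)]
    if end is not None:
        args += ["--end", str(end)]
    if bed is not None:
        args += ["--bed", bed]
    if within_reg:
        args.append("--withinReg")
    if read_name is not None:
        args += ["--readName", read_name]
    return args


def run_write_region(args):
    """Run the command; a non-zero return value means at least one
    record was not successfully read or written."""
    result = subprocess.run(args)
    if result.returncode != 0:
        raise RuntimeError("writeRegion failed with status %d" % result.returncode)
```

Passing the arguments as a list avoids shell quoting issues with file names and read names.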
Example Output
Wrote t.sam with 2 records.