SplitRef

From Genome Analysis Wiki
Revision as of 11:34, 2 February 2017 by Ppwhite

This page documents the splitRef program, which splits a reference haplotype file into smaller files with subsets of markers.

Input Files

Required Input Files

Haplotype file (.hap)

File passed to the -hap option. Each line holds one haplotype; the last field contains the alleles as a single string, with no separators between alleles.

Marker list (.snps) file

File passed to the -snps option. One line per marker, containing the marker name only.
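The two required files can be cross-checked before splitting. The sketch below is not part of splitRef; it assumes that fields in the haplotype file are whitespace-separated and that the allele string should contain exactly one character per marker listed in the .snps file. The function name check_hap_against_snps is hypothetical.

```python
def check_hap_against_snps(hap_lines, snp_lines):
    """Compare allele-string lengths in a haplotype file with the marker count.

    Assumptions (not stated by the splitRef documentation): fields are
    whitespace-separated, and each allele string has one character per marker.
    Returns the marker names and a list of (line_number, allele_count) pairs
    for haplotype lines whose allele count differs from the marker count.
    """
    markers = [line.strip() for line in snp_lines if line.strip()]
    mismatches = []
    for line_number, line in enumerate(hap_lines, start=1):
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        alleles = fields[-1]  # last field holds the alleles, no separators
        if len(alleles) != len(markers):
            mismatches.append((line_number, len(alleles)))
    return markers, mismatches
```

Both arguments are any iterables of text lines, so a gzipped haplotype file could be passed as gzip.open(path, "rt") from the Python standard library.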

Optional Input Files

Map file

File passed to the -map option, containing the chromosome, marker name, and marker coordinate (in base pairs) for each marker. Markers should be listed in the same order as in the marker list (.snps) file.
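The ordering requirement can be verified with a short script. This is a minimal sketch, assuming the map file is whitespace-delimited with columns in the order given above (chromosome, marker name, coordinate); the delimiter and the function name map_matches_snps are assumptions, not part of splitRef.

```python
def map_matches_snps(map_lines, snp_lines):
    """Return True if the map file lists markers in the same order as the .snps file.

    Assumes whitespace-delimited map lines: chromosome, marker name, position.
    """
    # Second column of each non-blank map line is the marker name.
    map_names = [line.split()[1] for line in map_lines if line.strip()]
    snp_names = [line.strip() for line in snp_lines if line.strip()]
    return map_names == snp_names
```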

Options

Required options

window size

Window size can be specified by one of the following three options: (1) -nWindows (2) -windowSize and (3) -windowLength.
-nWindows specifies the number of windows to split into and the program splits markers evenly into output windows.
-windowSize specifies the number of markers in one output window. The remainder goes to the last window.
-windowLength specifies the length (in base pairs) of one output window. The remainder goes to the last window. Note that this option is only allowed when a map input file is specified.
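The three ways of sizing windows can be illustrated with a small sketch. It is not splitRef's implementation, and several details are assumptions: windows are half-open index ranges (start, end) over the marker list; for an even split, any extra markers go to the earliest windows; for -windowSize the remainder forms a shorter final window; and for -windowLength each window begins at its first marker's coordinate. All function names are hypothetical.

```python
from bisect import bisect_left


def windows_by_count(n_markers, n_windows):
    """Split n_markers into n_windows nearly equal windows (cf. -nWindows)."""
    base, extra = divmod(n_markers, n_windows)
    windows, start = [], 0
    for i in range(n_windows):
        size = base + (1 if i < extra else 0)  # assumption: extras go first
        windows.append((start, start + size))
        start += size
    return windows


def windows_by_size(n_markers, window_size):
    """Windows of window_size markers; the remainder is last (cf. -windowSize)."""
    return [(s, min(s + window_size, n_markers))
            for s in range(0, n_markers, window_size)]


def windows_by_length(positions, window_length):
    """Windows spanning window_length base pairs (cf. -windowLength).

    positions must be sorted base-pair coordinates, e.g. from the map file.
    """
    windows, start = [], 0
    while start < len(positions):
        # First marker at or beyond the end of the current window.
        end = bisect_left(positions, positions[start] + window_length, lo=start)
        windows.append((start, end))
        start = end
    return windows
```

For example, windows_by_size(10, 4) yields three windows, the last of which holds the two leftover markers.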

flanking region

Size of flanking region on each side can be specified by one of the following two options: (1) -overlapSize and (2) -overlapLength.
-overlapSize specifies the number of markers in each flanking region (so that the total number of flanking markers for each window is twice the number specified except for the first and last window).
-overlapLength specifies the length (in base pairs) of each flanking region (so that the total length of the flanking regions is twice the number specified except for the first and last window).
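Flanking regions extend each core window on both sides, except where a window touches the start or end of the marker list. The following sketch shows one way this could work, assuming windows are half-open index ranges (start, end) into the marker list; the function names are hypothetical and the code is not taken from splitRef.

```python
from bisect import bisect_left, bisect_right


def add_flanks_by_size(windows, n_markers, overlap_size):
    """Extend each window by overlap_size markers per side (cf. -overlapSize)."""
    # Clamping at 0 and n_markers leaves the first and last windows with a
    # flank on one side only.
    return [(max(0, start - overlap_size), min(n_markers, end + overlap_size))
            for start, end in windows]


def add_flanks_by_length(windows, positions, overlap_length):
    """Extend each window by overlap_length base pairs per side (cf. -overlapLength).

    positions must be sorted base-pair coordinates, one per marker.
    """
    flanked = []
    for start, end in windows:
        low = positions[start] - overlap_length      # leftmost coordinate kept
        high = positions[end - 1] + overlap_length   # rightmost coordinate kept
        flanked.append((bisect_left(positions, low),
                        bisect_right(positions, high)))
    return flanked
```

With core windows [(0, 4), (4, 8), (8, 10)] over 10 markers and an overlap of 1 marker, the middle window gains two flanking markers in total, while the first and last gain one each.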

Output prefix

Specified by the -o option.

Additional options

Estimate window size only

This is controlled by the -extimateWindowOnly option. By default, splitting is performed.
To preview how the markers would be allocated to output windows without performing the split, use "-extimateWindowOnly 1".

Example Commands

 splitRef.pl -hap example.hap.gz -snps example.snps -map example.map -windowLength 10000000 -overlapLength 1000000 
 splitRef.pl -hap example.hap.gz -snps example.snps -windowSize 10000 -overlapSize 1000 
 splitRef.pl -hap example.hap.gz -snps example.snps -nWindows 12 -overlapSize 1000 

Download

You can download splitRef from the splitRef Download Page.

Questions and Comments?

Email Yun Li.