SplitPed
This page documents the splitPed program, which splits a pedigree file into smaller files with subsets of markers.
Input Files
Required Input Files
Pedigree file (.ped)
The file passed to the -ped option, in Merlin pedigree format. For details of the Merlin file format, see the Merlin tutorial [1].
Within each file, markers should be ordered by chromosome position. Alleles should be given on the forward strand and can be encoded as 'A', 'C', 'G' or 'T' (there is no need to use numeric identifiers for each allele).
Marker information (.dat) file
The file passed to the -dat option, in Merlin marker information format. For details of the Merlin file format, see the Merlin tutorial [2].
Optional Input Files
Map file
The file passed to the -map option, listing the chromosome, marker name, and marker coordinate (in base pairs) for each marker. Markers should appear in the same order as in the marker information file.
Options
Required options
window size
Window size can be specified by one of the following three options: (1) -nWindows (2) -windowSize and (3) -windowLength.
-nWindows specifies the number of output windows; the program divides the markers evenly among them.
-windowSize specifies the number of markers in each output window. The remainder goes to the last window.
-windowLength specifies the length (in base pairs) of each output window. The remainder goes to the last window. Note that this option is allowed only when a map file is specified.
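The marker-count options above can be illustrated with a minimal Python sketch. This is not the splitPed implementation: the function names are hypothetical, and two details are assumptions, namely that -nWindows hands any extra markers to the earliest windows and that the -windowSize remainder forms a shorter final window (the program may instead append it to the preceding window).

```python
def windows_by_count(n_markers, n_windows):
    """Split marker indices 0..n_markers-1 into n_windows near-equal windows.

    Returns (start, end) index pairs with end exclusive.
    """
    if n_windows <= 0:
        raise ValueError("n_windows must be positive")
    base, extra = divmod(n_markers, n_windows)
    windows, start = [], 0
    for i in range(n_windows):
        # Assumption: leftover markers are spread over the first windows.
        size = base + (1 if i < extra else 0)
        windows.append((start, start + size))
        start += size
    return windows


def windows_by_size(n_markers, window_size):
    """Split marker indices into windows of window_size markers each.

    Assumption: the remainder forms a shorter final window.
    """
    if window_size <= 0:
        raise ValueError("window_size must be positive")
    return [
        (start, min(start + window_size, n_markers))
        for start in range(0, n_markers, window_size)
    ]


if __name__ == "__main__":
    print(windows_by_count(10, 3))  # like -nWindows 3 on 10 markers
    print(windows_by_size(10, 4))   # like -windowSize 4 on 10 markers
```

With 10 markers, the first call yields windows of 4, 3 and 3 markers, and the second yields windows of 4, 4 and 2 markers.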
flanking region
The size of the flanking region on each side can be specified by one of the following two options: (1) -overlapSize and (2) -overlapLength.
-overlapSize specifies the number of markers in each flanking region (so the total number of flanking markers for each window is twice the number specified, except for the first and last windows).
-overlapLength specifies the length (in base pairs) of each flanking region (so the total length of the flanking regions is twice the number specified, except for the first and last windows).
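As a rough illustration of how -overlapSize extends each window, the sketch below pads core windows with flanking markers on both sides and clips at the ends of the marker list, which is why the first and last windows receive only one flank. The function name and the half-open (start, end) index convention are assumptions made for this example, not part of splitPed.

```python
def add_flanks(core_windows, n_markers, overlap_size):
    """Extend each (start, end) core window by overlap_size markers per side.

    Indices are 0-based with end exclusive. Flanks are clipped to the
    range of available markers, so edge windows get a single flank.
    """
    if overlap_size < 0:
        raise ValueError("overlap_size must be non-negative")
    return [
        (max(0, start - overlap_size), min(n_markers, end + overlap_size))
        for start, end in core_windows
    ]


if __name__ == "__main__":
    core = [(0, 4), (4, 8), (8, 10)]  # three core windows over 10 markers
    print(add_flanks(core, 10, 1))    # like -overlapSize 1
```

Here the middle window grows from 4 to 6 markers (one flanking marker on each side), while the first and last windows each gain only one.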
Output prefix
Specified by the -o option.
Additional options
Split original pedigree file?
This is controlled by the -splitPed option. By default, all output marker information (.dat) files share the same input pedigree (.ped) file and no output pedigree (.ped) file is generated.
To generate separate .ped and .dat files for each output window, use "-splitPed 1".
Estimate window size only
This is controlled by the -extimateWindowOnly option. By default, splitting is performed.
To only preview how the markers would be allocated to output windows, without performing the split, use "-extimateWindowOnly 1".
Example Commands
splitPed.pl -ped example.ped -dat example.dat -map example.map -windowLength 10000000 -overlapLength 1000000 -o split
splitPed.pl -ped example.ped -dat example.dat -map example.map -windowLength 10000000 -overlapLength 1000000 -splitPed 1 -o split.with_ped
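To give a feel for what the -windowLength and -overlapLength values in the commands above imply, here is a minimal Python sketch of length-based windowing for a single chromosome. It is not the splitPed implementation. The function name is hypothetical, and the sketch assumes that windows start at the first marker's coordinate, that window intervals are half-open, and that windows containing no markers are skipped; the marker positions are made up for the example.

```python
from bisect import bisect_left


def length_windows(positions, window_length, overlap_length):
    """Allocate sorted base-pair positions to fixed-length windows.

    Returns tuples (core_start, core_end, flank_start, flank_end) of
    0-based marker indices, with each end exclusive. The flank range
    covers the core window plus overlap_length base pairs on each side.
    """
    result = []
    if not positions:
        return result
    lo, last = positions[0], positions[-1]  # assumption: start at first marker
    while lo <= last:
        hi = lo + window_length
        core_start = bisect_left(positions, lo)
        core_end = bisect_left(positions, hi)
        if core_end > core_start:  # assumption: skip empty windows
            flank_start = bisect_left(positions, lo - overlap_length)
            flank_end = bisect_left(positions, hi + overlap_length)
            result.append((core_start, core_end, flank_start, flank_end))
        lo = hi
    return result


if __name__ == "__main__":
    # Hypothetical marker coordinates (base pairs) on one chromosome.
    positions = [0, 4_000_000, 9_500_000, 10_500_000, 19_000_000, 25_000_000]
    for window in length_windows(positions, 10_000_000, 1_000_000):
        print(window)
```

With these made-up positions, the second window has core markers 3 and 4 and additionally picks up marker 2 (at 9,500,000 bp) as a left flanking marker, because it lies within 1,000,000 bp of the window start.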
Download
You can download splitPed from the splitPed Download Page.
Questions and Comments?
Email Yun Li.