Changes

From Genome Analysis Wiki
no edit summary
Line 1: Line 1:  
Here I show an example using Goncalo's library (I assume he has agreed to let me do so).<br>
    
The purpose of this program is (1) to extract a range of text from each line of an input file and count the frequency of each extracted pattern, and (2) to report which lines have the same text in that range as the first line.<br>
Line 66: Line 68:  
   }</source>
An example input file, say INPUT.txt, looks like this:<br>
    
WTCCC66061-&gt;WTCCC66061 HAPLO1 AGACTCTGATAGCGATAACC<br>
WTCCC66061-&gt;WTCCC66061 HAPLO2 GGGTTCCGATGGCGATAACC<br>
WTCCC66062-&gt;WTCCC66062 HAPLO1 AGACTCTGATGGCGCTAACC<br>
WTCCC66062-&gt;WTCCC66062 HAPLO2 AGACTCTGATAGCGATGATC<br>
WTCCC66063-&gt;WTCCC66063 HAPLO1 AGACTCTTATGGCGCTAGCC<br>
WTCCC66063-&gt;WTCCC66063 HAPLO2 AGACTCTTATAGCGATAACC<br>
WTCCC66064-&gt;WTCCC66064 HAPLO1 AGACTCTGATGGCGATAGCC<br>
WTCCC66064-&gt;WTCCC66064 HAPLO2 AGACTCTGATGACGCTAGCC<br>
WTCCC66065-&gt;WTCCC66065 HAPLO1 AGACTCTGATGGCGATAACC<br>
WTCCC66065-&gt;WTCCC66065 HAPLO2 AGACTCTGATGGCGATAGCC<br>
My way to compile this source code into an executable is:

g++ -g -o Main Main.cpp libcsg/libcsg.a -I libcsg -lz<br>

where "libcsg" is the directory holding all the source code checked out from the repository, including files such as "StringArray.h".

If we run "./Main -h INPUT.txt -f 1 -t 3", it means we want to read INPUT.txt, extract positions 1 through 3 of each haplotype sequence (positions are 0-indexed, so this is actually the second through fourth characters), count the frequency of each pattern, and find which lines have the same text in that range as the first line. The output looks like:<br>
    
The following parameters are in effect:<br>
 Haplotype File&nbsp;: INPUT.txt (-hname)<br>
 From Position&nbsp;: 1 (-f9999)<br>
 To Position&nbsp;: 3 (-t9999)
    
Haplotype Counts<br>
GGT 1<br>
GAC 9<br>
Haplotypes that match the first one<br>
WTCCC66062-&gt;WTCCC66062 (3)<br>
WTCCC66062-&gt;WTCCC66062 (4)<br>
WTCCC66063-&gt;WTCCC66063 (5)<br>
WTCCC66063-&gt;WTCCC66063 (6)<br>
WTCCC66064-&gt;WTCCC66064 (7)<br>
WTCCC66064-&gt;WTCCC66064 (8)<br>
WTCCC66065-&gt;WTCCC66065 (9)<br>
WTCCC66065-&gt;WTCCC66065 (10)<br><br>
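To make the 0-indexed range and the counting step concrete, here is a minimal sketch of the same logic written with the C++ standard library only. It is not the program above and does not use libcsg: the names extractRange, HaploSummary and summarizeHaplotypes are hypothetical, and I assume each record is three whitespace-separated fields (ID, label, haplotype).

```cpp
#include <cstddef>
#include <istream>
#include <map>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper (not part of libcsg): returns the characters of `text`
// at 0-based positions from..to, both ends inclusive.
// Returns an empty string when the range is invalid or starts past the end.
std::string extractRange(const std::string& text, std::size_t from, std::size_t to) {
    if (from > to || from >= text.size()) {
        return "";
    }
    // substr takes a start index and a length, and clamps the length
    // to the end of the string.
    return text.substr(from, to - from + 1);
}

// Hypothetical result type for this sketch.
struct HaploSummary {
    std::map<std::string, int> counts;      // pattern -> number of records carrying it
    std::vector<std::size_t> matchesFirst;  // 1-based record numbers matching record 1
};

// Reads records of the assumed form "ID LABEL HAPLOTYPE" from `in`,
// extracts positions from..to of each haplotype, counts every pattern,
// and notes which later records share the first record's pattern.
HaploSummary summarizeHaplotypes(std::istream& in, std::size_t from, std::size_t to) {
    HaploSummary result;
    std::string line, id, label, haplo, firstPattern;
    std::size_t recordNo = 0;
    while (std::getline(in, line)) {
        std::istringstream fields(line);
        if (!(fields >> id >> label >> haplo)) {
            continue;  // skip blank or malformed lines
        }
        ++recordNo;
        const std::string pattern = extractRange(haplo, from, to);
        if (pattern.empty()) {
            continue;  // the requested range does not exist in this haplotype
        }
        ++result.counts[pattern];
        if (recordNo == 1) {
            firstPattern = pattern;
        } else if (pattern == firstPattern) {
            result.matchesFirst.push_back(recordNo);
        }
    }
    return result;
}
```

With the INPUT.txt above, from = 1 and to = 3, this sketch counts GAC 9 times and GGT once, and records 3 through 10 match the first one, which agrees with the output shown.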
Some Notes
    
*To read a file, use the IFILE class, which is a wrapper for reading and writing files. A particularly useful feature is that it handles gzipped files transparently. The important functions are ifopen() and ifclose().<br>