Changes

From Genome Analysis Wiki
This example uses Goncalo's library (I assume he has agreed to this).<br>
The purpose of this program is (1) to extract a range of text from an input file and count how often each extracted pattern occurs, and (2) to report which lines have the same text in that range as the first line.<br>
The code opens a file (specified by the -h parameter), takes the third field of each line, and extracts a range of text (the range is specified by a starting position given with -f and an ending position given with -t).<br>
    
<source lang="c">#include "StringArray.h"
Line 63:
   ifclose(input);
   }</source>
An example input file, say INPUT.txt, looks like this:<br>
WTCCC66061-&gt;WTCCC66061 HAPLO1 AGACTCTGATAGCGATAACC<br>
WTCCC66061-&gt;WTCCC66061 HAPLO2 GGGTTCCGATGGCGATAACC<br>
WTCCC66062-&gt;WTCCC66062 HAPLO1 AGACTCTGATGGCGCTAACC<br>
WTCCC66062-&gt;WTCCC66062 HAPLO2 AGACTCTGATAGCGATGATC<br>
WTCCC66063-&gt;WTCCC66063 HAPLO1 AGACTCTTATGGCGCTAGCC<br>
WTCCC66063-&gt;WTCCC66063 HAPLO2 AGACTCTTATAGCGATAACC<br>
WTCCC66064-&gt;WTCCC66064 HAPLO1 AGACTCTGATGGCGATAGCC<br>
WTCCC66064-&gt;WTCCC66064 HAPLO2 AGACTCTGATGACGCTAGCC<br>
WTCCC66065-&gt;WTCCC66065 HAPLO1 AGACTCTGATGGCGATAACC<br>
WTCCC66065-&gt;WTCCC66065 HAPLO2 AGACTCTGATGGCGATAGCC<br>
Suppose we run "extractHaplo -h INPUT.txt -f 1 -t 3". This reads INPUT.txt, takes the text range 1-3 (positions are 0-indexed, so this is the second through the fourth character), counts how often each pattern occurs, and reports which lines have the same text in that range as the first line. The output looks like this:<br>
The following parameters are in effect:<br>
Haplotype File&nbsp;: INPUT.txt (-hname)<br>
From Position&nbsp;: 1 (-f9999)<br>
To Position&nbsp;: 3 (-t9999)

Haplotype Counts<br>
GGT 1<br>
GAC 9<br>
60 1<br>
Haplotypes that match the first one<br>
WTCCC66062-&gt;WTCCC66062 (3)<br>
WTCCC66062-&gt;WTCCC66062 (4)<br>
WTCCC66063-&gt;WTCCC66063 (5)<br>
WTCCC66063-&gt;WTCCC66063 (6)<br>
WTCCC66064-&gt;WTCCC66064 (7)<br>
WTCCC66064-&gt;WTCCC66064 (8)<br>
WTCCC66065-&gt;WTCCC66065 (9)<br>
WTCCC66065-&gt;WTCCC66065 (10)<br><br>
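The run above can be approximated with a short, self-contained sketch. It is written against the C++ standard library rather than Goncalo's library, so std::string and std::map stand in for String and StringIntHash, and the function names extractRange, countHaplotypes and matchFirst are hypothetical, not part of extractHaplo. It assumes the positions are 0-indexed and inclusive, as described above.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Return the characters of `haplo` from index `from` to index `to`
// (0-indexed, both ends inclusive), mirroring the -f/-t parameters.
std::string extractRange(const std::string& haplo, std::size_t from, std::size_t to) {
    assert(from <= to);
    if (from >= haplo.size()) return "";
    return haplo.substr(from, to - from + 1);  // substr stops at the end of the string
}

// Count how often each extracted pattern occurs (assumed stand-in for StringIntHash).
std::map<std::string, int> countHaplotypes(const std::vector<std::string>& haplos,
                                           std::size_t from, std::size_t to) {
    std::map<std::string, int> counts;
    for (const std::string& h : haplos) ++counts[extractRange(h, from, to)];
    return counts;
}

// Return the 1-based line numbers, other than line 1 itself,
// whose extracted pattern equals that of the first line.
std::vector<std::size_t> matchFirst(const std::vector<std::string>& haplos,
                                    std::size_t from, std::size_t to) {
    std::vector<std::size_t> lines;
    if (haplos.empty()) return lines;
    const std::string first = extractRange(haplos[0], from, to);
    for (std::size_t i = 1; i < haplos.size(); ++i)
        if (extractRange(haplos[i], from, to) == first) lines.push_back(i + 1);
    return lines;
}
```

For the ten haplotypes in INPUT.txt with from = 1 and to = 3, countHaplotypes gives GAC 9 and GGT 1, and matchFirst gives lines 3 through 10, which agrees with the GAC and GGT counts and the line numbers in the output above.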
*To read a file, use the IFILE class, which is a wrapper for reading and writing files. A particularly useful feature is that it handles gzipped files transparently. Important functions are ifopen() and ifclose().<br>
*To handle strings, we prefer the String class. String handling covers a wide variety of tasks in this area, and the String class encapsulates basic operations such as indexing ([]), appending (+), equality comparison (==), and extraction (Left, Right, Mid). The String class works seamlessly with the IFILE class to access files: see the while loop in the example code and note the ReadLine() function.<br>
*To tokenize a String, we can use the StringArray class. Its ReplaceToken() function stores each token (field) as an element of an array.<br>
*To associate a String with an integer, there is a class named StringIntHash. Its important functions are IncrementCount(), Capacity() and SlotInUse().
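
The division of labour among these classes can be sketched with standard-library stand-ins: std::istream for IFILE, std::getline for ReadLine(), std::istringstream for the StringArray tokenizer, and std::map for StringIntHash. This is only an analogy under those assumptions, not the library's API. The function names tokenize and countThirdField are hypothetical, and unlike IFILE a plain std::istream does not decompress gzipped input.

```cpp
#include <istream>
#include <map>
#include <sstream>
#include <string>
#include <vector>

// Split a line into whitespace-separated tokens (the role StringArray plays).
std::vector<std::string> tokenize(const std::string& line) {
    std::istringstream fields(line);
    std::vector<std::string> tokens;
    std::string token;
    while (fields >> token) tokens.push_back(token);
    return tokens;
}

// Read `input` line by line and count occurrences of each third field
// (the role StringIntHash plays). Lines with fewer than three fields are skipped.
std::map<std::string, int> countThirdField(std::istream& input) {
    std::map<std::string, int> counts;
    std::string line;
    while (std::getline(input, line)) {
        const std::vector<std::string> tokens = tokenize(line);
        if (tokens.size() >= 3) ++counts[tokens[2]];
    }
    return counts;
}
```

Taking a std::istream& rather than a file name keeps the sketch easy to try: the same function accepts a std::ifstream opened on INPUT.txt or a std::istringstream holding a few test lines.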
    
<br>
 