From Genome Analysis Wiki
739 bytes added, 10:27, 24 July 2012
Line 101: |
Line 101: |
| | | |
| ==== Specifying Discard Rules ==== | | ==== Specifying Discard Rules ==== |
| + | ===== Basic Rules ===== |
| When specifying the discard rules, you should use the constants found at the top of VcfFileReader.h | | When specifying the discard rules, you should use the constants found at the top of VcfFileReader.h |
| | | |
Line 122: |
Line 123: |
| </source> | | </source> |
| and discards records that do not have <code>PASS</code> in the <code>FILTER</code> field, as well as records whose genotype is not phased or that have no <code>GT</code> in the <code>FORMAT</code> field. | | and discards records that do not have <code>PASS</code> in the <code>FILTER</code> field, as well as records whose genotype is not phased or that have no <code>GT</code> in the <code>FORMAT</code> field. |
| + | |
| + | ===== Additional Rules ===== |
| + | Additional discard rules can be specified by calling methods on VcfFileReader. |
| + | |
| + | To discard any record that does not have a minimum number of alternate alleles, use: |
| + | <source lang="cpp"> |
| + | VcfFileReader::addDiscardMinAltAlleleCount(int32_t minAltAlleleCount, VcfSubsetSamples* subset) |
| + | </source> |
| + | |
| + | The <code>VcfSubsetSamples* subset</code> parameter is a pointer to the subset of samples to include when counting the number of alternate alleles. To include all samples that are read/kept, pass in NULL. |
| + | |
| + | The <code>minAltAlleleCount</code> parameter is the minimum number of alternate alleles that must be found in the subset for the record to be kept. |
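The counting rule described above can be illustrated with a small standalone sketch. This is not the library's implementation: <code>countAltAlleles</code> and <code>keepRecord</code> are hypothetical helpers, the subset is modelled as a plain vector of sample indices rather than a <code>VcfSubsetSamples</code> object, and it assumes that every allele in a <code>GT</code> value other than <code>0</code> (reference) or <code>.</code> (missing) counts as one alternate allele.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical helper (not part of the library): counts the alternate
// alleles in one GT string such as "0|1", "1/1", or "./.".
int32_t countAltAlleles(const std::string& gt)
{
    int32_t count = 0;
    std::size_t pos = 0;
    while (pos < gt.size())
    {
        // Alleles are separated by '|' (phased) or '/' (unphased).
        std::size_t end = gt.find_first_of("|/", pos);
        if (end == std::string::npos)
        {
            end = gt.size();
        }
        const std::string allele = gt.substr(pos, end - pos);
        // Assumption: "0" is the reference allele and "." is missing;
        // any other allele index is counted as an alternate allele.
        if (!allele.empty() && allele != "0" && allele != ".")
        {
            ++count;
        }
        pos = end + 1;
    }
    return count;
}

// Hypothetical stand-in for the keep/discard decision.
// genotypes: one GT string per sample in the record.
// subset: indices of the samples to count; nullptr means count every sample
//         (mirroring the NULL convention described above).
bool keepRecord(const std::vector<std::string>& genotypes,
                int32_t minAltAlleleCount,
                const std::vector<std::size_t>* subset)
{
    assert(minAltAlleleCount >= 0);
    int32_t total = 0;
    if (subset == nullptr)
    {
        for (const std::string& gt : genotypes)
        {
            total += countAltAlleles(gt);
        }
    }
    else
    {
        for (std::size_t index : *subset)
        {
            // Ignore indices that do not refer to a sample in this record.
            if (index < genotypes.size())
            {
                total += countAltAlleles(genotypes[index]);
            }
        }
    }
    // The record is kept only if the minimum count is reached.
    return total >= minAltAlleleCount;
}
```

Under these assumptions, a record with genotypes <code>0|1</code>, <code>1/1</code>, <code>./.</code> and <code>0|0</code> has three alternate alleles across all samples, so it would be kept with a minimum of 3 and discarded with a minimum of 4. Restricting the count to the first and last samples leaves only one alternate allele.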
| | | |
| ==== Read only Certain Sections of the File / Using a VCF Index (TABIX) File ==== | | ==== Read only Certain Sections of the File / Using a VCF Index (TABIX) File ==== |