From Genome Analysis Wiki
Revision as of 21:33, 27 August 2013
GotCloud Filters
GotCloud uses multiple filters throughout the GotCloud Pipeline.
The primary filters are applied during the Variant Calling snpcall step.
There are two phases of filters:
- Hard Filters
- SVM Filters
Hard Filters
GotCloud applies the following hard filters:
Filter Prefix | Filter Description | Filter if | Default Value |
---|---|---|---|
q | Phred-scaled quality score | QUAL < FILTER_MIN_QUAL > 0 | 5 |
INDEL | | FILTER_WIN_INDEL > 0 | 5 |
m | Root Mean Squared Mapping Quality | INFO:MQ < FILTER_MIN_MQ > 0 | 20 |
dp | Total Depth at Site | INFO:DP < #samples * FILTER_MIN_SAMPLE_DP > 0 | #samples*1 |
DP | Total Depth at Site | INFO:DP > #samples * FILTER_MAX_SAMPLE_DP < INT_MAX | #samples*1000 |
ns | Number of Samples With Coverage | INFO:NS < (FILTER_MIN_NS or FILTER_MIN_NS_FRAC * #samples) > 0 | .5*#samples |
AB | Allele Balance in Heterozygotes | INFO:AB > FILTER_MAX_ABL/100. < 1 | 70,65 |
STR | Strand Bias Pearson's Correlation | INFO:STR > FILTER_MAX_STR/100. < 1 | 20,10 |
str | Strand Bias Pearson's Correlation | INFO:STR < FILTER_MIN_STR/100. > -1 | -20,-10 |
stz | Strand Bias z-score | INFO:STZ < FILTER_MIN_STZ > INT_MIN | -5,-10 |
STZ | Strand Bias z-score | INFO:STZ > FILTER_MAX_STZ < INT_MAX | 5,10 |
fic | | INFO:FIC < FILTER_MIN_FIC/100. > INT_MIN | -20,-10 |
CBR | Cycle Bias Pearson's correlation | INFO:CBR > FILTER_MAX_CBR/100. < 1 | 20,10 |
LQR | | INFO:LQR > FILTER_MAX_LQR/100. < 1 | 30,20 |
AOI | Alternate allele inflation score | INFO:AOI > FILTER_MAX_AOI < INT_MAX | 5 |
MQ0_ | Fraction of bases with mapQ=0 | INFO:MQ0 > FILTER_MAX_MQ0/100. < 1 | 10 |
IOR | Ratio of base-quality inflation | INFO:IOR > FILTER_MAX_IOR/100. < INT_MAX/100. | off |
AOZ | Alternate allele quality z-score | INFO:AOZ > FILTER_MAX_AOZ < INT_MAX | off |
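As a reading aid for the "Filter if" column, the following is a minimal sketch of how a row such as `INFO:STR > FILTER_MAX_STR/100. < 1` might be evaluated. It assumes that the trailing `< 1` means the filter is only active while the scaled threshold stays below 1; that interpretation, the function name `fails_max_filter`, and its parameters are illustrative assumptions, not GotCloud code.

```python
def fails_max_filter(info_value, config_value, scale=100.0, inactive_at=1.0):
    """Return True if a site would be flagged by a max-style filter.

    info_value   -- the INFO field value at the site (e.g. INFO:STR)
    config_value -- the configured threshold (e.g. FILTER_MAX_STR = 20)
    scale        -- divisor applied to the configured value (the "/100." in the table)
    inactive_at  -- ASSUMPTION: the filter is skipped once the scaled
                    threshold reaches this bound (the "< 1" in the table)
    """
    threshold = config_value / scale
    if threshold >= inactive_at:
        return False  # assumed: threshold out of range, filter not applied
    return info_value > threshold


# With FILTER_MAX_STR = 20 the effective cutoff is 0.20.
print(fails_max_filter(0.25, 20))  # site exceeds the cutoff
print(fails_max_filter(0.15, 20))  # site is within the cutoff
```

Under this reading, percent-style settings (those divided by 100 in the table) are written as whole numbers in the configuration and compared against INFO values on a 0 to 1 scale.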
To remove a filter, set it to blank or "off" in your user configuration file.
The values of these filters must be numbers (or a comma/space separated list of numbers).
The following rules apply to these filter values:
- Specifying 1 value in the filter turns that filter on and uses that value.
- Specifying 2 values in the filter (separated by ',' and/or ' ') turns on the filter.
  - Use the 1st value if the number of samples is below FILTER_FORMULA_MIN_SAMPLES
  - Use the 2nd value if the number of samples is above FILTER_FORMULA_MAX_SAMPLES
  - If the number of samples is between the MIN and MAX, a log scale is used:
(minVal - maxVal) * (log(maxSamples) - log(numSamples)) / (log(maxSamples) - log(minSamples)) + maxVal
with:
FILTER_FORMULA_MIN_SAMPLES = 100
FILTER_FORMULA_MAX_SAMPLES = 1000
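The value rules and the log-scale formula above can be sketched in Python as follows. This is an illustration under stated assumptions rather than GotCloud's implementation: the function names `parse_filter_value` and `effective_threshold` are hypothetical, and a sample count exactly equal to the MIN or MAX is treated like the below/above case (the formula gives the same result at those points).

```python
import math

FILTER_FORMULA_MIN_SAMPLES = 100
FILTER_FORMULA_MAX_SAMPLES = 1000


def parse_filter_value(raw):
    """Return None for a disabled filter, else a list of one or two floats."""
    text = raw.strip()
    if text == "" or text == "off":
        return None
    # Values may be separated by ',' and/or ' '.
    values = [float(token) for token in text.replace(",", " ").split()]
    if len(values) not in (1, 2):
        raise ValueError("expected 1 or 2 numbers, got: %r" % raw)
    return values


def effective_threshold(raw, num_samples,
                        min_samples=FILTER_FORMULA_MIN_SAMPLES,
                        max_samples=FILTER_FORMULA_MAX_SAMPLES):
    """Resolve a filter setting to the threshold used for num_samples."""
    values = parse_filter_value(raw)
    if values is None:
        return None            # filter is off
    if len(values) == 1:
        return values[0]       # single value: used as-is
    min_val, max_val = values  # 1st value, 2nd value
    if num_samples <= min_samples:
        return min_val
    if num_samples >= max_samples:
        return max_val
    # Log-scale interpolation between the two values.
    weight = ((math.log(max_samples) - math.log(num_samples))
              / (math.log(max_samples) - math.log(min_samples)))
    return (min_val - max_val) * weight + max_val


# Example: the default "20,10" setting at several sample counts.
for n in (50, 100, 300, 1000, 5000):
    print(n, effective_threshold("20,10", n))
```

For a setting of `20,10`, the threshold stays at 20 up to 100 samples, falls along the log scale between 100 and 1000 samples, and stays at 10 beyond that.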
To add additional filters, set FILTER_ADDITIONAL with the --min/max specified below and the appropriate value, like:
FILTER_ADDITIONAL = --maxSTP 5 --minLQZ 5
Filter Prefix | Filter Description | Filter if | Default Value |
---|---|---|---|
FFRQ | | TBD winFFRQ, maxFFRQ both > 0 | |
STP | | INFO:STP > maxSTP < INT_MAX | |
TTT | | INFO:TTT > maxTTT < INT_MAX | |
ttt | | INFO:TTT < minTTT > INT_MIN | |
LQZ | | INFO:LQZ > maxLQZ < INT_MAX | |
lqz | | INFO:LQZ < minLQZ > INT_MIN | |
RBZ | | INFO:RBZ > maxRBZ < INT_MAX | |
rbz | | INFO:RBZ < minRBZ > INT_MIN | |
CBZ | Cycle Bias z-score | INFO:CBZ > maxCBZ < INT_MAX | |
cbr | | INFO:CBR < minCBR/100. > -1 | |
QBR | | INFO:QBR > maxQBR/100. < 1 | |
qbr | | INFO:QBR < minQBR/100. > -1 | |
CSR | Cycle-Strand Pearson's Correlation | INFO:CSR > maxCSR/100. < 1 | |
csr | Cycle-Strand Pearson's Correlation | INFO:CSR < minCSR/100. > -1 | |
IOZ | Base quality inflation z-score | INFO:IOZ > maxIOZ < INT_MAX | |
ior | Ratio of base-quality inflation | INFO:IOR < minIOR/100. > INT_MIN/100. | |
MQ10_ | Fraction of bases with mapQ<=10 | INFO:MQ10 > maxMQ10/100. < 1 | |
MQ20_ | Fraction of bases with mapQ<=20 | INFO:MQ20 > maxMQ20/100. < 1 | |
ABE | | INFO:ABE > maxABE/100. < 1 | |
abe | | INFO:ABE < minABE/100. > -1 | |
MBR | | INFO:MBR > maxMBR/100. < 1 | |
mbr | | INFO:MBR < minMBR/100. > -1 | |
ABZ | | INFO:ABZ > maxABZ < INT_MAX | |
abz | | INFO:ABZ < minABZ > INT_MIN | |
BCS | | INFO:BCS > maxBCS < INT_MAX | |
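A `FILTER_ADDITIONAL` line like the example shown before this table could be assembled from a mapping of option names to values, as in the sketch below. The helper `build_filter_additional` is hypothetical and performs no validation; the option names are expected to be the min/max names from the "Filter if" column of the table.

```python
def build_filter_additional(options):
    """Format a FILTER_ADDITIONAL configuration line.

    options -- mapping of option name (e.g. "maxSTP") to its threshold.
    Names are not checked against the table; that is left to the caller.
    """
    parts = ["--%s %s" % (name, value) for name, value in options.items()]
    return "FILTER_ADDITIONAL = " + " ".join(parts)


print(build_filter_additional({"maxSTP": 5, "minLQZ": 5}))
```

The resulting line would go in the user configuration file alongside the other filter settings.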