Difference between revisions of "GotCloud: Filters"
From Genome Analysis Wiki
Revision as of 10:13, 29 October 2014
GotCloud Filters
GotCloud uses multiple filters throughout the GotCloud Pipeline.
The primary filters are applied during the snpcall step of Variant Calling.
There are two phases of filters:
- Hard Filters
- SVM Filters
Hard Filters
GotCloud applies the following hard filters:
Filter Prefix | Filter Description | Filter if | Default Value |
---|---|---|---|
q | Phred-scaled quality score | QUAL < FILTER_MIN_QUAL > 0 | 5 |
INDEL | | FILTER_WIN_INDEL > 0 | 5 |
m | Root Mean Squared Mapping Quality | INFO:MQ < FILTER_MIN_MQ > 0 | 20 |
dp | Total Depth at Site | INFO:DP < #samples * FILTER_MIN_SAMPLE_DP > 0 | #samples*1 |
DP | Total Depth at Site | INFO:DP > #samples * FILTER_MAX_SAMPLE_DP < INT_MAX | #samples*1000 |
ns | Number of Samples With Coverage | INFO:NS < (FILTER_MIN_NS or FILTER_MIN_NS_FRAC * #samples) > 0 | .5*#samples |
AB | Allele Balance in Heterozygotes | INFO:AB > FILTER_MAX_ABL/100. < 1 | 70,65 |
STR | Strand Bias Pearson's Correlation | INFO:STR > FILTER_MAX_STR/100. < 1 | 20,10 |
str | Strand Bias Pearson's Correlation | INFO:STR < FILTER_MIN_STR/100. > -1 | -20,-10 |
stz | Strand Bias z-score | INFO:STZ < FILTER_MIN_STZ > INT_MIN | -5,-10 |
STZ | Strand Bias z-score | INFO:STZ > FILTER_MAX_STZ < INT_MAX | 5,10 |
fic | | INFO:FIC < FILTER_MIN_FIC/100. > INT_MIN | -20,-10 |
CBR | Cycle Bias Pearson's correlation | INFO:CBR > FILTER_MAX_CBR/100. < 1 | 20,10 |
LQR | | INFO:LQR > FILTER_MAX_LQR/100. < 1 | 30,20 |
AOI | Alternate allele inflation score | INFO:AOI > FILTER_MAX_AOI < INT_MAX | 5 |
MQ0_ | Fraction of bases with mapQ=0 | INFO:MQ0 > FILTER_MAX_MQ0/100. < 1 | 10 |
IOR | Ratio of base-quality inflation | INFO:IOR > FILTER_MAX_IOR < INT_MAX | off |
AOZ | Alternate allele quality z-score | INFO:AOZ > FILTER_MAX_AOZ < INT_MAX | off |
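
As a reading aid for the "Filter if" column, the sketch below shows how two kinds of thresholds in the table can be interpreted: settings that are divided by 100 before comparison (such as FILTER_MAX_MQ0) and depth settings that are multiplied by the number of samples (such as FILTER_MIN_SAMPLE_DP). The function names are hypothetical and are not part of GotCloud; this only illustrates the arithmetic in the table, not the pipeline's actual implementation.

```python
def exceeds_max(info_value, config_value, scale=1.0):
    """Return True if a site would fail a 'max' style filter.

    config_value is the number written in the configuration file.
    scale is 100.0 for rows whose condition divides the setting by 100
    (for example FILTER_MAX_MQ0/100.), and 1.0 otherwise.
    """
    return info_value > config_value / scale


def below_min_depth(total_depth, num_samples, min_sample_dp=1):
    """Return True if INFO:DP is below #samples * FILTER_MIN_SAMPLE_DP."""
    return total_depth < num_samples * min_sample_dp


if __name__ == "__main__":
    # With the default FILTER_MAX_MQ0 of 10, the cutoff on INFO:MQ0 is 0.10.
    print(exceeds_max(0.15, 10, scale=100.0))
    # With 200 samples and the default per-sample minimum depth of 1,
    # the cutoff on INFO:DP is 200.
    print(below_min_depth(150, 200))
```
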
To remove a filter, set it to blank or "off" in your user configuration file.
The values of these filters must be numbers (or a comma- and/or space-separated list of numbers).
The following rules apply to these filters:
- Specifying 1 value in the filter will turn that filter on and use that value.
- Specifying 2 values in the filter (separated by ',' and/or ' ') turns on the filter and selects the threshold based on the number of samples:
  - Use the 1st value (minVal) if the number of samples is below FILTER_FORMULA_MIN_SAMPLES (minSamples).
  - Use the 2nd value (maxVal) if the number of samples is above FILTER_FORMULA_MAX_SAMPLES (maxSamples).
  - If the number of samples (numSamples) is between the MIN and MAX, a log scale is used:
    (minVal - maxVal) * (log(maxSamples) - log(numSamples)) / (log(maxSamples) - log(minSamples)) + maxVal
with the defaults:
FILTER_FORMULA_MIN_SAMPLES = 100
FILTER_FORMULA_MAX_SAMPLES = 1000
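
The rules above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not GotCloud's own code: the function names `parse_filter_values` and `effective_threshold` are hypothetical, and treating a sample count exactly equal to the MIN or MAX like the "below" and "above" cases is an assumption (the formula gives the same result at those endpoints).

```python
import math
import re

# Defaults quoted in the text above.
FILTER_FORMULA_MIN_SAMPLES = 100
FILTER_FORMULA_MAX_SAMPLES = 1000


def parse_filter_values(setting):
    """Parse a filter setting into a list of numbers.

    Returns None when the filter is disabled (blank or "off").
    Values may be separated by commas and/or spaces.
    """
    text = setting.strip()
    if text == "" or text == "off":
        return None
    return [float(tok) for tok in re.split(r"[,\s]+", text) if tok]


def effective_threshold(setting, num_samples,
                        min_samples=FILTER_FORMULA_MIN_SAMPLES,
                        max_samples=FILTER_FORMULA_MAX_SAMPLES):
    """Return the threshold for a given sample count, or None if disabled."""
    values = parse_filter_values(setting)
    if values is None:
        return None
    if len(values) == 1:
        return values[0]
    if len(values) != 2:
        raise ValueError("expected 1 or 2 values, got %d" % len(values))
    min_val, max_val = values
    if num_samples <= min_samples:
        return min_val
    if num_samples >= max_samples:
        return max_val
    # Log-scale interpolation between the two values.
    frac = ((math.log(max_samples) - math.log(num_samples))
            / (math.log(max_samples) - math.log(min_samples)))
    return (min_val - max_val) * frac + max_val


if __name__ == "__main__":
    # Example with the default STR setting "20,10" at several sample counts.
    for n in (50, 300, 5000):
        print(n, effective_threshold("20,10", n))
```

With the two-value form, small studies get the first threshold, large studies get the second, and the threshold moves smoothly between them on a logarithmic scale of sample count.
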
To add filters, set FILTER_ADDITIONAL using the --min/--max options listed below together with the appropriate values, for example:
FILTER_ADDITIONAL = --maxSTP 5 --minLQZ 5
Filter Prefix | Filter Description | Filter if | Default Value |
---|---|---|---|
FFRQ | | TBD winFFRQ, maxFFRQ both > 0 | |
STP | | INFO:STP > maxSTP < INT_MAX | |
TTT | | INFO:TTT > maxTTT < INT_MAX | |
ttt | | INFO:TTT < minTTT > INT_MIN | |
LQZ | | INFO:LQZ > maxLQZ < INT_MAX | |
lqz | | INFO:LQZ < minLQZ > INT_MIN | |
RBZ | | INFO:RBZ > maxRBZ < INT_MAX | |
rbz | | INFO:RBZ < minRBZ > INT_MIN | |
CBZ | Cycle Bias z-score | INFO:CBZ > maxCBZ < INT_MAX | |
cbr | | INFO:CBR < minCBR/100. > -1 | |
QBR | | INFO:QBR > maxQBR/100. < 1 | |
qbr | | INFO:QBR < minQBR/100. > -1 | |
CSR | Cycle-Strand Pearson's Correlation | INFO:CSR > maxCSR/100. < 1 | |
csr | Cycle-Strand Pearson's Correlation | INFO:CSR < minCSR/100. > -1 | |
IOZ | Base quality inflation z-score | INFO:IOZ > maxIOZ < INT_MAX | |
ior | Ratio of base-quality inflation | INFO:IOR < minIOR/100. > INT_MIN/100. | |
MQ10_ | Fraction of bases with mapQ<=10 | INFO:MQ10 > maxMQ10/100. < 1 | |
MQ20_ | Fraction of bases with mapQ<=20 | INFO:MQ20 > maxMQ20/100. < 1 | |
ABE | | INFO:ABE > maxABE/100. < 1 | |
abe | | INFO:ABE < minABE/100. > -1 | |
MBR | | INFO:MBR > maxMBR/100. < 1 | |
mbr | | INFO:MBR < minMBR/100. > -1 | |
ABZ | | INFO:ABZ > maxABZ < INT_MAX | |
abz | | INFO:ABZ < minABZ > INT_MIN | |
BCS | | INFO:BCS > maxBCS < INT_MAX | |