From Genome Analysis Wiki
817 bytes added
, 12:02, 20 January 2012
Line 72: |
Line 72: |
| --qGeno : Assigns genotype likelihood on the VCF file with fixed quality values (useful for data integration) | | --qGeno : Assigns genotype likelihood on the VCF file with fixed quality values (useful for data integration) |
| | | |
− | == Subsetting | + | == Subsetting the VCF file == |
| + | |
| + | Suppose that you have the following index file, [subset-index], which lists a subset of the individuals in the VCF file along with the groups assigned to each:
| + | |
| + | IND_ID_1 GROUP1,GROUP2,GROUP3 |
| + | IND_ID_2 GROUP2 |
| + | IND_ID_3 GROUP1,GROUP3 |
| + | IND_ID_4 GROUP2,GROUP3 |
| + | |
| + | If you run the following command: |
| + | vcfCooker --in-vcf [input-vcf-file] --out [output-prefix] --verbose --subset --in-subset [subset-index] --bgzf |
| + | |
| + | The command will create the following set of files:
| + | [output-prefix].GROUP1.vcf.gz |
| + | [output-prefix].GROUP2.vcf.gz |
| + | [output-prefix].GROUP3.vcf.gz |
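The mapping from the index file to the output files can be sketched in Python. This is only an illustration of the file layout described above, not vcfCooker's own code: the function names `read_subset_index` and `expected_outputs` are hypothetical, and the sketch assumes the individual ID and the comma-separated group list are separated by whitespace.

```python
from collections import defaultdict

def read_subset_index(lines):
    """Map each group name to the individual IDs listed for it.

    Assumes each line is '<individual-id> <group>[,<group>...]',
    separated by whitespace (an assumption about the index format).
    """
    groups = defaultdict(list)
    for line in lines:
        fields = line.split()
        if len(fields) < 2:
            continue  # skip blank or incomplete lines
        ind_id, group_list = fields[0], fields[1]
        for group in group_list.split(","):
            groups[group].append(ind_id)
    return dict(groups)

def expected_outputs(groups, prefix):
    """List one '<prefix>.<group>.vcf.gz' file name per group."""
    return [f"{prefix}.{group}.vcf.gz" for group in sorted(groups)]

index_lines = [
    "IND_ID_1 GROUP1,GROUP2,GROUP3",
    "IND_ID_2 GROUP2",
    "IND_ID_3 GROUP1,GROUP3",
    "IND_ID_4 GROUP2,GROUP3",
]
groups = read_subset_index(index_lines)
outputs = expected_outputs(groups, "out")
```

Here `out` stands in for [output-prefix]. An individual listed under several groups, such as IND_ID_1, appears in each of the corresponding group files.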
| + | |
| + | Each VCF contains only the markers that are polymorphic within its group (AC>0). The AC and AN fields are updated to reflect the individuals in the subset.
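To make the AC/AN update concrete, the following Python sketch recounts both values for one site from the genotypes of a group's members. It is a simplified illustration under stated assumptions, not vcfCooker's implementation: `subset_ac_an` and `keep_site` are hypothetical names, AC counts every non-reference allele together, and missing alleles (`.`) are excluded from AN.

```python
import re

def subset_ac_an(genotypes, members):
    """Recount AC (non-reference alleles) and AN (called alleles)
    over the given members for a single site.

    genotypes maps sample ID to a GT string such as '0/1' or '1|1'.
    """
    ac = an = 0
    for sample in members:
        gt = genotypes.get(sample, "./.")
        for allele in re.split(r"[/|]", gt):
            if allele == ".":
                continue  # missing allele: counted in neither AC nor AN
            an += 1
            if allele != "0":
                ac += 1
    return ac, an

def keep_site(ac, include_monomorphic=False):
    """Keep a site when AC>0, or always when monomorphic sites are requested."""
    return ac > 0 or include_monomorphic

site = {"IND_ID_1": "0/1", "IND_ID_2": "0/0", "IND_ID_3": "1/1", "IND_ID_4": "./."}
group1 = subset_ac_an(site, ["IND_ID_1", "IND_ID_3"])
group2 = subset_ac_an(site, ["IND_ID_1", "IND_ID_2", "IND_ID_4"])
only_ref = subset_ac_an(site, ["IND_ID_2"])
```

In this example a group containing only IND_ID_2 has AC equal to 0, so the site would be dropped from that group's file unless monomorphic sites are explicitly included.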
| + | |
| + | Additional options include:
| + | --mono-subset : Include monomorphic SNPs when subsetting
| + | --filt-only-subset : Use only PASS-filtered SNPs for subsetting
| | | |
| == Upgrading glfMultiples outputs (v 3.3 to v 4.0) == | | == Upgrading glfMultiples outputs (v 3.3 to v 4.0) == |