From Genome Analysis Wiki
360 bytes removed, 14:04, 23 October 2014
Line 89: |
| | | |
| === Reference Files === | | === Reference Files === |
| + | See [[GotCloud: Genetic Reference and Resource Files]] for detailed information about the multiple required reference files for the alignment pipeline, including: |
| + | * How to obtain default references |
| + | * Configuration keys & default values |
| + | * How to generate your own references |
| + | * How to point GotCloud to your reference files |
| | | |
− | The following Reference Files are required:
| + | Required Reference File Types: |
− | * Reference File fasta files
| + | * [[GotCloud: Genetic Reference and Resource Files#Reference fasta Files|Reference fasta Files]] |
− | ** Files required: .fa, -bs.umfa, .GCContent, .amb, .ann, .bwt, .pac, .sa
| + | * [[GotCloud: Genetic Reference and Resource Files#DBSNP VCF Files|DBSNP VCF Files]] |
− | *** If you don't have the -bs.umfa file, the software will try to create it in the same directory as the reference fasta.
| + | * [[GotCloud: Genetic Reference and Resource Files#HapMap3 VCF Files|HapMap3 VCF Files]] |
− | *** .GCContent can be generated using qplot, see: [[QPLOT#Input_files| QPLOT: Input Files: --gccontent]] and name the resulting file as <code>.fa.GCcontent</code>
| |
− | *** Use <code>bin/bwa index ref.fa</code> if you need to generate the bwa reference files (.amb, .ann, .bwt, .pac, .sa)
| |
− | ** Configuration Name: REF - specify the ref.fa/ref.fa.gz name
| |
− | * DBSNP File - used for recalibration & qplot
| |
− | ** VCF file containing dbsnp variants
| |
− | ** Configuration Name: DBSNP_VCF
| |
− | * HapMap3 VCF - used for VerifyBamID
| |
− | ** VCF file containing HM3 variants
| |
− | ** Configuration Name: HM3_VCF
| |
− | | |
− | For more information on obtaining/setting/generating the GotCloud reference files, see: [[GotCloud: Genetic Reference and Resource Files]]
| |
| | | |
| === Configuration File === | | === Configuration File === |