From Genome Analysis Wiki
1,928 bytes added, 16:42, 10 February 2015
Line 40: |
Line 40: |
| | | |
| == Input Data == | | == Input Data == |
| + | == Input Data== |
| + | * [[#BAM Files|Aligned/Processed/Recalibrated BAM files]] |
| + | * [[#BAM List File|BAM list file containing Sample IDs & BAM file names]] |
| + | * [[#Reference Files|Reference files]] |
| + | * [[#Configuration File|Configuration file to override default options]] |
| + | |
| + | === BAM Files === |
| + | The BAM files must be duplicate-marked and base-quality recalibrated to obtain high-quality SNP calls. The GotCloud [[Alignment Pipeline]] generates these BAM files from the original FASTQs automatically. |
| + | |
| + | === BAM List File === |
| + | * Automatically created when running the GotCloud [[Alignment Pipeline]] |
| + | * Each line of the BAM list file represents a single individual |
| + | |
| + | Columns: |
| + | # sample id |
| + | # comma-separated population labels (optional column) |
| + | # BAM File 1 (preferable to have full paths to BAM files) |
| + | # BAM File 2 (if more than 1 BAM per sample) |
| + | :... |
| + | |
| + | : # BAM File N (if more than 1 BAM per sample) |
| + | [SAMPLE_ID] [COMMA SEPARATED POPULATION LABELS] [BAM_FILE1] [BAM_FILE2] ... |
| + | or |
| + | [SAMPLE_ID] [BAM_FILE1] [BAM_FILE2] ... |
| + | |
| + | * Notes: |
| + | ** tab-delimited |
| + | ** multiple BAMs per individual may be provided, but they must all be on the same line of the list file |
| + | ** population label is optional - it will default to <code>ALL</code> |
| + | *** only used by Thunder (part of ldrefine pipeline) |
| + | *** if all samples are from the same population, the population label can be omitted, or you can specify <code>ALL</code> as the population label for each sample. |
| + | |
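The BAM list layout described above can be illustrated with a small parser. This is a minimal sketch, not GotCloud's own code: the function name <code>parse_bam_list_line</code> is hypothetical, and the rule used to tell the two line formats apart (a second column that does not end in <code>.bam</code> is treated as the population column) is an assumption made for illustration only.

```python
def parse_bam_list_line(line):
    """Split one tab-delimited BAM list line into
    (sample_id, population_labels, bam_files)."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) < 2:
        raise ValueError("expected a sample ID and at least one BAM file")

    sample_id, rest = fields[0], fields[1:]

    # Assumption (not taken from GotCloud): a second column that does not
    # end in ".bam" holds the comma-separated population labels.
    if rest[0].lower().endswith(".bam"):
        populations, bam_files = ["ALL"], rest   # label omitted: default to ALL
    else:
        populations, bam_files = rest[0].split(","), rest[1:]

    if not bam_files:
        raise ValueError("no BAM files listed for sample " + sample_id)
    return sample_id, populations, bam_files


# Hypothetical sample IDs and paths, used only to exercise the parser.
with_labels = parse_bam_list_line("Sample1\tEUR,CEU\t/data/s1_a.bam\t/data/s1_b.bam")
without_labels = parse_bam_list_line("Sample2\t/data/s2.bam\n")
```

All BAM files for a sample are read from the same line, and a line without a population column falls back to <code>ALL</code>, matching the notes above.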
| + | === Reference Files === |
| + | See [[GotCloud: Genetic Reference and Resource Files]] for detailed information about the multiple required reference files for the variant calling pipeline, including: |
| + | * How to obtain default references |
| + | * Configuration keys & default values |
| + | * How to generate your own references |
| + | * How to point GotCloud to your reference files |
| + | |
| + | Required Reference File Types: |
| + | * [[GotCloud: Genetic Reference and Resource Files#Reference fasta Files|Reference fasta Files]] |
| + | |
| === Configuration File === | | === Configuration File === |
| {{:GotCloud: Configuration}} | | {{:GotCloud: Configuration}} |
Line 49: |
Line 91: |
| | | |
| '''Replace the specified paths with the paths to these files.''' | | '''Replace the specified paths with the paths to these files.''' |
− |
| |
| | | |
| == Running GotCloud/GenomeSTRiP == | | == Running GotCloud/GenomeSTRiP == |