Tutorial: GotCloud
Revision as of 00:37, 8 January 2013
Installation
First, make sure GotCloud is installed on your system. Installation instructions are in the Setup section of the GotCloud page.
Running the Automatic Test
This verifies that GotCloud was installed correctly.
To run the test case automatically, change your current directory to GotCloud's root directory, and type in the following command:
./bin/gen_biopipeline.pl --test OUTPUT_DIR
where OUTPUT_DIR is the directory where you want to store the results.
If you see "Test Passed", then you are ready to run a sample.
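In a scripted setup, the check for "Test Passed" can be automated. The sketch below is an illustration only: `check_test_output` is a hypothetical helper name, and the usage comment assumes that the script sits at bin/gen_biopipeline.pl under the root directory and that the message may appear on either standard output or standard error, which is why both are captured.

```shell
# Hypothetical helper: succeed only if the captured text contains "Test Passed".
check_test_output() {
    # $1: text captured from the automatic test run
    case "$1" in
        *"Test Passed"*) return 0 ;;
        *) return 1 ;;
    esac
}

# Example usage from GotCloud's root directory (OUTPUT_DIR is your choice):
#   output=$(bin/gen_biopipeline.pl --test OUTPUT_DIR 2>&1)
#   check_test_output "$output" && echo "ready to run a sample"
```

Matching on the message text rather than the exit status mirrors the manual check described above, since the tutorial does not state what exit status the test returns.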
Running an Example Sample
As an example, we can analyze the sample files used in the automatic test.
To make this easier, change to the test/align directory. It contains an index file and a configuration file that can be used directly.
Index file
There are four fastq files in test/align/fastq/Sample_1 and four fastq files in test/align/fastq/Sample_2, both in paired-end format. Normally, we would need to build an index file for these files. Conveniently, an index file (indexFile.txt) already exists for the automatic test samples. It contains the following information in tab-delimited format:
MERGE_NAME FASTQ1 FASTQ2 RGID SAMPLE LIBRARY CENTER PLATFORM
Sample1 fastq/Sample_1/File1_R1.fastq.gz fastq/Sample_1/File1_R2.fastq.gz RGID1 SampleID1 Lib1 UM ILLUMINA
Sample1 fastq/Sample_1/File2_R1.fastq.gz fastq/Sample_1/File2_R2.fastq.gz RGID1a SampleID1 Lib1 UM ILLUMINA
Sample2 fastq/Sample_2/File1_R1.fastq.gz fastq/Sample_2/File1_R2.fastq.gz RGID2 SampleID2 Lib2 UM ILLUMINA
Sample2 fastq/Sample_2/File2_R1.fastq.gz fastq/Sample_2/File2_R2.fastq.gz RGID2 SampleID2 Lib2 UM ILLUMINA
If you are in the test/align directory, you can use this file as-is. If you prefer, you can create a new index file and change the MERGE_NAME, RGID, SAMPLE, LIBRARY, CENTER, or PLATFORM values. It is recommended that you do not modify existing files in test/align.
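If you do create your own index file, a quick structural check can catch a missing or extra tab before the pipeline runs. The following is a minimal sketch, assuming every row (header included) should have exactly the eight tab-separated fields shown above; `check_index_columns` is a hypothetical helper name and is not part of GotCloud.

```shell
# Hypothetical helper: report every line of a tab-delimited index file
# that does not have exactly 8 fields; exit status is non-zero if any is found.
check_index_columns() {
    # $1: path to the index file
    awk 'BEGIN { FS = "\t" }
         NF != 8 { printf "line %d: expected 8 fields, found %d\n", NR, NF; bad = 1 }
         END { exit bad + 0 }' "$1"
}

# Example usage:
#   check_index_columns myIndex.txt && echo "index file looks well-formed"
```

Blank lines are reported as well, since they contain zero fields.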
If you want to run this example from a different directory, make sure the FASTQ1 and FASTQ2 paths are correct. That is, each FASTQ1 and FASTQ2 entry in the index file should be a full path like the following:
{ROOT_DIR}/test/align/fastq/Sample_1/File1_R1.fastq.gz
where {ROOT_DIR} is the root directory of your GotCloud installation.
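Rather than editing each path by hand, the prefix can be added with awk. This is a sketch under stated assumptions: FASTQ1 and FASTQ2 are the second and third tab-separated columns (as in the header shown earlier), the first line is the header, and `absolutize_index` is a hypothetical helper name.

```shell
# Hypothetical helper: print a copy of an index file in which FASTQ1 and FASTQ2
# (columns 2 and 3) are prefixed with a base directory. The header is unchanged.
absolutize_index() {
    # $1: base directory, e.g. {ROOT_DIR}/test/align
    # $2: path to the tab-delimited index file
    awk -v base="$1" 'BEGIN { FS = OFS = "\t" }
        NR == 1 { print; next }
        { $2 = base "/" $2; $3 = base "/" $3; print }' "$2"
}

# Example usage (writes a new file instead of modifying test/align):
#   absolutize_index "$ROOT_DIR/test/align" "$ROOT_DIR/test/align/indexFile.txt" > myIndex.txt
```

Writing to a new file keeps the original indexFile.txt untouched, in line with the recommendation not to modify existing files in test/align.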
Configuration file
Similar to the index file, a configuration file (test.conf) already exists for the automatic test samples. It contains the following information:
INDEX_FILE = indexFile.txt
############
# References
REF_DIR = $(PIPELINE_DIR)/test/align/chr20Ref
AS = NCBI37
FA_REF = $(REF_DIR)/human_g1k_v37_chr20.fa
DBSNP_VCF = $(REF_DIR)/dbsnp.b130.ncbi37.chr20.vcf.gz
PLINK = $(REF_DIR)/hapmap_3.3.b37.chr20
If you are in the test/align directory, you can use this file as-is. If you are using a different index file, make sure your index file is named correctly in the first line.
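To confirm that the configuration points at the index file you intend, you can read the INDEX_FILE value back out of the file. The sketch below assumes the `KEY = value` layout shown above; `index_file_from_conf` is a hypothetical helper name, and the usage comment assumes a relative path is resolved against the current directory, which the tutorial implies but does not state.

```shell
# Hypothetical helper: print the value of the first INDEX_FILE assignment
# in a configuration file, with surrounding whitespace removed.
index_file_from_conf() {
    # $1: path to the configuration file, e.g. test.conf
    sed -n '/^[[:space:]]*INDEX_FILE[[:space:]]*=/{
        s/^[^=]*=[[:space:]]*//
        s/[[:space:]]*$//
        p
        q
    }' "$1"
}

# Example usage:
#   idx=$(index_file_from_conf test.conf)
#   [ -f "$idx" ] || echo "index file '$idx' not found" >&2
```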
Running the alignment pipeline
You are now ready to run the alignment pipeline. This requires two steps: first, generating the Makefiles; and second, running those Makefiles.
Generating the Makefiles
Enter the following command:
{ROOT_DIR}/bin/gen_biopipeline.pl --conf test.conf --out_dir {OUT_DIR}
where {ROOT_DIR} is the root directory of your GotCloud installation, and {OUT_DIR} is the directory in which you wish to store the resulting BAM files.
If everything went well, you will see the following messages:
Finished creating makefile {OUTDIR}/Makefiles/biopipe_Sample2.Makefile
Finished creating makefile {OUTDIR}/Makefiles/biopipe_Sample1.Makefile
--------------------------------------------------------------------
Run the following commands:
make -f {OUTDIR}/Makefiles/biopipe_Sample2.Makefile > {OUTDIR}/Makefiles/biopipe_Sample2.Makefile.log
make -f {OUTDIR}/Makefiles/biopipe_Sample1.Makefile > {OUTDIR}/Makefiles/biopipe_Sample1.Makefile.log
where {OUTDIR} is replaced with the output directory ({OUT_DIR}) you entered above.
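If you would rather not copy the suggested commands by hand, they can be filtered out of the generator's messages. This is only a sketch: it assumes the messages are written to standard output and that each suggested command starts with `make -f`, as in the messages above; `extract_make_commands` is a hypothetical helper name.

```shell
# Hypothetical helper: read the generator's messages on standard input and
# print only the suggested "make -f ..." command lines, without leading whitespace.
extract_make_commands() {
    sed -n 's/^[[:space:]]*\(make -f .*\)$/\1/p'
}

# Example usage:
#   {ROOT_DIR}/bin/gen_biopipeline.pl --conf test.conf --out_dir {OUT_DIR} \
#       | extract_make_commands > run_alignment.sh
```

Review the resulting file before running it, since it is executed exactly as printed.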
Running the Makefiles
To run the Makefiles, enter the commands generated in the previous step one at a time. Each run writes its log file to the {OUT_DIR}/Makefiles directory, and the BAM files are written to the {OUT_DIR}/alignment.recal directory.
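With more than a couple of samples, entering each command by hand becomes tedious. The loop below is a minimal sketch of running every generated Makefile in turn and stopping at the first failure. It assumes the Makefiles follow the `biopipe_*.Makefile` naming seen in the generator's messages; `run_makefiles` is a hypothetical helper name, and the `MAKE` variable is an optional override introduced here for illustration.

```shell
# Hypothetical helper: run each generated Makefile in sequence, writing a log
# next to it, and stop at the first one that fails.
run_makefiles() {
    # $1: the output directory that was passed as --out_dir
    for mk in "$1"/Makefiles/biopipe_*.Makefile; do
        # An unmatched glob stays literal, so check that the file really exists.
        [ -e "$mk" ] || { echo "no Makefiles found in $1/Makefiles" >&2; return 1; }
        echo "running $mk"
        "${MAKE:-make}" -f "$mk" > "$mk.log" || {
            echo "failed: see $mk.log" >&2
            return 1
        }
    done
}

# Example usage:
#   run_makefiles {OUT_DIR} && ls {OUT_DIR}/alignment.recal
```

Running the samples sequentially keeps the logs easy to follow; nothing in this sketch runs them in parallel.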