Changes

From Genome Analysis Wiki
1,483 bytes added, 11:51, 12 January 2016
Line 1: Line 1:  
__TOC__
   −
==Running Gotcloud on Flux==
+
==How to run Gotcloud on Flux==
    
===First, Configure GotCloud like you would anywhere else===
Line 16: Line 16:     
5. Run gotcloud with zero jobs to generate a Makefile.
: <code>/path/to/gotcloud/gotcloud snpcall --conf /path/to/configuration.conf --numjobs 0</code>
+
: <code style="background:#f0f0f0">/path/to/gotcloud/gotcloud snpcall --conf /path/to/configuration.conf --numjobs 0</code>
 
:* The newly generated Makefile will be located in the directory <code>OUT_DIR</code> that is specified in your configuration file.  It will be named <code>umake.snpcall.Makefile</code>.
   Line 22: Line 22:     
7. Make an email address to send your jobs' status to.  It'll be hit by hundreds or thousands of emails, so I recommend that you don't use your main email address here.
:* If you're in a hurry to finish your pipeline, you can find an email address that will text the emails to your phone.  Only use that in the second script, though!
+
:* If you're in a hurry to finish your pipeline, you can find an email address that will text the emails to your phone.  Only use that in the second script, though, so that you don't receive thousands of text messages!
   −
8. Figure out the name of the Flux account that you're going to use.  You can see which Flux accounts you have access to by running <code>mdiag -u $USER</code> and looking at the list after <code>ALIST</code>.
+
8. Figure out the name of the Flux account that you're going to use.  You can see which Flux accounts you have access to by running <code style="background:#eee;white-space:nowrap">mdiag -u $USER</code> and looking at the list after <code>ALIST</code>.
 
:* Eg, <code>sph_flux</code>
   −
9. Figure out how many processors you're going to use at once.  Run <code>mdiag -a YOUR_FLUX_ACCOUNT</code>.  I recommend running <code>MAXPROC</code> + <code>MAXIJOB[USER]</code> jobs at once.  <code>MAXPROC</code> is the number of processors on your account, and <code>MAXIJOB[USER]</code> is the number of jobs that can sit idle in the queue waiting to be run (often 20).
+
9. Figure out how many processors you're going to use at once.  Run <code style="background:#eee;white-space:nowrap">mdiag -a YOUR_FLUX_ACCOUNT</code>.  I recommend running <code>MAXPROC</code> + <code>MAXIJOB[USER]</code> jobs at once.  <code>MAXPROC</code> is the number of processors on your account, and <code>MAXIJOB[USER]</code> is the number of jobs that can sit idle in the queue waiting to be run (often 20).
 
:* This number will usually be between 20 and 1000.
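The arithmetic in step 9 can be sketched in shell. This is only an illustration: <code>num_jobs</code> is a hypothetical helper, and the values 100 and 20 are placeholders, so read the real <code>MAXPROC</code> and <code>MAXIJOB[USER]</code> from your own <code>mdiag -a</code> output.

```shell
#!/bin/sh
# Hypothetical helper (not part of GotCloud or Flux): add the two limits
# reported by mdiag to get the number of jobs to run at once.
num_jobs() {
    maxproc=$1   # MAXPROC: processors available on the account
    maxijob=$2   # MAXIJOB[USER]: jobs allowed to sit idle in the queue
    echo $((maxproc + maxijob))
}

# Placeholder values; substitute the numbers from your own account.
num_jobs 100 20
```

The result is the job count to plug into the scripts from the later steps.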
    
10. Figure out which steps to run first.  The steps go in the order glfN, vcfN, pvcfN, filtN, svmN, splitN, allN where N is the name of a chromosome (ie, 1-22 and maybe X and Y).  If you skip a step, it's not a problem, because <code>make</code> will run it for you.  If you're confident that everything will work beautifully, you can go straight to the step <code>allN</code> (or just <code>all</code> as a shortcut).
:* For example, I used <code>glf1 glf2 glf3 glf4 glf5 glf6 glf7 glf8 glf9 glf10 glf11 glf12 glf13 glf14 glf15 glf16 glf17 glf18 glf19 glf20 glf21 glf22 </code> the first time I ran on Flux.  Then I ran <code>vcf1 vcf2...</code>, and on down the list until finally <code>all</code>.
+
:* For example, I used <code style="background:#eee;white-space:nowrap">glf1 glf2 glf3 glf4 glf5 glf6 glf7 glf8 glf9 glf10 glf11 glf12 glf13 glf14 glf15 glf16 glf17 glf18 glf19 glf20 glf21 glf22 </code> the first time I ran on Flux.  Then I ran <code>vcf1 vcf2...</code>, and on down the list until finally <code>all</code>.
:* Feel free to use the script <code>perl -e 'print "glf$_ " for 1..22'</code> to mitigate repetitive strain injuries.
+
:* Feel free to use the script <code style="background:#eee;white-space:nowrap">perl -e 'print "glf$_ " for 1..22'</code> to mitigate repetitive strain injuries.
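If you would rather stay in the shell than use Perl, the same target list can be built with a small function. This is a sketch, assuming you only want chromosomes 1-22 (no X or Y); <code>make_targets</code> is a hypothetical name, not something GotCloud provides.

```shell
#!/bin/sh
# Hypothetical helper: print the Makefile targets for one pipeline step
# (glf, vcf, pvcf, filt, svm, split, all) across chromosomes 1-22.
make_targets() {
    step=$1
    targets=""
    chr=1
    while [ "$chr" -le 22 ]; do
        targets="$targets$step$chr "
        chr=$((chr + 1))
    done
    # Drop the trailing space before printing the list on one line.
    printf '%s\n' "${targets% }"
}

# Prints glf1 through glf22, separated by spaces.
make_targets glf
```

Calling it again with <code>vcf</code>, <code>pvcf</code>, and so on gives the target list for each later step.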
    
11. Inside that new folder, make a new file named <code>pbs.options</code> that contains the following:
Line 65: Line 65:  
===Finally, submit your jobs===
   −
13. Run <code>qsub script_thats_in_charge.sh</code>.  It's important that you run this in the same folder where <code>pbs.options</code> lives.
+
13. Run <code style="background:#eee;white-space:nowrap">qsub script_thats_in_charge.sh</code>.  It's important that you run this in the same folder where <code>pbs.options</code> lives.
    
14. Once that finishes, if any steps remain, then update YOUR_MAKEFILE_TARGETS_FROM_STEP_10 and go back to step 13.
 +
 +
==How to monitor your jobs on Flux==
 +
To see a summary of the states of your jobs, run <code style="background:#eee;white-space:nowrap">showq -u $USER -s</code>.
 +
* <code>active</code> means that a job is currently running.  This is good.
 +
** To see more information about active jobs, run <code style="background:#eee;white-space:nowrap">showq -n -v -u $USER -r</code>
 +
* <code>eligible</code> means that the scheduler will start a job in a few minutes if you're not already using all of your processors.
 +
** To see more information about eligible jobs, run <code style="background:#eee;white-space:nowrap">showq -n -v -u $USER -i</code>
 +
* <code>blocked</code> is usually a bad thing.  It might mean that you have too many jobs waiting to run, and so the scheduler has blocked some.  Or it can mean that you broke some rule, and they'll never work.
 +
** To see more information about blocked jobs, run <code style="background:#eee;white-space:nowrap">showq -n -v -u $USER -b</code>
 +
 +
To see more information about a particular job, copy its JOB_ID (eg, <code>17682208/17682208.nyx.arc-ts.umich.edu</code>).  Then run <code style="background:#eee;white-space:nowrap">checkjob JOB_ID</code>.
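The three detail commands above differ only in their last flag, so a small lookup can save some typing. This is a rough sketch: <code>showq_detail_cmd</code> is a hypothetical helper that only prints the command for a given state and does not run it. The flags are the ones listed above (<code>-r</code>, <code>-i</code>, <code>-b</code>).

```shell
#!/bin/sh
# Hypothetical helper: print (not run) the detailed showq command for a
# job state, using the state-to-flag mapping described above.
showq_detail_cmd() {
    case "$1" in
        active)   flag=-r ;;
        eligible) flag=-i ;;
        blocked)  flag=-b ;;
        *) echo "unknown state: $1" >&2; return 1 ;;
    esac
    # Fall back to `id -un` if USER is not set in the environment.
    printf 'showq -n -v -u %s %s\n' "${USER:-$(id -un)}" "$flag"
}

# List the command for each state.
for state in active eligible blocked; do
    showq_detail_cmd "$state"
done
```

On Flux itself, you could paste the printed line into your terminal or wrap the call in <code>eval</code> to run it directly.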