Line 85: |
Line 85: |
| * Subset of FASTQs - should map to chromosome 22 36000000-37000000 | | * Subset of FASTQs - should map to chromosome 22 36000000-37000000 |
| | | |
− | ls ${IN}/fastq/ | + | ls ${SS}/fastq/ |
| There are 24 fastq files: combination of single-end & paired-end. | | There are 24 fastq files: combination of single-end & paired-end. |
| | | |
Line 104: |
Line 104: |
| | | |
| Look at a couple of FASTQs: | | Look at a couple of FASTQs: |
− | less -S ${IN}/fastq/HG00551.SRR190851_1.fastq | + | less -S ${SS}/fastq/HG00551.SRR190851_1.fastq |
| <code>less</code> is a Linux command that allows you to look at a file. | | <code>less</code> is a Linux command that allows you to look at a file. |
| *<code>-S</code> option prevents line wrap | | *<code>-S</code> option prevents line wrap |
Line 124: |
Line 124: |
| | | |
| Look at the paired read: | | Look at the paired read: |
− | less -S ${IN}/fastq/HG00551.SRR190851_2.fastq | + | less -S ${SS}/fastq/HG00551.SRR190851_2.fastq |
| | | |
| Remember, use <code>'q'</code> to exit out of <code>less</code> | | Remember, use <code>'q'</code> to exit out of <code>less</code> |
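After paging through both mates with <code>less</code>, a quick sanity check is to confirm that the two files of a pair hold the same number of reads. The following is a minimal sketch, assuming standard unwrapped FASTQ where each record is exactly 4 lines; the toy files and read names are hypothetical stand-ins, not the tutorial's data.

```shell
# Hypothetical toy paired FASTQ files standing in for *_1.fastq and *_2.fastq.
tmpdir=$(mktemp -d)
printf '@r1/1\nACGT\n+\nIIII\n@r2/1\nTTGA\n+\nIIII\n' > "$tmpdir/toy_1.fastq"
printf '@r1/2\nCCGT\n+\nIIII\n@r2/2\nGGAA\n+\nIIII\n' > "$tmpdir/toy_2.fastq"

# Assumption: each record is 4 lines (header, bases, '+', qualities),
# so the number of reads is the line count divided by 4.
reads_1=$(( $(wc -l < "$tmpdir/toy_1.fastq") / 4 ))
reads_2=$(( $(wc -l < "$tmpdir/toy_2.fastq") / 4 ))
echo "mate 1: $reads_1 reads, mate 2: $reads_2 reads"

# Paired-end files should contain the same number of reads.
if [ "$reads_1" -eq "$reads_2" ]; then pairs_match=yes; else pairs_match=no; fi
echo "pairs match: $pairs_match"

rm -rf "$tmpdir"
```

To try this on the tutorial data, the toy paths could be swapped for the <code>_1.fastq</code> and <code>_2.fastq</code> files shown above.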
Line 157: |
Line 157: |
| | | |
| Take a look at the chromosome 22 reference files included for this tutorial: | | Take a look at the chromosome 22 reference files included for this tutorial: |
− | ls ${REF} | + | ls ${SS}/ref22 |
| | | |
| <ul> | | <ul> |
Line 169: |
Line 169: |
| | | |
| Let's read the reference FASTA file (all reference bases for the chromosome): | | Let's read the reference FASTA file (all reference bases for the chromosome): |
− | less ${REF}/human.g1k.v37.chr22.fa | + | less ${SS}/ref22/human.g1k.v37.chr22.fa |
| | | |
| Remember, use <code>'q'</code> to exit out of <code>less</code> | | Remember, use <code>'q'</code> to exit out of <code>less</code> |
Line 175: |
Line 175: |
| | | |
| If you want to access the FASTA file by position, you can use the <code>samtools faidx</code> command | | If you want to access the FASTA file by position, you can use the <code>samtools faidx</code> command |
− | $GC/bin/samtools faidx $REF/human.g1k.v37.chr22.fa 22:36000000 | less | + | ${GC}/bin/samtools faidx ${SS}/ref22/human.g1k.v37.chr22.fa 22:36000000 | less |
| or | | or |
− | $GC/bin/samtools faidx $REF/human.g1k.v37.chr22.fa 22:36000000-36000100 | + | ${GC}/bin/samtools faidx ${SS}/ref22/human.g1k.v37.chr22.fa 22:36000000-36000100 |
| | | |
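The two region forms above differ in extent: a region with only a start position runs to the end of the sequence, while <code>start-end</code> is 1-based and inclusive at both ends. As a rough sketch of what that means for the second command, the arithmetic below works out how many bases it should return. The variable names are illustrative only and the snippet does not call <code>samtools</code>.

```shell
# Region string in the same chr:start-end form used above.
region="22:36000000-36000100"

# Split into chromosome, start, and end using shell parameter expansion.
chrom=${region%%:*}    # text before the first ':'
range=${region#*:}     # text after the first ':'
start=${range%-*}      # text before the '-'
end=${range#*-}        # text after the '-'

# Coordinates are 1-based and inclusive, so length is end - start + 1.
length=$(( end - start + 1 ))
echo "$chrom: $length bases"
```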
| ; Where is the reference sequence? | | ; Where is the reference sequence? |
Line 187: |
Line 187: |
| <li>The ends of a chromosome are 'N' - unknown bases</li> | | <li>The ends of a chromosome are 'N' - unknown bases</li> |
| <li>Let's look at 5 lines of the file starting at line 300,000</li> | | <li>Let's look at 5 lines of the file starting at line 300,000</li> |
− | tail -n+300000 ${REF}/human.g1k.v37.chr22.fa |head -n 5 | + | tail -n+300000 ${SS}/ref22/human.g1k.v37.chr22.fa |head -n 5 |
| [[File:Fasta.png|500px]] | | [[File:Fasta.png|500px]] |
| </div> | | </div> |
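The <code>tail -n+N | head -n K</code> idiom above prints K lines starting at line N. Here is a small sketch of the same pattern on a throwaway file, so the behavior can be checked without the reference FASTA; the demo file and its contents are hypothetical.

```shell
# Build a small stand-in file with 10 numbered lines (hypothetical demo data).
demo=$(mktemp)
seq 1 10 | sed 's/^/line/' > "$demo"

# Print 3 lines starting at line 4: tail starts output at line 4,
# then head keeps only the first 3 of those lines.
window=$(tail -n +4 "$demo" | head -n 3)
echo "$window"

# Equivalent form using sed's line-range addressing (lines 4 through 6).
window_sed=$(sed -n '4,6p' "$demo")

rm -f "$demo"
```

The <code>sed</code> form names the last line explicitly instead of a count, which can be easier to read when the end line is what you know.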