Tutorial: Low Pass Sequence Analysis Answers

From Genome Analysis Wiki


  • Q1. What is the base quality of the fifth nucleotide of the third read in the file HG00111.lowcoverage.chr20.smallregion_1.fastq.gz?

The third read in the file is:

@ERR020230.76497044/1
CTGTACTACTAAAGTAAAACTAGTTTTCCAATAGTTTGTTGCAGGATAAGCAGTTTTACTTTTGTTGACAATATGTGTATGAATTTACTTC
+
DFEEGFKIFKIKLKIJLMMIMKMJKKKIKLMKKLKLLLKKLKLMMJLLJMKMMJLKLLJNLLLIKLJMILKLJKLKKKKKMMMJJJIFJFA

The quality string is the fourth line of each read record. Its fifth character, "G", encodes the base quality of the fifth nucleotide. The decimal ASCII code of "G" is 71, so under the Phred+33 encoding the base quality of this nucleotide is 71 - 33 = 38.
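The same lookup can be scripted. The following is a minimal sketch, assuming Phred+33 encoding and unwrapped four-line FASTQ records; the helper names nth_fastq_record and phred33_quality are hypothetical and are not part of the tutorial's tools.

```python
import gzip
from itertools import islice


def nth_fastq_record(path, n):
    """Return (header, sequence, quality) of the n-th read (1-based).

    Assumes a gzipped FASTQ file with exactly four lines per record.
    """
    start = 4 * (n - 1)
    with gzip.open(path, "rt") as handle:
        lines = [line.rstrip("\n") for line in islice(handle, start, start + 4)]
    if len(lines) < 4:
        raise ValueError(f"file has fewer than {n} complete records")
    header, sequence, _separator, quality = lines
    return header, sequence, quality


def phred33_quality(quality_string, position):
    """Return the base quality at a 1-based position, assuming Phred+33."""
    return ord(quality_string[position - 1]) - 33


# Quality string of the third read, copied from the answer above.
quality = (
    "DFEEGFKIFKIKLKIJLMMIMKMJKKKIKLMKKLKLLLKKLKLMMJLLJMKMMJLKLLJNLLLIKLJMILKLJ"
    "KLKKKKKMMMJJJIFJFA"
)
print(phred33_quality(quality, 5))  # 38
```

To work from the file directly, nth_fastq_record("HG00111.lowcoverage.chr20.smallregion_1.fastq.gz", 3) would return the record shown above, and its third element can be passed to phred33_quality.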

  • Q2. What is the mean depth of sample HG00108? What is its mapping rate?

The mean depth is 4.60X and the mapping rate is 99.19%. Keep in mind, however, that these statistics are computed only over the 100 kb region included in our example dataset.