Assessing read quality then trimming and filtering reads

In the Prenomics course and in a previous lesson, you learned how to use the bash shell to interact with your computer through a command line interface. In this lesson and the next, you will apply that knowledge to begin a common genomics workflow: identifying variants among sequencing samples taken from multiple individuals within a population. In this lesson we will start with a set of sequenced reads (.fastq files) and perform some quality control steps. You will also learn about organising a genomics workflow and why metadata is an important consideration.

As you progress through this lesson, keep in mind that, even if you aren’t going to use this same workflow in your research, the skills you practise here apply to command-line bioinformatic tools in general. What you learn will help you use a variety of bioinformatic tools with confidence and work more efficiently in your research.

Getting Started

This lesson assumes no prior experience with the tools covered in the course. However, learners are expected to have some familiarity with biological concepts, including the concept of genomic variation within a population, as well as some basic experience using a command line interface to navigate file systems.

For a beginner-level overview of the command line, see the Cloud-SPAN Prenomics pages. If you are unsure whether your skills/experience are sufficient, why not try our self-assessment quiz to test your knowledge?

This lesson is part of a course that uses data hosted on a cloud instance created from an Amazon Machine Image (AMI). Course participants will be given information on how to log in to their instance during the course. Information on preparing for the course is provided on the Cloud-SPAN Genomics setup page.

Schedule

00:00  1. Assessing Read Quality: How can I describe the quality of my data?
00:50  2. Trimming and Filtering: How can I get rid of sequence data that doesn’t meet my quality standards?
01:45  Finish

The actual schedule may vary slightly depending on the topics and exercises chosen by the instructor.