In the Prenomics course and in a previous lesson, you learned how to use the bash shell to interact with your computer through a command line interface. In this
lesson and the next, you will learn more about applying this new knowledge to begin a common genomics workflow - identifying variants among sequencing samples
taken from multiple individuals within a population. In this lesson we will start with a set of sequenced reads (
.fastq files) and perform
some quality control steps. You will also learn about organising a genomics workflow and why metadata is an important consideration.
As you progress through this lesson, keep in mind that, even if you aren’t going to be doing this same workflow in your research, you will be learning some very important lessons about using command-line bioinformatic tools. What you learn here will enable you to use a variety of bioinformatic tools with confidence and greatly enhance your research efficiency and productivity.
This lesson assumes no prior experience with the tools covered in the course. However, learners are expected to have some familiarity with biological concepts, including the concept of genomic variation within a population, as well as some basic experience using a command line interface to navigate file systems.
For a beginner-level overview of the command line, see the Cloud-SPAN Prenomics pages. If you are unsure whether your skills/experience are sufficient, why not try our self-assessment quiz to test your knowledge?
This lesson is part of a course that uses data hosted on an Amazon Machine Instance (AMI). Course participants will be given information on how to log-in to the AMI during the course. Information on preparing for the course is provided on the Cloud-SPAN Genomics setup page.