Data processing and analysis: Glossary

Key Points

Trimming and Filtering
  • The options you set for the command-line tools you use are important!

  • Data cleaning is an essential step in a genomics workflow.

Variant Calling Workflow
  • Bioinformatic command line tools are collections of commands that can be used to carry out bioinformatic analyses.

  • To use most powerful bioinformatic tools, you’ll need to use the command line.

  • There are many different file formats for storing genomics data. It’s important to understand what type of information is contained in each file, and how it was derived.

Automating a Variant Calling Workflow
  • We can combine multiple commands into a shell script to automate a workflow.

  • Use echo statements within your scripts to get an automated progress update.

Glossary

FIXME