Week 4: Shell scripting & CLI tools

Author

Jelmer Poelstra

Published

March 22, 2024

Modified

March 27, 2024

1 Links

Lecture pages

Tuesday – Shell scripts.
Thursday – Running CLI tools with shell scripts.
Bonus (optional self-study) content: While loops, arrays, and more.

Exercises & assignments

Exercises for this week
Your final project proposal will be due on Monday of week 6 (April 8^th)

2 Content overview

This week, we will talk about shell scripting. So far, we have been running commands interactively in the shell, one line at a time. When you need to repeat a certain sequence of commands regularly, or run a bioinformatics program that may take a while, it becomes useful to put your shell commands in a script. Such a script can be easily and quickly (re-)executed, or submitted to a queue on a cluster (the latter is next week’s topic).

We will also start practicing with running programs/tools with a command-line interface (CLI), focusing on bioinformatics/genomics tools, and doing so inside shell scripts.

Some of the things you will learn this week:

Why it is useful to collect your commands into shell scripts that can be rerun easily.
The basics of shell scripts including hell script header lines.
Why and how to adorn scripts with tests and echo statements.
More on shell variables and how to use them.
Using command-line arguments with your own scripts.
if statements and true/false tests.
Running command-line programs (we focus on bioinformatics tools), and running them using shell scripts.

2.1 Readings

This week’s reading is Chapter 12 from the Buffalo book.

The latter part of this chapter is about using find, xargs, and Makefiles. These are somewhat tangential to the week’s topic of scripts, and we will not talk about them in class.

I would recommend to read that part of the chapter only if you want. As for Makefiles specifically, it will be good to understand the principle behind them, but there is no need to fully understand the syntax, since we will learn about Nextflow, an alternative approach to workflow management, later in the course.

Required readings

Buffalo Chapter 12: “Bioinformatics Shell Scripting, Writing Pipelines, and Parallelizing Tasks”