Slurm batch jobs at OSC
Week 5 - part III
Overview
Automated scheduling software allows hundreds of people with different requirements to access supercomputer compute nodes effectively and fairly. OSC uses Slurm (Simple Linux Utility for Resource Management) for this.
As you’ve learned, a reservation of resources on compute nodes is called a compute job. Here are the main ways to start a compute job at OSC:
- “Interactive Apps” — Run programs with GUIs (e.g. VS Code or RStudio) directly on the OnDemand website.
- Interactive shell jobs — Start an interactive shell on a compute node.
- Batch (non-interactive) jobs — Run a script on a compute node without ever going to that node yourself.
We’ve already worked a lot with the VS Code Interactive App, and the self-study material at the bottom of this page will cover interactive shell jobs. What we’ll focus on in this session are batch jobs.
Setting up
Let’s get set up by:
- Moving the garrigos_data dir one level up, out of week04 (since you’ll keep using this data this week and next week):

# You should be in /fs/ess/PAS2700/users/$USER
mv week04/garrigos_data .
ls

CSB garrigos_data week02 week03 week04
- Creating a dir for this week:

mkdir -p week05/class_slurm/scripts
cd week05/class_slurm
- Copying two scripts from last week which you’ll use again:

cp /fs/ess/PAS2700/users/$USER/week04/scripts/printname.sh scripts/
cp /fs/ess/PAS2700/users/$USER/week04/scripts/fastqc.sh scripts/
1 Basics of Slurm batch jobs
When you request a batch job, you ask the Slurm scheduler to run a script “out of sight” on a compute node. While that script will run on a compute node, you stay in your current shell on your current node, regardless of whether that is a login node or a compute node. Once submitted, a batch job will continue running even if you log off from OSC and shut down your computer.
1.1 The sbatch command
You can use Slurm’s sbatch command to submit a batch job. But first, recall from last week that you can directly run a Bash script as follows:
bash scripts/printname.sh Jane Doe
This script will print a first and a last name
First name: Jane
Last name: Doe
The above command ran the script on our current node. To instead submit the script to the Slurm queue, simply replace bash by sbatch:
sbatch scripts/printname.sh Jane Doe
srun: error: ERROR: Job invalid: Must specify account for job
srun: error: Unable to allocate resources: Unspecified error
However, as the above error message “Must specify account for job” tells us, you need to indicate which OSC Project (or as Slurm puts it, “account”) you want to use for this compute job. Use the --account= option to sbatch to do this:
sbatch --account=PAS2700 scripts/printname.sh Jane Doe
Submitted batch job 12431935
This output line means your job was successfully submitted (no further output will be printed to your screen — more about that below). The job has a unique identifier among all compute jobs by all users at OSC, and we can use this number to monitor and manage it. Each of us will therefore see a different job number pop up.
sbatch options and script arguments
As you perhaps noticed in the command above, we can use sbatch options and script arguments in one command like so:
sbatch [sbatch-options] myscript.sh [script-arguments]
But, depending on the details of the script itself, all combinations of using sbatch options and script arguments are possible:
sbatch scripts/printname.sh # No options/arguments for either
sbatch scripts/printname.sh Jane Doe # Script arguments but no sbatch option
sbatch --account=PAS2700 scripts/printname.sh # sbatch option but no script arguments
sbatch --account=PAS2700 scripts/printname.sh Jane Doe # Both sbatch option and script arguments
(Omitting the --account option is possible when we specify this option inside the script, as we’ll see below.)
1.2 Adding sbatch options in scripts
The --account= option is just one of many options you can use when reserving a compute job, but it is the only required one. Defaults exist for all other options, such as the amount of time (1 hour) and the number of cores (1 core).
Instead of specifying sbatch options on the command line when submitting the script, you can also add these options inside the script. This is a useful alternative because:
- You’ll often want to specify several options, which can lead to very long sbatch commands.
- It allows you to store a script’s typical Slurm options as part of the script, so you don’t have to remember them.
These options are added in the script using another type of special comment line akin to the shebang (#!/bin/bash) line, marked by #SBATCH. Just like the shebang line, the #SBATCH line(s) should be at the top of the script. Let’s add one such line to the printname.sh script, such that the first few lines read:
#!/bin/bash
#SBATCH --account=PAS2700
set -euo pipefail
So, the equivalent of adding --account=PAS2700 after sbatch on the command line is a line in your script that reads #SBATCH --account=PAS2700.
After adding this to the script, you are now able to run the sbatch command without options (which failed earlier):
sbatch scripts/printname.sh Jane Doe
Submitted batch job 12431942
After submitting a batch job, you immediately get your prompt back. The job will run outside of your immediate view, and you can continue doing other things in the shell while it does (or log off). This behavior allows you to submit many jobs at the same time, because you don’t have to wait for other jobs to finish, or even to start.
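As a sketch of what that enables, you could submit one batch job per FASTQ file in a single loop (this uses the fastqc.sh script and garrigos_data files from this course; treat the exact paths as an example):

# A sketch only: submit one FastQC batch job per R1 FASTQ file, all at once
for fastq_file in ../../garrigos_data/fastq/*_R1.fastq.gz; do
    sbatch --account=PAS2700 scripts/fastqc.sh "$fastq_file" results/fastqc
done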
sbatch option precedence!
Any sbatch option provided on the command line will override the equivalent option provided inside the script. This is sensible because it allows you to provide “defaults” inside the script, and change one or more of those when needed “on the go” on the command line.
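For example, a quick sketch (PAS0001 is a made-up project ID here, used purely to illustrate the override):

# [Example - don't run this: PAS0001 is a made-up project ID]
# The script contains '#SBATCH --account=PAS2700', but this command-line
# option takes precedence for this one submission
sbatch --account=PAS0001 scripts/printname.sh Jane Doe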
#SBATCH lines in non-Slurm contexts (Click to expand)
Because #SBATCH lines are special comment lines, they will simply be ignored (and will not throw any errors) when you run a script with such lines in other contexts: for example, when not running it as a batch job at OSC, or even when running it on a computer without Slurm installed.
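For instance, running the modified printname.sh directly with bash still works exactly as before, since bash treats the #SBATCH line as an ordinary comment:

bash scripts/printname.sh Jane Doe    # The '#SBATCH --account=...' line is simply ignored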
1.3 Where does the script’s output go?
Above, we saw that when you ran printname.sh directly with bash, its output was printed to the screen, whereas when you submitted it as a batch job, only Submitted batch job <job-number> was printed to screen. Where did your output go?
The output ended up in a file called slurm-<job-number>.out (e.g., slurm-12431942.out; since each job number is unique to a given job, each file has a different number). We will call this type of file a Slurm log file.
Any idea why we may not want batch job output printed to screen, even if it was possible? (Click for the answer)
The power of submitting batch jobs is that you can submit many at once — e.g. one per sample, running the same script. If the output from all those scripts ends up on your screen, things become a big mess, and you have no lasting record of what happened.

You should already have two of these Slurm log files if you ran all the above code:
ls
scripts slurm-12431935.out slurm-12431942.out
Let’s take a look at the contents of one of these:
# (Replace the number in the file name with whatever you got! - check with 'ls')
cat slurm-12431935.out
This script will print a first and a last name
First name: Jane
Last name: Doe
This file contains the script’s output that was printed to screen when we ran it with bash — nothing more or less.
Two types of output files
It’s important to recognize the distinction between two broad types of output a script may have:
1. Output that is printed to screen when you directly run a script (bash myscript.sh), and that ends up in the Slurm log file when you submit the script as a batch job. This includes output produced by echo statements, any errors that may occur, and logging output by any program that we run in the script.[1]
2. Output of commands inside the script that is redirected to a file, or that a program writes to an output file. This type of output will end up in the exact same files regardless of whether we run the script directly (with bash) or as a batch job (with sbatch).
Our script above only had the first type of output, but typical scripts have both, and we’ll see examples of this below.
Cleaning up the Slurm logs
When using batch jobs, your working dir can easily become a confusing mess of anonymous-looking Slurm log files. Two strategies help to prevent this:
- Changing the default Slurm log file name to include a one- or two-word description of the job/script (see below).
- Cleaning up your Slurm log files, by:
  - Removing them when no longer needed — as is e.g. appropriate for our current Slurm log files.
  - Moving them into a results dir, which is often appropriate after you’ve run a bioinformatics tool, since the Slurm log file may contain some info you’d like to keep. For example, we may move any Slurm log files for jobs that ran FastQC to a dir results/fastqc/logs (see the sketch below).
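For instance, a minimal sketch of that second strategy, using the example FastQC location mentioned above:

# A sketch of the 'move into a results dir' strategy (example paths only)
mkdir -p results/fastqc/logs
mv slurm*.out results/fastqc/logs/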
# In this case, we'll simply remove the Slurm log files
rm slurm*out
Batch jobs start in the directory that they were submitted from: that is, your working directory remains the same.
2 Monitoring batch jobs
When submitting batch jobs for your research, you’ll often have jobs that run for a while, and/or you’ll submit many jobs at once. In addition, longer-running jobs and jobs that ask for many cores sometimes remain queued for a while before they start. It’s therefore important to know how you can monitor your batch jobs.
2.1 A sleepy script for practice
We’ll use another short shell script to practice monitoring and managing batch jobs. First create a new file:
touch scripts/sleep.sh
Open the file in the VS Code editor and copy the following into it:
#!/bin/bash
#SBATCH --account=PAS2700
echo "I will sleep for 30 seconds" > sleep.txt
sleep 30s
echo "I'm awake! Done with script sleep.sh"
Exercise: Batch job output recap
Predict what would happen if you submit the sleep.sh script as a batch job using sbatch scripts/sleep.sh:
- How many output files will this batch job produce?
- What will be in each of those files?
- In which directory will the file(s) appear?
- In terms of output, what would have been different if we had run the script directly, using the command bash scripts/sleep.sh?
Then, test your predictions by running the script.
Click for the solutions
The job will produce 2 files:
- slurm-<job-number>.out: The Slurm log file, containing output normally printed to screen.
- sleep.txt: Containing output that was redirected to this file in the script.
Those files will contain the following:
- slurm-<job-number>.out: “I’m awake! Done with script sleep.sh”
- sleep.txt: “I will sleep for 30 seconds”
Both files will end up in your current working directory. Slurm log files always go to the directory from which you submitted the job. Slurm jobs also run from the directory from which you submitted your job, and since we redirected the output simply to sleep.txt, that file was created in our working directory.

If we had run the script directly, sleep.txt would have also been created with the same content, but “I’m awake! Done with script sleep.sh” would have been printed to screen instead of ending up in a Slurm log file.
Run the script and check the outputs:
sbatch scripts/sleep.sh
Submitted batch job 27935840
cat sleep.txt
I will sleep for 30 seconds
cat slurm-27935840.out
I'm awake! Done with script sleep.sh
2.2 Checking the job’s status
After you submit a job, it may initially be waiting to be allocated resources: i.e., it may be queued (“pending”). Then, the job will start running — you’ve seen all of this with the VS Code Interactive App job as well.
Whereas Interactive App jobs will keep running until they’ve reached the end of the allocated time[2], batch jobs will stop as soon as the script has finished. And if the script is still running when the job runs out of its allocated time, it will be killed (stopped) right away.
The squeue command
You can check the status of your batch job using the squeue Slurm command:
squeue -u $USER -l
Thu Apr 4 15:47:51 2023
JOBID PARTITION NAME USER STATE TIME TIME_LIMI NODES NODELIST(REASON)
23640814 condo-osu ondemand jelmer RUNNING 6:34 2:00:00 1 p0133
In the command above:
- You specify your username with the -u option (without this, you’d see everyone’s jobs!). In this example, I used the environment variable $USER to get your user name, just so that the very same code will work for everyone (you can also simply type your username if that’s shorter or easier).
- The option -l (a lowercase L, not the number 1) will produce more verbose (“long”) output.
In the output, after a line with the date and time, and a header line, you should see information about a single compute job, as shown above: this is the Interactive App job that runs VS Code. That’s not a “batch” job, but it is a compute job, and all compute jobs are listed.
The following pieces of information about each job are listed:
- JOBID — The job ID number
- PARTITION — The type of queue
- NAME — The name of the job
- USER — The user name of the user who submitted the job
- STATE — The job’s state, usually PENDING (queued) or RUNNING. Finished jobs do not appear on the list.
- TIME — For how long the job has been running (here as minutes:seconds)
- TIME_LIMIT — The amount of time you reserved for the job (here as hours:minutes:seconds)
- NODES — The number of nodes reserved for the job
- NODELIST(REASON) — When running: the ID of the node on which it is running. When pending: why it is pending.
squeue example
Now, let’s see a batch job in the squeue listing. Start by submitting the sleep.sh script as a batch job:
sbatch scripts/sleep.sh
Submitted batch job 12431945
If you’re quick enough, you may be able to catch the STATE as PENDING before the job starts:
squeue -u $USER -l
Thu Apr 4 15:48:26 2023
JOBID PARTITION NAME USER STATE TIME TIME_LIMI NODES NODELIST(REASON)
12520046 serial-40 sleep.sh jelmer PENDING 0:00 1:00:00 1 (None)
23640814 condo-osu ondemand jelmer RUNNING 7:12 2:00:00 1 p0133
But soon enough it should say RUNNING in the STATE column:
squeue -u $USER -l
Thu Apr 4 15:48:39 2023
JOBID PARTITION NAME USER STATE TIME TIME_LIMI NODES NODELIST(REASON)
12520046 condo-osu sleep.sh jelmer RUNNING 0:12 1:00:00 1 p0133
23640814 condo-osu ondemand jelmer RUNNING 8:15 2:00:00 1 p0133
The script should finish after 30 seconds (because your command was sleep 30s), after which the job will immediately disappear from the squeue listing, because only pending and running jobs are shown:
squeue -u $USER -l
Mon Aug 21 15:49:26 2023
JOBID PARTITION NAME USER STATE TIME TIME_LIMI NODES NODELIST(REASON)
23640814 condo-osu ondemand jelmer RUNNING 9:02 2:00:00 1 p0133
Checking the output files
Whenever you’re running a script as a batch job, even if you’ve been monitoring it with squeue, you should also make sure it ran successfully. You typically do so by checking the expected output file(s). As mentioned above, you’ll usually have two types of output from a batch job:
- File(s) directly created by the command inside the script (here, sleep.sh).
- A Slurm log file with the script’s standard output and standard error (i.e. output that is normally printed to screen).
And you saw in the exercise above that this was also the case for the output of our sleepy script:
cat sleep.txt
I will sleep for 30 seconds
cat slurm-12520046.out
I'm awake! Done with script sleep.sh
Let’s keep things tidy and remove the sleepy script outputs:
# (Replace the number in the file name with whatever you got! - check with 'ls')
rm slurm*.out sleep.txt
Text will be added to the Slurm log file in real time as the running script (or the program run by the script) outputs it. However, the output that commands like cat and less print is static.
Therefore, if you find yourself opening/printing the contents of the Slurm log file again and again to keep track of progress, then instead use tail -f, which will “follow” the file and will print new text as it’s added to the Slurm log file:
# See the last lines of the file, with new contents added in real time
tail -f slurm-12520046.out
To exit the tail -f livestream, press Ctrl+C.
2.3 Cancelling jobs
Sometimes, you want to cancel one or more jobs, because you realize you made a mistake in the script, or because you used the wrong input files as arguments. You can do so using scancel:
# [Example - DON'T run this: the second line would cancel your VS Code job]
scancel 2979968 # Cancel job number 2979968
scancel -u $USER # Cancel all your running and queued jobs (careful with this!)
Use squeue’s -t option to restrict the type of jobs you want to show. For example, to only show running and not pending jobs:

squeue -u $USER -t RUNNING
You can see more details about any running or finished job, including the amount of time it ran for:
scontrol show job <jobID>
UserId=jelmer(33227) GroupId=PAS0471(3773) MCS_label=N/A
Priority=200005206 Nice=0 Account=pas2700 QOS=pitzer-default
JobState=RUNNING Reason=None Dependency=(null)
Requeue=1 Restarts=0 BatchFlag=1 Reboot=0 ExitCode=0:0
RunTime=00:02:00 TimeLimit=01:00:00 TimeMin=N/A
SubmitTime=2020-12-14T14:32:44 EligibleTime=2020-12-14T14:32:44
AccrueTime=2020-12-14T14:32:44
StartTime=2020-12-14T14:32:47 EndTime=2020-12-14T15:32:47 Deadline=N/A
SuspendTime=None SecsPreSuspend=0 LastSchedEval=2020-12-14T14:32:47
Partition=serial-40core AllocNode:Sid=pitzer-login01:57954
[...]
Update directives for a job that has already been submitted (this can only be done before the job has started running):
scontrol update job=<jobID> timeLimit=5:00:00
Hold and release a pending (queued) job, e.g. when needing to update an input file before it starts running:
scontrol hold <jobID>      # Job won't start running until released
scontrol release <jobID>   # Job is free to start
3 Common Slurm options
Here, we’ll go through the most commonly used Slurm options. As pointed out above, each of these can either be:
- Passed on the command line: sbatch --account=PAS2700 myscript.sh (has precedence over the next)
- Added at the top of the script you’re submitting: #SBATCH --account=PAS2700
Also, note that many Slurm options have a corresponding long (--account=PAS2700) and short format (-A PAS2700). For clarity, we’ll stick to long format options here.
3.1 --account: The OSC project
As seen above: when submitting a batch job, always specify the OSC project (“account”).
3.2 --time: Time limit (“wall time”)
Use the --time option to specify the maximum amount of time your job will run for:
Your job will be killed (stopped) as soon as it hits the specified time limit!
Compare “wall time” with “core hours”: if a job runs for 2 hours and uses 8 cores, the wall time is 2 hours and the number of core hours is 2 x 8 = 16.
The default time limit is 1 hour. Acceptable time formats include:
- minutes (e.g. 60 => 60 minutes)
- hours:minutes:seconds (e.g. 1:00:00 => 60 minutes)
- days-hours (e.g. 2-12 => two-and-a-half days)
- For single-node jobs, up to 168 hours (7 days) can be requested. If that’s not enough, you can request access to the longserial queue for jobs of up to 336 hours (14 days).
- OSC bills you for the time your job actually used, not what you reserved. But jobs asking for more time may be queued longer before they start.
An example, asking for 2 hours in the “minute-format”:
#!/bin/bash
#SBATCH --time=120
Or for 12 hours in the “hour-format”:
#!/bin/bash
#SBATCH --time=12:00:00
It is common to be uncertain about how much time your job will take (i.e., how long it will take for your script to finish). Whenever this happens, ask for more, perhaps much more, time than what you think/guesstimate you will need. It is really annoying to have a job run out of time after several hours, while the increase in queueing time for jobs asking for more time is often quite minimal at OSC.
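One way to calibrate such guesses is Slurm’s sacct command, which can report how long a finished job actually ran compared to its time limit (a sketch; replace the job ID with one of your own):

# Compare a finished job's actual run time ('Elapsed') with its reserved time ('Timelimit')
sacct -j 12431942 --format=JobID,JobName,Elapsed,Timelimit,State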
Exercise: exceed the time limit
Modify the sleep.sh script to reserve only 1 minute for the job while making the script run for longer than that.
If you succeed in exceeding the time limit, an error message will be printed. Where do you think this error message will be printed: to the screen, in the Slurm log file, or in sleep.txt? After waiting for the job to be killed after 60 seconds, check if you were correct and what the error message is.
Click for the solution
This script would do the trick, where we request 1 minute of wall-time while we let the script sleep for 80 seconds:
#!/bin/bash
#SBATCH --account=PAS2700
#SBATCH --time=1
echo "I will sleep for 80 seconds" > sleep.txt
sleep 80s
echo "I'm awake! Done with script sleep.sh"
Submit it as usual:
sbatch scripts/sleep.sh
Submitted batch job 23641567
This would result in the following type of error, which will be printed in the Slurm log file:
slurmstepd: error: *** JOB 23641567 ON p0133 CANCELLED AT 2024-04-04T14:55:24 DUE TO TIME LIMIT ***
3.3 Cores (& nodes and tasks)
There are several options to specify the number of nodes (≈ computers), cores, or “tasks” (processes). These are separate but related options, and this is where things can get confusing! Some background:
- Note that Slurm mostly uses the terms “core” and “CPU” interchangeably[3]. More generally with bioinformatics tools, “thread” is also commonly used interchangeably with core/CPU[4]. Therefore, as also mentioned in the session on OSC, for our purposes, you can think of core, CPU, and thread as synonyms that refer to the sub-parts/components of a node that you can reserve and use separately.
- Running a program with multiple threads/cores/CPUs (“multi-threading”) is very common, and this can make the running time of such programs much shorter. While the specifics depend on the program, using 8-12 cores is often a sweet spot, whereas asking for even more cores can lead to rapidly diminishing returns.
- Running multiple processes (tasks) or needing multiple nodes in a single batch job is not common.
In practice, my recommendations are to basically always:
- Specify the number of threads/cores/CPUs to Slurm with --cpus-per-task=n (the short notation is -c).
- Keep the number of tasks and nodes to their defaults of 1 (in which case the above -c option specifies the number of cores, period).
- Tell the program that you’re running about the number of available cores — most bioinformatics tools have an option like --cores or --threads. You should set this to the same value n as the --cpus-per-task.
An example, where we ask for 8 CPUs/cores/threads:
#!/bin/bash
#SBATCH --cpus-per-task=8
# And we tell a fictional program about that number of cores:
cool_program.py --cores 8 sampleA_R1.fastq.gz
- You can specify the number of nodes with --nodes and the number of tasks with --ntasks and/or --ntasks-per-node; all have defaults of 1 (see the table below). Only ask for more than one node when a program is parallelized with e.g. “MPI”, which is rare in bioinformatics.
- For jobs with multiple processes (tasks), you can use --ntasks=n or --ntasks-per-node=n — this is also quite rare! However, note that in practice, specifying the number of tasks n with one of these options is equivalent to using --cpus-per-task=n, in the sense that both ask for n cores that can subsequently be used by a program in your script. Therefore, some people use tasks as opposed to cpus for multi-threading, and you can see this usage in the OSC documentation too. Yes, this is confusing!
Here is an overview of the options related to cores, tasks, and nodes:
Resource/use | short | long | default |
---|---|---|---|
Nr. of cores/CPUs/threads (per task) | -c 1 | --cpus-per-task=1 | 1 |
Nr. of “tasks” (processes) | -n 1 | --ntasks=1 | 1 |
Nr. of tasks per node | - | --ntasks-per-node=1 | 1 |
Nr. of nodes | -N 1 | --nodes=1 | 1 |
3.4 --mem: RAM memory
Use the --mem option to specify the maximum amount of RAM (Random Access Memory) that your job can use:
- Each core on a node has 4 GB of memory “on it”, and therefore, the default amount of memory you will get is 4 GB per reserved core. For example, if you specify --cpus-per-task=4, you will have 16 GB of memory. And the default number of cores is 1, so the default amount of memory is 4 GB.
- Because it is common to ask for multiple cores, and because memory scales with the number of cores as just described, you will usually end up having enough memory automatically — therefore, it is common to omit the --mem option.
- The default --mem unit is MB (MegaBytes); append G for GB (i.e. 100 means 100 MB, 10G means 10 GB).
- Like with the time limit, your job gets killed by Slurm when it hits the memory limit.
- The maximum amount of memory you can request on regular Pitzer compute nodes is 177 GB (and 117 GB on Owens). If you need more than that, you will need one of the specialized largemem or hugemem nodes — switching to such a node can happen automatically based on your requested amount, though with caveats: see this OSC page for details on Pitzer, and this page for details on Owens.
For example, to request 20 GB of RAM:
#!/bin/bash
#SBATCH --mem=20G
Whereas you get a very clear Slurm error message when you hit the time limit (as seen in the exercise above), hitting the memory limit can result in a variety of errors.
But look for keywords such as “Killed”, “Out of Memory” / “OOM”, and “Core Dumped”, as well as actual “dumped cores” in your working dir (large files with names like core.<number>; these can be deleted).
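If you do find such core.<number> files, they are safe to remove; for example (a sketch, since the exact file names will vary):

# Remove any dumped core files left behind after an out-of-memory error
rm core.*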
Exercise: Adjusting cores and memory
Think about submitting a shell script that runs a bioinformatics tool like FastQC as a batch job, in the following two scenarios:
- The program has an option --threads, and you want to set that to 8. The program also says you’ll need 25 GB of memory. What #SBATCH options related to this will you use?
Click for the solution
You should only need the following, since this will give you 8 * 4 = 32 GB of memory. There is no point in “downgrading” the amount of memory.
#SBATCH --cpus-per-task=8
- The program has an option --cores, and you want to set that to 12. The program also says you’ll need 60 GB of memory. What #SBATCH options will you use?
Click for the solution
Here, it will make sense to ask for --mem separately, since 12 cores would only give you 12 * 4 = 48 GB of memory by default.
#SBATCH --cpus-per-task=12
#SBATCH --mem=60G
3.5 --output: Slurm log files
As we saw above, by default, all output from a script that would normally be printed to screen will end up in a Slurm log file when we submit the script as a batch job. This file will be created in the directory from which you submitted the script, and will be called slurm-<job-number>.out, e.g. slurm-12431942.out.
But it is possible to change the name of this file. For instance, it can be useful to include the name of the bioinformatics program that the script runs, so that it’s easier to recognize this file later. We can do this with the --output option, e.g. --output=slurm-fastqc.out if we were running FastQC.
But you’ll generally want to keep the batch job number in the file name too[5]. Since we won’t know the batch job number in advance, we need a trick here — and that is to use %j, which represents the batch job number:
#!/bin/bash
#SBATCH --output=slurm-fastqc-%j.out
stdout and stderr, and separating them (Click to expand)
By default, two output streams from commands and programs, called “standard output” (stdout) and “standard error” (stderr), are printed to screen. Without discussing this in detail, we have seen this several times: any regular output by a command is stdout and any error messages we’ve seen were stderr. Both of these streams by default also end up in the same Slurm log file, but it is possible to separate them into different files.
Because stderr, as you might have guessed, often contains error messages, it could be useful to have those in a separate file. You can make that happen with the --error option, e.g. --error=slurm-fastqc-%j.err.
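For example, a script header that separates the two streams might look like this (using the FastQC-style names from above):

#!/bin/bash
#SBATCH --output=slurm-fastqc-%j.out
#SBATCH --error=slurm-fastqc-%j.err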
However, reality is more messy: some programs print their main output not to a file but to standard out, and their logging output, errors and regular messages alike, to standard error. Yet other programs use stdout or stderr for all messages.
I therefore usually only specify --output, such that both streams end up in that file.
3.6 --mail-type: Receive emails
You can use the --mail-type option to have Slurm email you, for example, when a job begins, completes, or fails. You don’t have to specify your email address: you’ll be automatically emailed at the email address that is linked to your OSC account. I tend to use:
- FAIL for shorter-running jobs (roughly up to a few hours). FAIL will email you upon job failure, e.g. when the script exits with an error or times out. This is especially useful when submitting many jobs with a loop: this way you know immediately whether any of the jobs failed.
- END and FAIL for longer-running jobs. This is helpful because you don’t want to have to keep checking in on jobs that run for many hours.
I would avoid having Slurm send you emails upon regular completion for shorter jobs, because you may get inundated with emails and then quickly start ignoring the emails altogether.
#!/bin/bash
#SBATCH --mail-type=END,FAIL
#!/bin/bash
#SBATCH --mail-type=FAIL
You may also find the values TIME_LIMIT_90, TIME_LIMIT_80, and TIME_LIMIT_50 useful for very long-running jobs: these will warn you when the job is at 90/80/50% of its time limit. Such a warning can be useful because, for example, it is possible to email OSC to ask for an extension on individual jobs. You shouldn’t do this often, but if you have a job that has run for 6 days and looks like it may time out, this may well be worth it.
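For instance, a sketch that combines a failure notification with an early warning at 90% of the time limit:

#!/bin/bash
#SBATCH --mail-type=FAIL,TIME_LIMIT_90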
Exercise: Submit your FastQC script as a batch job
Last week, we created a shell script to run FastQC, and ran it as follows:
fastq_file=../../garrigos_data/fastq/ERR10802863_R1.fastq.gz
bash scripts/fastqc.sh "$fastq_file" results/fastqc
- Add an #SBATCH line to the script to specify the course’s OSC project PAS2700, and submit the modified script as a batch job with the same arguments as above.
Solution (click here)
The top of your script should read as follows:
#!/bin/bash
#SBATCH --account=PAS2700
Submit the script as follows:
fastq_file=../../garrigos_data/fastq/ERR10802863_R1.fastq.gz
sbatch scripts/fastqc.sh "$fastq_file" results/fastqc
Submitted batch job 12431988
- Monitor your job with squeue.
- When it has finished, check the Slurm log file in your working dir and the main FastQC output files in results/fastqc.
- Bonus — add these #SBATCH options, then resubmit:
  - Let the Slurm log file include ‘fastqc’ in the file name as well as the job ID number.
  - Let Slurm email you both when the job completes normally and when it fails. Check that you received the email.
Solution (click here)
The top of your script should read as follows:
#!/bin/bash
#SBATCH --account=PAS2700
#SBATCH --output=slurm-fastqc-%j.out
#SBATCH --mail-type=END,FAIL
4 In closing: making sure your jobs ran successfully
Here are some summarizing notes on the overall strategy to monitor your batch jobs:
- To see whether your job(s) have started, check the queue (with squeue) or check for Slurm log files (with ls).
- Once the jobs are no longer listed in the queue, they will have finished: either successfully or because of an error.
- When you’ve submitted many jobs that run the same script for different samples/files:
  - Carefully read the full Slurm log file, and check other output files, for at least one of the jobs.
  - Check whether any jobs have failed: via email when using --mail-type=FAIL, or by checking the tail of each log for “Done with script” messages[6] (see the sketch below).
  - Check that you have the expected number of output files and that no files have size zero (run ls -lh).
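As a minimal sketch of those last two checks (assuming the slurm-fastqc-<job-number>.out log naming and the results/fastqc output dir from the exercises above):

# Check the end of each Slurm log for the marker line printed at the end of the script
tail -n 2 slurm-fastqc-*.out

# Check the number and sizes of the output files
ls -lh results/fastqc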
5 Self-study material
Slurm environment variables
Inside a shell script that will be submitted as a batch job, you can use a number of Slurm environment variables that will automatically be available, such as:
Variable | Corresponding option | Description |
---|---|---|
$SLURM_JOB_ID | N/A | Job ID assigned by Slurm |
$SLURM_JOB_NAME | --job-name | Job name |
$SLURM_CPUS_PER_TASK | -c / --cpus-per-task | Number of CPUs (~ cores/threads) available |
$SLURM_MEM_PER_NODE | --mem | Amount of memory available (per node) |
$TMPDIR | N/A | Path to the Compute storage available during the job |
$SLURM_SUBMIT_DIR | N/A | Path to the dir from which the job was submitted |
As an example of how these environment variables can be useful, the command below uses $SLURM_CPUS_PER_TASK in its call to the program STAR inside the script:
STAR --runThreadN "$SLURM_CPUS_PER_TASK" --genomeDir ...
With this strategy, you will automatically use the correct (requested) number of cores, and don’t risk having a mismatch. Also, if you need to change the number of cores, you’ll only have to modify it in one place: in the resource request to Slurm.
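Putting that together, a sketch of what the top of such a script could look like (STAR is the example program from above; its other options are omitted here):

#!/bin/bash
#SBATCH --account=PAS2700
#SBATCH --cpus-per-task=8

# The program is automatically told how many cores were requested above
STAR --runThreadN "$SLURM_CPUS_PER_TASK" --genomeDir ...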
5.1 Interactive shell jobs
Interactive shell jobs will grant you interactive shell access on a compute node. We’ve been working in a shell in VS Code Server, which means that we already have interactive shell access on a compute node!
However, we only have access to 1 core and 4 GB of memory in this VS Code shell, and there is no way of changing this. If you want an interactive shell job with more resources, you’ll have to start one with Slurm commands.
A couple of different commands can be used to start an interactive shell job. I prefer the general srun command[7], which we can use with --pty /bin/bash added to get an interactive Bash shell.
srun --account=PAS2700 --pty /bin/bash
srun: job 12431932 queued and waiting for resources
srun: job 12431932 has been allocated resources
[...regular login info, such as quota, not shown...]
[jelmer@p0133 PAS2700]$
There we go! First some Slurm scheduling info was printed to screen: initially, the job was queued, and then it was “allocated resources”: that is, computing resources such as a compute node were reserved for the job. After that:
- The job starts and, because we’ve reserved an interactive shell job, a new Bash shell is initiated: for that reason, we get to see our regular login info once again.
- We have now moved to the compute node at which our interactive job is running, so you should have a different p number in your prompt.
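To get an interactive shell job with more resources, you can add the same options that you would pass to sbatch; for example, a sketch asking for 4 cores for 2 hours:

srun --account=PAS2700 --cpus-per-task=4 --time=2:00:00 --pty /bin/bash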
5.2 Table with sbatch options
First, here are the options we discussed above:
Resource/use | short | long | default |
---|---|---|---|
Project to be billed | -A PAS2700 | --account=PAS2700 | N/A |
Time limit | -t 4:00:00 | --time=4:00:00 | 1:00:00 |
Nr of nodes | -N 1 | --nodes=1 | 1 |
Nr of cores | -c 1 | --cpus-per-task=1 | 1 |
Nr of “tasks” (processes) | -n 1 | --ntasks=1 | 1 |
Nr of tasks per node | - | --ntasks-per-node=1 | 1 |
Memory limit per node | - | --mem=4G | (4G) |
Log output file | -o | --output=slurm-fastqc-%j.out | |
Error output (stderr) | -e | --error=slurm-fastqc-%j.err | |
Get email when job starts, ends, fails, or all of the above | - | --mail-type=BEGIN / --mail-type=END / --mail-type=FAIL / --mail-type=ALL | |
And a couple of additional ones:
Resource/use | option |
---|---|
Job name (displayed in the queue) | --job-name=fastqc |
Partition (=queue type) | --partition=longserial --partition=hugemem |
Let job begin only after a specific time | --begin=2024-04-05T12:00:00 |
Let job begin only after another job is done | --dependency=afterany:123456 |
Footnotes
1. This type of output is referred to as standard output (non-error output) and standard error — see the box in the section on Slurm log files for more.
2. Unless you actively “Delete” the job on the OnDemand website.
3. Even though technically, one CPU often contains multiple cores.
4. Even though technically, one core often contains multiple threads.
5. For instance, we might be running the FastQC script multiple times, and otherwise those log files would all have the same name and be overwritten.
6. The combination of using strict Bash settings (set -euo pipefail) and printing a line that marks the end of the script (echo "Done with script") makes it easy to spot scripts that failed, because they won’t have that marker line at the end of the Slurm log file.
7. Other options: salloc works almost identically to srun, whereas sinteractive is an OSC convenience wrapper but with more limited options.