Another note about how I work with Slurm! Today, I comment on three strategies to run simulations for which the values taken by two arguments (or more) vary from one simulation to another.
The problem
Let’s assume that we run simulations by calling a Julia script myscript.jl that takes p1 as argument. To run a single simulation on one CPU, I would run the following bash command
    julia myscript.jl p1
where I would replace p1 by the right value. To run more than one simulation, say 3 for the sake of example, with Slurm I would write a short bash script and use a job array (see more details in this note), be it for a simple sequence, e.g. 1:3 (#SBATCH --array=1-3), or for any set of IDs, e.g. 1, 5, 6 (#SBATCH --array=1,5,6).
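As a minimal sketch of such a job-array script: each array task passes its own ID to myscript.jl as p1. The helper name run_simulation and the guard around the final call are my own additions for illustration, not something the script needs to be called.

```shell
#!/bin/bash
# Simple sequence of IDs; use --array=1,5,6 instead for an arbitrary set.
#SBATCH --array=1-3
# One CPU per simulation.
#SBATCH --cpus-per-task=1

# Hypothetical helper: run one simulation with the given value of p1.
run_simulation() {
    julia myscript.jl "$1"
}

# Slurm sets SLURM_ARRAY_TASK_ID for every task of the array.
# The guard keeps the script inert when it is run outside a job array.
if [ -n "${SLURM_ARRAY_TASK_ID:-}" ]; then
    run_simulation "$SLURM_ARRAY_TASK_ID"
fi
```

Submitted with sbatch, this would start three independent tasks, one per ID.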
Now let’s assume myscript.jl takes not one but two parameters, namely p1 and p2, so that the bash command becomes
    julia myscript.jl p1 p2
I further assume that p1 takes the values 1, 5, 6 and p2 takes the values 1:5 (i.e. 1, 2, 3, 4, 5), and that I am interested in running simulations for all combinations (15 simulations in total). There are various strategies to deal with this scenario and below I discuss three of them.
Using GNU parallel
One way is to work with GNU parallel [1], which can also be combined with Slurm job arrays (see this previous note). Basically, one needs to allocate the right number of CPUs and to use parallel properly. For the example described above, I could use a Slurm job array for p1 and parallel for p2 (or the other way around).
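Here is a minimal sketch of how this combination could look, assuming GNU parallel is available on the compute nodes. The helper name run_p2_sweep and the guard at the end are illustrative choices of mine.

```shell
#!/bin/bash
# One array task per value of p1.
#SBATCH --array=1,5,6
# Five CPUs per array task, one for each value of p2.
#SBATCH --cpus-per-task=5

# Hypothetical helper: $1 is p1; parallel substitutes each value of p2 for {}.
# --delay 2 waits 2 seconds between job starts, -j caps the concurrent jobs.
run_p2_sweep() {
    parallel --delay 2 -j "${SLURM_CPUS_PER_TASK:-5}" \
        julia myscript.jl "$1" {} ::: 1 2 3 4 5
}

# Only launch the sweep when running inside a Slurm job array.
if [ -n "${SLURM_ARRAY_TASK_ID:-}" ]; then
    run_p2_sweep "$SLURM_ARRAY_TASK_ID"
fi
```

Each of the three array tasks then runs its five simulations concurrently.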
In such a script, #SBATCH --cpus-per-task=5 allocates 5 CPUs to each array job (IDs 1, 5 and 6, so 15 CPUs in total). Also, --delay 2 is used to avoid a thundering herd problem (see the documentation). Note that parallel can handle any extra parameters: for instance, a third parameter that takes the values 10, 22, 51, 100 can be supplied as a second ::: input source, and parallel then runs every combination of the two sources.
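A sketch of the three-parameter case, under the same assumptions as before (run_p2_p3_sweep is a hypothetical name). With two ::: input sources, {1} and {2} refer to the value drawn from the first and second source, so each array task goes through 5 x 4 = 20 combinations, at most 5 at a time.

```shell
#!/bin/bash
#SBATCH --array=1,5,6
#SBATCH --cpus-per-task=5

# Hypothetical helper: $1 is p1, {1} is p2 and {2} is the third parameter.
run_p2_p3_sweep() {
    parallel --delay 2 -j "${SLURM_CPUS_PER_TASK:-5}" \
        julia myscript.jl "$1" {1} {2} ::: 1 2 3 4 5 ::: 10 22 51 100
}

# Only launch the sweep when running inside a Slurm job array.
if [ -n "${SLURM_ARRAY_TASK_ID:-}" ]; then
    run_p2_p3_sweep "$SLURM_ARRAY_TASK_ID"
fi
```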
Actually, one can stick to parallel and ignore Slurm job arrays, but it is slightly more complicated when one needs to deal with more than one node, although it is totally doable and well explained on Compute Canada’s Wiki.
Additional bash lines
A strategy I recently used is to write additional bash lines that derive two bash variables from $SLURM_ARRAY_TASK_ID. For the example we are discussing, I would declare a job array of 15 tasks and compute p1 and p2 from the task ID in the bash script.
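A minimal sketch of this idea, assuming task IDs 1 to 15 and p2 running from 1 to 5. The helper name task_params is hypothetical, and computing p2 with a remainder is my own choice.

```shell
#!/bin/bash
# 3 values of p1 times 5 values of p2 = 15 array tasks.
#SBATCH --array=1-15
#SBATCH --cpus-per-task=1

# Values taken by p1; p2 simply runs from 1 to 5.
vals=(1 5 6)

# Hypothetical helper: map a task ID in 1..15 to the pair "p1 p2".
task_params() {
    local id=$1
    # Integer division selects the block of 5 consecutive IDs,
    # which is the (0-based) index into vals.
    local p1="${vals[@]:$(( (id - 1) / 5 )):1}"
    # The remainder cycles through 1..5 within each block.
    local p2=$(( (id - 1) % 5 + 1 ))
    echo "$p1 $p2"
}

if [ -n "${SLURM_ARRAY_TASK_ID:-}" ]; then
    # Unquoted on purpose: word splitting yields the two arguments p1 and p2.
    julia myscript.jl $(task_params "$SLURM_ARRAY_TASK_ID")
fi
```

For example, task_params 6 prints "5 1" and task_params 15 prints "6 5".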
Note that this relies on one bash array, vals (holding the values of p1), and that ${vals[@]:((($SLURM_ARRAY_TASK_ID - 1) / 5)):1} is used to extract 1 element starting at index ($SLURM_ARRAY_TASK_ID - 1) / 5, computed with integer division (note that indices start at 0 for bash arrays). This may be very helpful for people who are already comfortable with bash scripts.
Stick to one parameter and deal with it directly in your script!
Last but not least, I can stick to one parameter and make it two, directly in my script. This is more a workaround than a solution to the problem described earlier, but it may very well be the easiest option in many cases, as it simply requires adding a few lines in the programming language the script was written in. Here I would add a few lines to myscript.jl that turn the single integer argument into p1 and p2, and then simply declare 15 jobs in the Slurm script (a job array covering 1 to 15). This is relatively easy and very efficient!
Note that when dealing with more than 5 parameters, the first two solutions might prove more tedious, whereas you may feel more comfortable dealing with it in a programming language you know better. For instance, it is likely that you would be able to create a function that generates the proper set of parameters based on a single parameter. Alternatively, you could create an external file (maybe with your favorite programming language) and then use $SLURM_ARRAY_TASK_ID to indicate the line of the file to be read for a given simulation.
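The external-file variant could be sketched as follows. The file name params.txt, its layout (one "p1 p2" pair per line) and the helper names write_params and read_params are all assumptions made for illustration.

```shell
#!/bin/bash
# One array task per line of the parameter file.
#SBATCH --array=1-15
#SBATCH --cpus-per-task=1

# Hypothetical file name; one "p1 p2" pair per line.
param_file=params.txt

# Write all 15 combinations to file $1, in the order the tasks read them.
write_params() {
    for p1 in 1 5 6; do
        for p2 in 1 2 3 4 5; do
            echo "$p1 $p2"
        done
    done > "$1"
}

# Print line number $2 of file $1.
read_params() {
    sed -n "${2}p" "$1"
}

if [ -n "${SLURM_ARRAY_TASK_ID:-}" ]; then
    # Unquoted on purpose: word splitting yields the two arguments p1 and p2.
    julia myscript.jl $(read_params "$param_file" "$SLURM_ARRAY_TASK_ID")
fi
```

The file is meant to be generated once, before submission (for instance by calling write_params params.txt from a login shell), so that the array tasks only ever read it.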
[1] Obviously, this should be installed first.