KevCaz's Website

In a previous note, I narrated my transition from Mammoth to Graham and showed how to submit a job with Slurm. As I'm currently using Julia for several projects, I'd like to report how I submitted my Julia script to the scheduler.

First of all, I needed to set up Julia for my account. As Graham runs CentOS and as I had already loaded Julia v1.3.1 [1], I just had to update the version of Julia with module.

$ module load julia/1.4.0
$ module save
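As a quick sanity check (these commands are not from the original post, just a standard way to verify the setup), one can confirm that the new module is loaded and that the expected Julia binary is on the path:

```shell
# List the currently loaded modules; julia/1.4.0 should appear
module list

# Confirm which Julia version jobs will actually run
julia --version
```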

Then I installed all the packages required for my project (note that by default they are stored in ~/.julia/packages/). Next, I needed a small bash script to specify my needs to the scheduler and to start my script. Once I had checked the node characteristics on Graham, I wrote the following bash script:

#!/bin/bash
#SBATCH --account=myaccount
#SBATCH --time=6:00:00
#SBATCH --ntasks=1
#SBATCH --nodes=1
#SBATCH --mem-per-cpu=1024M
#SBATCH --cpus-per-task=32

julia -p 32 myscript.jl

and saved it. Briefly, the #SBATCH directives specify that I require 32 CPUs on 1 node, with 1GB per CPU, for a single task that will run for a maximum of 6 hours. The last line runs my Julia script myscript.jl (note that julia -p 32 makes up to 32 worker processes available to Julia). Also, if needed, you can easily pass arguments to the script, e.g. julia -p 32 myscript.jl arg1; they are collected in ARGS, which you can use in your script [2].
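For completeness, the script above is handed to the scheduler with sbatch; the filename submit.sh below is my own choice, since the post does not name the file:

```shell
# Submit the batch script; Slurm replies with the assigned job ID
sbatch submit.sh
```

Any arguments placed after myscript.jl on the julia line inside the script end up in the ARGS vector of strings, readable from within the Julia code.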

While my job was running, I looked for ways to monitor it. Compute Canada's wiki lists several helpful commands [3] to do so. I tried others, such as sstat, but it does not seem to report information pertaining to the job properly while the job is running [4]. So while the job is running, I use either squeue or scontrol, like so:

$ squeue -u <username>
$ scontrol show job -dd <job_ID>

Once my job was completed, I used seff <job_ID> [5] to scrutinize how the resources were actually used; it turned out I did not need that many CPUs, nor that much memory!
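The same post-mortem can also be done with sacct; the --format fields below are standard Slurm accounting fields that I picked to surface run time and memory usage:

```shell
# Summarize CPU and memory efficiency of a completed job
seff <job_ID>

# Or pull selected fields from the accounting database
sacct -j <job_ID> --format=JobID,Elapsed,MaxRSS,TotalCPU,State
```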

That’s all folks!

  1. Saved modules are listed in ~/.lmod.d/default

  2. see

  3. see also for a set of convenient Slurm commands.

  4. and unfortunately I cannot ssh to the node to use something like htop

  5. sacct -j <job_ID> is also a good option.