SLURM Multi Program Usage
The --multi-prog option of srun lets you run a different program, with different arguments, for each parallel task in your job.
Note on multi-threaded code (e.g. with MKL support)
If you have compiled your code against a numerical library (e.g. MKL), note that calls to that library may be multi-threaded by default, as MKL function calls are.
In that case, to prevent your "multi-prog" task-farming jobs (set up as directed below) from overwhelming each node, you must add the following to your submission script:
export OMP_NUM_THREADS=1
This should be put in the script before the "srun..." line.
Usage with sbatch
Create a submission script with the basic job details, for example job.sh:
#!/bin/sh
#SBATCH -n 16           # 16 cores
#SBATCH -t 1-03:00:00   # 1 day and 3 hours
#SBATCH -p compute      # partition name
#SBATCH -U chemistry    # your project name - contact Ops if unsure what this is
#SBATCH -J my_job_name  # sensible name for the job
srun --multi-prog test.config

The file test.config contains the parameters required by the multi-prog option.
Configuration File
The configuration file contains three fields, separated by blanks. These fields are:
- Task number
- Executable File
- Argument
The argument field may contain the following expressions, which srun expands for each task:
- %t - The task number of the responsible task
- %o - The task offset (task's relative position in the task range).
0-3 hostname
4,5 echo task:%t
6 echo task:%t-%o
7 echo task:%o
8-15 hostname
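To check a configuration file before submitting it, the expansion of %t and %o can be previewed without Slurm. The following is only a rough sketch: preview_multiprog is a hypothetical helper, not part of Slurm, and it assumes the task field contains only plain numbers, a-b ranges and comma-separated lists.

```shell
#!/bin/sh
# Hypothetical helper (not a Slurm command): print, for every task, the
# command line that a --multi-prog configuration file describes.
# Assumption: task fields are plain numbers, a-b ranges or comma lists.
preview_multiprog() {
    # $1: path to the configuration file
    while read -r tasks cmd; do
        # skip blank lines and comment lines
        case "$tasks" in ''|'#'*) continue ;; esac
        offset=0    # position of the task within this line's task list
        for spec in $(printf '%s' "$tasks" | tr ',' ' '); do
            case "$spec" in
                *-*) first=${spec%-*}; last=${spec#*-} ;;
                *)   first=$spec; last=$spec ;;
            esac
            t=$first
            while [ "$t" -le "$last" ]; do
                # substitute %t (task number) and %o (task offset)
                printf '%s: %s\n' "$t" \
                    "$(printf '%s' "$cmd" | sed -e "s/%t/$t/g" -e "s/%o/$offset/g")"
                t=$((t + 1))
                offset=$((offset + 1))
            done
        done
    done < "$1"
}
```

Running preview_multiprog test.config on the example above would print one line per task, for instance "6: echo task:6-0" for task 6.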
Note re PATH to executable
Please note that if you are using a custom executable, you should supply the full path to the file.
For example:
0 /home/users/jbloggs/bin/my_bin input1
1 /home/users/jbloggs/bin/my_bin input2
...
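For a larger task farm, a configuration of this shape can be generated rather than typed by hand. The sketch below is one possible approach: make_multiprog_config is a hypothetical helper that takes the executable followed by one argument per task and numbers the tasks from 0.

```shell
#!/bin/sh
# Hypothetical helper (not a Slurm command): print one --multi-prog line
# per input argument, numbering tasks from 0.
make_multiprog_config() {
    exe=$1
    shift
    # warn on stderr when the executable is not given as a full path
    case "$exe" in
        /*) ;;
        *) echo "warning: $exe is not a full path" >&2 ;;
    esac
    task=0
    for input in "$@"; do
        printf '%s %s %s\n' "$task" "$exe" "$input"
        task=$((task + 1))
    done
}
```

For example, make_multiprog_config /home/users/jbloggs/bin/my_bin input1 input2 > test.config writes the first two lines shown above. The task count requested with -n should then match the number of lines generated.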
Submission of the job.sh script
Submit it as a normal sbatch job file:
[neil@lonsdale01 ~]$ sbatch job.sh
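The output will be something like the following. The task-number prefix on each line is what srun prints when it is run with the -l (--label) option: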
4: task:4
5: task:5
6: task:6-0
7: task:0
2: node128
8: node129
0: node128
1: node128
3: node128
10: node129
9: node129
11: node129
12: node129
15: node129
13: node129
14: node129
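Because the tasks run in parallel, the lines arrive in no particular order. When the lines carry a task-number prefix as above, they can be put back in task order afterwards. A minimal sketch, where sort_by_task is a hypothetical wrapper around the standard sort command:

```shell
#!/bin/sh
# Hypothetical wrapper: sort "task: text" lines numerically by the
# task-number prefix. Reads the named files, or standard input if none.
sort_by_task() {
    sort -t: -k1,1n "$@"
}
```

It could be applied to the job's output file, for example sort_by_task slurm-9238.out, assuming the default sbatch output file name and using the job number purely as an illustration.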
Usage with srun
srun -n 8 -p <partition> -U <project> -t 00:10:00 --multi-prog test.config

The file 'test.config' contains the parameters required by the multi-prog option.
Configuration File
The configuration file contains three fields, separated by blanks. These fields are:
- Task number
- Executable File
- Argument
The argument field may contain the following expressions, which srun expands for each task:
- %t - The task number of the responsible task
- %o - The task offset (task's relative position in the task range).
0-3 hostname
4,5 echo task:%t
6 echo task:%t-%o
7 echo task:%o
Output
[neil@lonsdale01 ~]$ srun -l -n 8 -p compute -t 00:10:00 --multi-prog test.config
srun: Job is in held state, pending scheduler release
srun: job 9238 queued and waiting for resources
srun: job 9238 has been allocated resources
4: task:4
5: task:5
6: task:6-0
7: task:0
2: node128
0: node128
1: node128
3: node128
[neil@lonsdale01 ~]$