The Nandadevi cluster has been partitioned into the following queues using the latest open-source version of the SLURM job scheduler. The details are given below.
SlNo | Queue Name | Configuration | Purpose |
1 | nandaq | 20 nodes, 28 cores per node, 560 cores total (23 TF) | For parallel jobs using MPI |
2 | nandaknlq | 4 nodes with Intel Xeon Phi 7230 processors, 64 cores and 64 GB RAM per node, 256 cores total (10 TF) | For OpenMP, many-core and parallel jobs |
3 | nandasq | 34 nodes, 16 cores per node, 544 cores with 64 GB memory (14 TF) | For multiple serial jobs and OpenMP applications |
4 | nandagpuq | 8 nodes with 16 Nvidia K20 GPUs | For CUDA and OpenACC applications |
5 | nandaitraq | 2 nodes, 40 cores | Used by the ITRA project members |
6 | nandaphenoq | 2 nodes, 40 cores | Used by the PHENO project members |
7 | nandaifcq | 3 nodes with 10 Nvidia K80 GPUs | Used by the IFC project members |
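A job is directed to one of these queues with the -p/--partition option of sbatch. A minimal sketch (the script name myjob.sh is only a placeholder):
sbatch -p nandasq myjob.sh      # submit a serial/OpenMP job to nandasq
sbatch -p nandagpuq myjob.sh    # submit a CUDA/OpenACC job to nandagpuq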
New sample SLURM files for submitting jobs to any one of the above queues. Download for MPI jobs, OpenMP jobs, serial jobs and check-pointing.
Make the necessary changes in these sample files and use them.
Updated SLURM jobscript (Feb 2021): launch the executable through srun.
Ex: srun /home/username/proj1/a.out >& out_$SLURM_JOBID
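A minimal sketch of such an updated jobscript is given below; the partition, node and task counts, module and executable path are placeholders taken from the examples on this page and should be adapted to your own job.
#!/bin/bash
#SBATCH -J testrun
#SBATCH -p nandaq                # choose a partition from the table above
#SBATCH -N 1                     # number of nodes
#SBATCH --ntasks-per-node=28     # MPI ranks per node
#SBATCH --mail-user=<username>@imsc.res.in
#SBATCH --mail-type=ALL
cd $SLURM_SUBMIT_DIR
module load intel/2017           # load the toolchain the code was built with
# srun launches the executable on the allocated tasks
srun /home/username/proj1/a.out >& out_$SLURM_JOBID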
OLD JOBSCRIPT
#!/bin/bash
#SBATCH -N 1
#SBATCH --ntasks-per-node=28
#SBATCH -J <testrun>
# Available Partition/Queue names are: 1) nandaq, 2) nandaknlq, 3) nandasq, 4) nandagpuq, 5) nandaphiq
#SBATCH -p <partitionname>
#SBATCH --export=ALL
#SBATCH --mail-user=<username>@imsc.res.in
#SBATCH --mail-type=ALL
cd $SLURM_SUBMIT_DIR
echo $SLURM_JOB_NODELIST > hostfile_$SLURM_JOBID
module load intel/2017
##OpenMP Case
## Make sure the --ntasks-per-node value and the OMP_NUM_THREADS value are the same,
## or simply reuse the SLURM_NTASKS environment variable as below
### Only one executable allowed
export OMP_NUM_THREADS=$SLURM_NTASKS
<your executable> >& out_$SLURM_JOBID
##MPI Case
### Only one executable allowed
#mpirun -np <npvalue> <your/executable/with/path> >& out_$SLURM_JOBID
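For the MPI case, the process count can also be taken directly from SLURM instead of being hard-coded; a sketch with a placeholder executable path:
#mpirun -np $SLURM_NTASKS /home/username/proj1/a.out >& out_$SLURM_JOBID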
Job Submission
Create a SLURM script using the above sample with suitable modifications. Submit the job using the following command:
sbatch slurm_script.sh
Once the job is accepted by SLURM, it will return a JOBID that can be used for finding the status of the job or deleting the job.
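Options given in the script can also be overridden on the command line at submission time; a sketch (the partition and node count here are only examples):
sbatch -p nandaq -N 2 slurm_script.sh
On success sbatch prints the job ID, e.g. Submitted batch job <jobid>.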
Job Status
The following command will display the status of a particular job:
squeue -j <jobid>
To display the status of all jobs, use squeue -a. The output shows the Job ID, Partition, Job Name, User, State (ST), Elapsed Time, Number of Nodes and the Node List (or the reason the job is still waiting).
The code under the column 'ST' gives the state of the job in the queue. The most common codes are given below:
PD - job is pending, waiting for resources or held.
R - job is running.
CG - job is completing after having run.
CD - job has completed.
S - job is suspended.
CA - job has been cancelled.
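A few other useful status commands; a sketch with placeholder username and job ID:
squeue -u <username>          # all jobs belonging to a user
squeue -p nandaq              # all jobs in a particular partition
scontrol show job <jobid>     # full details of a single job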
Job Deletion
The following command will delete a job from the queue
scancel <jobid>
PBS-style commands such as qstat, qsub and qdel also work.
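Other common forms of scancel; a sketch with placeholder values:
scancel -u <username>                   # cancel all of your own jobs
scancel --name=<jobname>                # cancel jobs by job name
scancel --state=PENDING -u <username>   # cancel only your pending jobs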