Batch Jobs on HPC1

The Torque Resource Manager and the Moab Workload Manager are used for submitting and scheduling batch
jobs, respectively.

Batch Commands
Batch Queues
Torque and Moab Documentation
Back to HPC1 Documentation

Batch Commands:

qsub [flags] [pbs_script] - Submit PBS batch script named pbs_script

qstat [flags] - Show status of PBS batch jobs

qdel [flags] jobid - Delete PBS job; use qstat to find the jobid and provide only the number

qstat -Q - List the batch queues on HPC1

qmgr -c 'p s' - Print batch queue and server attributes

pbsnodes -a|more - See PBS status and other info about hpc1 nodes; the state field indicates whether a node is free (available), job-exclusive (in use by a job), or offline.

There is a man page for each of these on hpc1.

Examples:

[hpc1]$ qsub -q phidev batch.job - Submit batch job to queue phidev
2988.hpc1.csc.bnl.local

[hpc1]$ qstat -f 2988.hpc1.csc.bnl.local
Job_Name = mpitest
Job_Owner = slatest@hpc1.csc.bnl.local
(much more info including e.g.:
exec_host = node15/0+node15/1+node15/2)

[hpc1]$ qdel 2988 - Delete batch job

[hpc1]$ qstat --version - Show the Torque version
 Version 4.2.6.1
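
For reference, here is a minimal sketch of what a batch script such as batch.job might contain. The job name mpitest matches the example output above; the resource requests and the mpirun launch line are assumptions and should be adjusted to your application and MPI environment.

  #!/bin/bash
  #PBS -N mpitest             # job name (shows up in qstat output)
  #PBS -q batch               # queue to submit to (batch is the current default)
  #PBS -l nodes=1:ppn=16      # one node, all 16 cores
  #PBS -l walltime=00:30:00   # wall clock limit (hh:mm:ss)
  #PBS -j oe                  # merge stdout and stderr into one file

  cd $PBS_O_WORKDIR           # run from the directory the job was submitted from
  mpirun -np 16 ./mpitest     # launch the MPI executable (launcher name assumed)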


Batch Queues

The default batch queue (used if you don't specify the -q flag on qsub), and currently the only one, is named batch.
Four additional queues, listed below, are planned for the near future.
The current queue will then be mapped into one of these four batch queues, and this page will be updated to reflect that.

The following queues are not yet available, but are planned:

gpudev  - For running development batch jobs on the GPU

gpuprod - For running production batch jobs on the GPU

phidev  - For running development batch jobs on the Intel PHI co-processor

phiprod - For running production batch jobs on the Intel PHI co-processor

The "dev" queues above will have a maximum wall clock time of 30 minutes, and a maximum 5 jobs per user
simultaneously.
The "prod" queues above will have a wall clock range of 30 minutes 1 second (00:30:01) minimum to 6 hours
maximum, with a maximum of 2 jobs per user simultaneously. The maximum number of processing elements (PE's
i.e. cores) is 64, which means 4 nodes since there are 16 cores per node.
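
For example, a production job asking for the full 64-core (4-node) allowance and the 6-hour maximum could be submitted with a resource request along these lines (standard Torque -l syntax; the queue and script names are placeholders):

[hpc1]$ qsub -q phiprod -l nodes=4:ppn=16,walltime=06:00:00 batch.job

The same nodes=...:ppn=... and walltime=... requests can instead be placed in the batch script itself on #PBS -l lines.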

node01, node02, ..., node08 can be used for GPU jobs, and node09, node10, node12, ..., node16 for Intel PHI jobs.
Note that node11 is not available; it is reserved for CSC Code Center staff.

Jobs cannot span across GPU and Intel PHI nodes. They must run on one or the other.
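
Assuming the GPU and Intel PHI nodes are tagged with Torque node properties (check the properties field in the pbsnodes -a output for the actual names; gpu below is only a hypothetical example), a job could be restricted to one node type with a request such as:

[hpc1]$ qsub -q gpuprod -l nodes=2:ppn=16:gpu batch.job

Alternatively, specific hosts can be named explicitly, e.g. -l nodes=node01:ppn=16+node02:ppn=16.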

The command to see the batch queues defined on the system is:
[hpc1]$ qstat -Q

The output from the following command includes information about the attributes of each queue:
[hpc1]$ qmgr -c 'p s'
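
To see the attributes of a single queue rather than the whole server, qmgr can also print just that queue, e.g. for the current batch queue:
[hpc1]$ qmgr -c 'print queue batch'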


Torque and Moab Documentation:

The Torque Resource Manager Administrator Guide and the Moab Workload Manager Administrator Guide are available at Adaptive Computing.


Back to HPC1 Documentation