GPUs and CUDA

We currently have GPUs available for general use on Omega, Grace and Farnam. See cluster pages for hardware and queue/partition specifics.

Accessing the GPU Nodes

To access the GPU nodes you must request them from the scheduler. On Omega, you need to submit your job to the appropriate GPU queue. Here is a sample request for an interactive session on one node:

omega$ qsub -I -l nodes=1:ppn=16,mem=120gb,walltime=24:00:00 -q gputest 
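
For batch (non-interactive) work on Omega, the same resource request can go into a Torque/PBS submission script. The sketch below is illustrative only; the queue name matches the example above, while the job name, script filename (gpu_job.sh), and executable (my_gpu_program) are placeholders to adapt to your own job:

#!/bin/bash
#PBS -q gputest
#PBS -l nodes=1:ppn=16,mem=120gb,walltime=24:00:00
#PBS -N gpu_job

# Run from the directory the job was submitted from
cd $PBS_O_WORKDIR
./my_gpu_program    # replace with your actual GPU application

Submit it with:

omega$ qsub gpu_job.sh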

On all other clusters, you'll use Slurm to request GPU resources. See the cluster-specific documentation for more information about which partitions contain nodes with GPUs. An example Slurm command to request an interactive job on the gpu partition with X11 forwarding and half of a GPU node (10 cores and one K80) on Farnam would be:

srun --pty --x11 -p gpu -c 10 -t 24:00:00 --gres=gpu:2 --gres-flags=enforce-binding bash

The --gres=gpu:2 option requests two GPUs, and the --gres-flags=enforce-binding option ensures that both GPUs are on the same card and that the CPUs you are allocated are on the same bus as your GPUs.
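
The same request can also be submitted as a batch job with sbatch. A minimal sketch follows, assuming the gpu partition shown above; the job name, script filename (gpu_job.sh), and executable (my_gpu_program) are placeholders:

#!/bin/bash
#SBATCH -p gpu
#SBATCH -c 10
#SBATCH -t 24:00:00
#SBATCH --gres=gpu:2
#SBATCH --gres-flags=enforce-binding
#SBATCH --job-name=gpu_job

./my_gpu_program    # replace with your actual GPU application

Submit it with:

sbatch gpu_job.sh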

Please do not use nodes with GPUs unless your application or job can make use of them.

Once on a GPU node, you can check the available GPUs and their current usage with the command nvidia-smi.
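
For example, a query along the following lines summarizes each GPU's model, memory usage, and utilization (the exact output depends on your driver version):

nvidia-smi --query-gpu=name,memory.used,memory.total,utilization.gpu --format=csv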

Software

CUDA and cuDNN are available as modules where applicable; on your cluster of choice, the toolkit is installed on the GPU nodes. To see what is available, type:

omega$ modulefind cuda
elsewhere$ module avail cuda
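
Once you have identified a suitable module, load it and verify that the toolkit is on your path. The module names and versions below are only illustrative; substitute ones actually listed by the commands above on your cluster:

module load CUDA/9.0.176                # pick a version listed on your cluster
module load cuDNN/7.0.5-CUDA-9.0.176    # optional, if your application needs cuDNN
nvcc --version                          # confirm the CUDA compiler is available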

If you wish to request additional packages, please email hpc@yale.edu.