Broadly speaking, a compute cluster is a collection of computers (nodes). Our clusters are only accessible remotely. You will primarily connect to a computer that we call a login node. From this computer, you will be able to view your files and dispatch jobs to one or several other computers across the cluster that we call compute nodes. The tool we use to submit these jobs is called a job scheduler. You should NOT compile programs or run jobs on the login node. Detailed information about each of our clusters is available here.
Request an Account
The first step in gaining access to our clusters is requesting an account. There are several HPC clusters available at Yale. There is no charge for using these clusters. To understand which cluster is appropriate for you and to request an account, visit the account request page.
Hands-on Training
We offer several courses that will assist you with your work on our clusters. They range from orientation for absolute beginners to advanced topics on application-specific optimization. Please peruse the catalog here.
Rules of the Road
Before you begin using the cluster, here are some important guidelines:
- Do not run jobs or do real work on the login node. Always allocate a compute node and run programs there.
- Never give your password or ssh key to anyone else.
- Do not store any protected or regulated data (e.g., PHI) on the cluster.
- Clean up after yourself by releasing unused jobs and removing unneeded files.
- Use scratch and project space for large and/or numerous files, rather than using your home directory.
- Do your best to understand your program's requirements, especially its RAM and disk usage needs. Try not to overload the nodes' RAM or disk capabilities.
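For example, on a Slurm cluster you can allocate an interactive session on a compute node instead of working on the login node. The partition name, time limit, and memory request below are illustrative assumptions; substitute values appropriate for your cluster and job:

```shell
# Request an interactive shell on a compute node (never run real work on
# the login node). Partition, time, and memory values here are examples only.
srun --pty -p interactive -t 1:00:00 --mem=4G bash

# Once the prompt returns, you are on a compute node; run your programs there.
# Type `exit` when finished to release the node for other users.
```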
If you are uncertain about any of the above, please ask by emailing firstname.lastname@example.org.
All of Yale's clusters are accessed via a protocol called secure shell (ssh). You can use ssh directly, or via a graphical ssh tool. The details vary depending on the operating system of the computer in front of you. If you want to access the clusters from outside Yale, you must use the Yale VPN.
For specifics on ssh and how to connect to the clusters with your application and operating system of choice, please see our documentation on Accessing the Clusters.
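As a minimal sketch, connecting from a terminal looks like the following (the hostname is a placeholder; use your cluster's actual login address and your own NetID):

```shell
# Open an ssh session to a cluster login node.
# "netid" and the hostname below are illustrative -- replace them with
# your NetID and the login address from the cluster documentation.
ssh netid@cluster.hpc.yale.edu
```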
Move Your Files
You will likely find it necessary to copy files between your local machines and the clusters. Just as with logging in, there are different ways to do this, depending on your local operating system. See the documentation on transferring files for more information.
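From a Mac or Linux terminal, two common tools for this are `scp` and `rsync`. The filenames and hostname below are examples only:

```shell
# Copy a single local file to your home directory on the cluster
# (hostname and filename are illustrative).
scp results.csv netid@cluster.hpc.yale.edu:~/

# For large or repeated transfers, rsync is often preferable: it shows
# progress, can resume, and only re-sends files that have changed.
rsync -avP data/ netid@cluster.hpc.yale.edu:~/project/data/
```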
To best serve the diverse needs of all the software that a scientist needs in an HPC environment, we use a module system to manage software. This allows you to swap between different applications and versions of those applications with relative ease and focus on getting your work done, not compiling software. See the Modules documentation in our User Guide for more information. If you find software that you'd like to use that isn't available, feel free to drop us a line about it at email@example.com.
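A typical module workflow looks like this (the module name and version are illustrative; run `module avail` to see what is actually installed on your cluster):

```shell
# List the software available through the module system
module avail

# Load a specific application and version (name/version are examples)
module load Python/3.11.3

# Show what is currently loaded, and unload everything when done
module list
module purge
```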
Schedule a Job
On our clusters, you control your jobs using a job scheduling system that dedicates and manages compute resources for you. We currently have several job schedulers on our clusters, though we are transitioning to using just one. All are similar in spirit but differ slightly in detail. They are usually used in one of two ways. For testing and small jobs you may want to run a job interactively. This way you can directly interact with the compute node(s) in real time to make sure your jobs will behave as expected. The other way, which is the preferred way for large and long-running jobs, involves writing your job commands in a script and submitting that to the job scheduler. Please see below for cluster-specific documentation on each of these methods:
- Almost all of our clusters use Slurm.
- Omega still uses Torque.
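As a sketch of the batch approach, a minimal Slurm submission script might look like the following. The resource requests, module name, and script name are all assumptions; adjust them for your cluster and workload:

```shell
#!/bin/bash
#SBATCH --job-name=example       # name shown in the queue
#SBATCH --time=01:00:00          # wall-clock limit (HH:MM:SS)
#SBATCH --cpus-per-task=1        # CPU cores for this job
#SBATCH --mem=4G                 # memory for the job

# Load the software the job needs (module name is illustrative)
module load Python/3.11.3

# The commands below run on the allocated compute node, not the login node
python my_analysis.py
```

You would submit such a script with `sbatch my_job.sh` and check on it with `squeue`.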
New to Unix?
A basic familiarity with Unix commands is required for interacting with the clusters. There are many excellent beginner tutorials available for free online, including the following.
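As a taste of what those tutorials cover, a handful of core commands handle most day-to-day navigation (the directory and file names here are just examples):

```shell
pwd                        # print the directory you are currently in
ls -l                      # list the files in it, with sizes and permissions
mkdir analysis             # create a new directory
cd analysis                # move into it
echo "hello" > notes.txt   # create a small text file
cat notes.txt              # print its contents
```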