Grace

Note: Between now and the next regularly scheduled maintenance window for the Grace cluster (June 12-14, 2017), we will be upgrading the cluster operating system to RHEL 7.2 and deploying the Slurm job scheduler on the cluster. The grace-next cluster has been created to support this transition. Visit the grace-next page for more information.

About

(Image: Grace Hopper and UNIVAC)

Grace is a shared-use resource for the Faculty of Arts and Sciences (FAS). The cluster is named for the computer scientist and United States Navy Rear Admiral Grace Murray Hopper, who received her Ph.D. in Mathematics from Yale in 1934.

Logging in

If you are a first-time user, make sure to read the pertinent sections of our user guide about using SSH. Once you have submitted a copy of your public key to us, you should be able to ssh to grace.hpc.yale.edu. As with the other Yale clusters, there are two login nodes; you will be placed on one of them at random.
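For example, once your public key is in place, you can connect from a terminal as follows (netid is simply a placeholder for your Yale NetID):

ssh netid@grace.hpc.yale.edu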

Queues and Scheduler

Grace uses IBM's Platform LSF for workload management and job scheduling. Note: All nodes on Grace are shared, meaning multiple jobs may run on the same node if each job requests fewer than the total number of cores on that node. A sample job submission is sketched after the queue table below.

Name          Max cores/job   Max walltime/job   Notes
interactive   4               6 hours            compiling/debugging/testing programs
shared        640             24 hours           32-node limit per job
week          100             7 days             5-node limit per job
long          20              28 days            1-node limit per job
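As a minimal sketch (the script name, queue, core count, and walltime below are illustrative choices, not requirements), a batch job can be described in a script containing #BSUB directives:

#BSUB -q shared
#BSUB -n 20
#BSUB -W 24:00
#BSUB -J myjob
#BSUB -o myjob.%J.out
./my_program

Submit the script with bsub < myjob.sh. For interactive work, a command along the lines of bsub -Is -q interactive -n 4 bash requests a shell on a compute node in the interactive queue.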

Software

Grace uses the modules system for managing software and its dependencies. See our documentation on modules here.
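For example, the following commands list the available modules, load one, and show what is currently loaded (<module name> is a placeholder; run module avail to see what is actually installed on Grace):

module avail
module load <module name>
module list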

Compute Hardware

Node Type                  Processor                            Speed      Cores   RAM
IBM NeXtScale nx360 M4     Intel Xeon E5-2660 V2 (Ivy Bridge)   2.20 GHz   20      128 GB
Lenovo NeXtScale nx360 M5  Intel Xeon E5-2660 V3 (Haswell)      2.50 GHz   20      128 GB

Storage

File System: 2 PB of GPFS storage via FDR InfiniBand

By default, each group has a 300 GB quota on the home partition and a 1 TB quota on the project partition. A group's usage can be monitored with the groupquota.sh script, available at:

/gpfs/apps/bin/groupquota.sh

Each PI group is provided with storage space for research data on the HPC clusters. The storage is separated into three tiers: home, project, and temporary scratch.

Home

Home storage is designed for reliability, rather than performance. Do not use this space for routine computation. Use this space to store your scripts, notes, etc. Home storage is backed up daily.

Project

In general, project storage is intended to be the primary storage location for HPC research data in active use. Project storage is not backed up.

60-Day Scratch (scratch60)

This temporary storage should typically give you the best performance. Files older than 60 days are deleted automatically. This space is not backed up, and if it fills up you may be asked to delete files that are less than 60 days old.
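As a rough way to preview what is at risk of being purged, you can list files in your scratch directory that have not been modified in the last 60 days (the path below is a placeholder for your group's scratch60 directory, and the actual purge policy may key on a different timestamp):

find /path/to/your/scratch60 -type f -mtime +60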

Other Storage Options

If you or your group find that these quotas do not accommodate your needs, contact us at hpc@yale.edu.

You can also mount Storage@Yale (S@Y), a service offered by Yale ITS to University members. Note that an S@Y share mounted on a cluster cannot also be mounted elsewhere. To request that S@Y be mounted on the clusters, fill out our S@Y Request Form.