Grace

About

[Image: Grace Hopper and UNIVAC]

Grace is a shared-use resource for the Faculty of Arts and Sciences (FAS). The cluster is named for the computer scientist and United States Navy Rear Admiral Grace Murray Hopper, who received her Ph.D. in Mathematics from Yale in 1934.

Migration from Omega to Grace

If you are moving to Grace due to the upcoming Omega decommission, there are a few things to know. The key differences between Omega and Grace are:

  • Grace uses the Slurm scheduler, so you will need to translate your submission scripts accordingly (see the example after this list).
  • All the queues (called "partitions" in Slurm) on Grace are "shared". This means that unless you request exclusive node access when submitting jobs, multiple jobs from different users may run on a single node. It also means that jobs that don't use shared-memory parallelism will often start sooner, since they can be placed on idle cores scattered across a large number of nodes.
  • If you used SimpleQueue on Omega, please look at our documentation for the improved Dead Simple Queue for Slurm.
  • Grace has a "scavenge" queue that allows you to run on unused private resources if the public queues are very oversubscribed. See below for details.
  • In addition to your "home" and "scratch" (called "project" on Grace) storage spaces, there is an additional storage space called "scratch60". See below for details.
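If your Omega scripts used TORQUE/Moab-style #PBS directives (an assumption; adjust to whatever your scripts actually contain), a minimal Slurm translation looks roughly like the sketch below. The job name, resource values, and program name are placeholders.

    #!/bin/bash
    # Slurm replacement for a simple PBS/Torque script, e.g.:
    #   #PBS -N myjob; #PBS -q <queue>; #PBS -l nodes=1:ppn=4,walltime=2:00:00
    #SBATCH --job-name=myjob
    #SBATCH --partition=day
    #SBATCH --nodes=1
    #SBATCH --cpus-per-task=4
    #SBATCH --time=02:00:00
    #SBATCH --mem-per-cpu=5G

    # load whatever modules your job needs, then run it
    module load <your modules>
    ./my_program

Submit the script with sbatch instead of qsub (e.g. sbatch myjob.sh) and check its status with squeue -u $USER.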

Cleaning Out Omega Data

All Omega files are now stored solely on the Loomis GPFS system. For groups that have migrated their workloads entirely to Grace or Farnam, their Omega data is now available from Grace and Farnam for copying and clean-up until December 2018. See Cleaning Out Omega Data for instructions on retrieving your data.

Logging in

If you are a first-time user, make sure to read the pertinent links from our user guide about using ssh. Once you have submitted a copy of your public key to us, you should be able to ssh to grace.hpc.yale.edu. As with the other Yale clusters, there are two login nodes, and you will be placed on one of them at random.
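For example, once your public key is on file, you can connect with the command below (replace netid with your own login name):

    ssh netid@grace.hpc.yale.edu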

Partitions

Grace uses the Slurm job scheduler. Unless users request exclusive node access when submitting jobs, multiple jobs from different users may run on a single node. To facilitate this, the scheduler will strictly enforce memory limits to ensure that all jobs have access to the memory requested for them. To see more details about how jobs are scheduled see our Job Scheduling documentation.
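If a job really does need whole nodes to itself, Slurm's --exclusive option requests exclusive node access; this is a minimal sketch, with arbitrary partition and time values:

    # request that no other jobs share the allocated node(s)
    #SBATCH --exclusive
    #SBATCH --partition=day
    #SBATCH --time=4:00:00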

All partitions on Grace have a default walltime limit of 1 hour. Use the -t HH:MM:SS flag to request additional time, up to the limits listed below. Similarly, partitions have a default memory allocation of 5GB per requested core. If you run into insufficient-memory errors, use the --mem-per-cpu flag to increase your job's memory limit. See our Slurm documentation for more details on requesting computing resources and submitting jobs.
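For example, a job that needs more than the defaults might include directives like these (the values are placeholders; stay within the partition limits below):

    # request 12 hours instead of the 1 hour default walltime
    #SBATCH -t 12:00:00
    # request 10GB per core instead of the 5GB default
    #SBATCH --mem-per-cpu=10G
    # the same options also work on the command line, e.g.:
    #   sbatch -t 12:00:00 --mem-per-cpu=10G myjob.sh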

Common Partitions

Node types: E5-2660V2 (c01-04, c06-08), E5-2660V3 (c05, c08-22), E5-2660V4 (c23-37), E7-4820V4 (bigmem)

| name           | max resources* (user / group**) | max walltime/job | nodes                                    |
|----------------|---------------------------------|------------------|------------------------------------------|
| interactive*** | 4 c                             | 6 hours          | 2                                        |
| day            | 640 c / 900 c                   | 24 hours         | 57 E5-2660V2, 34 E5-2660V3, 72 E5-2660V4 |
| week           | 100 c / 250 c                   | 7 days           | 48 + 6                                   |
| gpu            | 6 n                             | 24 hours         | 6 (2xK80), 6 (1xP100)                    |
| bigmem         | 40 c                            | 24 hours         | 2 E7-4820V4 (bigmem)                     |
| scavenge       | 6400 c                          | 24 hours         | all                                      |

* "c" = cores or core equivalents (5GB of memory = 1 cores), "n" = nodes

** if your group hits this limit, you will see jobs pending with the reason "MaxCpuPerAccount"

*** interactive jobs; jobs to compile/debug/test programs; etc. (limited to one batch or interactive job at a time per user)

See the "Hardware" section below for the flag to request specific node types.

Dedicated Partitions

Node types: E5-2660V2 (c01-04, c06-08), E5-2660V3 (c05, c08-22), E5-2660V4 (c23-36), E7-4820V4 (bigmem)

| name                | max cores/user | max walltime/job | nodes          |
|---------------------|----------------|------------------|----------------|
| pi_altonji          |                | 28 days          | 2              |
| pi_anticevic        |                | 100 days         | 17 + 20        |
| pi_anticevic_bigmem |                | 100 days         | 1              |
| pi_anticevic_fs     |                | 100 days         | 3              |
| pi_anticevic_gpu    |                | 100 days         | 8 (2xK80)      |
| pi_balou            |                | 28 days          | 30             |
| pi_berry            |                | 28 days          | 1              |
| pi_cowles           | 120            | 28 days          | 14             |
| pi_cowles_nopreempt | 120            | 28 days          | 10             |
| pi_gelernter        |                | 28 days          | 1              |
| pi_gerstein         |                | 28 days          | 32 + 1         |
| pi_hammes_schiffer  |                | 28 days          | 4              |
| pi_holland          |                | 28 days          | 2              |
| pi_jetz             |                | 28 days          | 2              |
| pi_kaminski         |                | 28 days          | 8              |
| pi_legewie          |                | 28 days          | 1              |
| pi_mak              |                | 28 days          | 3              |
| pi_manohar          |                | 180 days         | 8 (1xP100) + 2 |
| pi_ohern            |                | 28 days          | 16 + 3         |
| pi_owen_miller      |                | 28 days          | 5              |
| pi_poland           |                | 28 days          | 10             |
| pi_tsmith           |                | 28 days          | 1              |

Scavenge Partition

A scavenge partition is available on Grace. It allows you to run jobs outside of your normal fairshare restriction and makes use of any unutilized cores that may be available in any partition on the cluster. However, any job running in the scavenge partition is subject to preemption if a node in use by the job is required for a job in that node's normal partition. This means your job may be killed without advance notice, so you should only run jobs in the scavenge partition that either have good checkpoint capabilities or can be restarted with minimal loss of progress. Jobs with long startup times, or jobs that run a long time between checkpoints, are therefore poor fits for the scavenge partition.
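A sketch of a scavenge submission for a restartable job is shown below. The --requeue option asks Slurm to put the job back in the queue if it is preempted (assuming preemption is configured to requeue rather than cancel jobs), and the checkpoint/resume behavior is up to your own program; the program name and its flag here are hypothetical.

    #SBATCH --partition=scavenge
    #SBATCH --time=24:00:00
    # if preempted, return the job to the queue instead of abandoning it
    #SBATCH --requeue

    # the application itself must be able to resume from its last checkpoint
    ./my_program --resume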

Software

Grace uses the modules system for managing software and its dependencies. See our documentation on modules here.
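Typical module commands look like the following; the module name used here is only an illustration, so run module avail to see what is actually installed on Grace.

    # list the software modules available on the cluster
    module avail
    # load a module into your environment (example name only)
    module load Python
    # show what is currently loaded
    module list
    # unload everything for a clean environment
    module purge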

Compute Hardware

| Node Type                 | Processor (--constraint tag*)     | Speed   | Cores | Available RAM |
|---------------------------|-----------------------------------|---------|-------|---------------|
| IBM NeXtScale nx360 M4    | Intel Xeon E5-2660 V2 (ivybridge) | 2.20GHz | 20    | 120GB         |
| Lenovo NeXtScale nx360 M5 | Intel Xeon E5-2660 V3 (haswell)   | 2.60GHz | 20    | 120GB         |
| Lenovo NeXtScale nx360 M5 | Intel Xeon E5-2660 V4 (broadwell) | 2.00GHz | 28    | 250GB         |

*For more info on how to use constraint, please see the Slurm Documentation.
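For example, using the constraint tags from the table above, you can restrict a job to one node type either in the batch script or at submission time (myjob.sh is a placeholder):

    # run only on the Haswell (E5-2660 V3) nodes
    #SBATCH --constraint=haswell
    # or equivalently, on the command line:
    #   sbatch --constraint=haswell myjob.sh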

GPUs

Grace has 12 nodes in the gpu partition. Six have two Nvidia Tesla K80s each (each K80 contains 2 GPUs, for a total of 4 GPUs per node), and six have one Nvidia Tesla P100 each. See our GPU guide for instructions on requesting GPUs for your job.
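A minimal GPU request might look like the sketch below; the typed GRES names (k80, p100) are an assumption about how the GPUs are labeled in Slurm on Grace, so consult the GPU guide for the definitive syntax.

    #SBATCH --partition=gpu
    # request one GPU of any type on a gpu node
    #SBATCH --gres=gpu:1
    # or, if typed GRES are defined, request a specific model, e.g.:
    #   #SBATCH --gres=gpu:p100:1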

Storage

File System: 2 PB of GPFS storage via FDR InfiniBand

By default, each group has a 300 GB quota on home storage and a 1 TB quota on project storage. A group's usage can be monitored using the groupquota script, available at:

/gpfs/apps/bin/groupquota

Each PI group is provided with storage space for research data on the HPC clusters. The storage is separated into three tiers: home, project, and scratch. You can monitor your storage usage by running the "groupquota" command on the cluster.

Please note: the only storage backed up on every cluster is /home

Home

Home storage is a small amount of space to store your scripts, notes, final products (e.g. figures), etc. Home storage is backed up daily.

Project

In general, project storage is intended to be the primary storage location for HPC research data in active use.

60-Day Scratch (scratch60)

Use this space to keep intermediate files that can be regenerated or reconstituted if necessary. Files older than 60 days will be deleted automatically. This space is not backed up, and you may be asked to delete files that are less than 60 days old if the space fills up.
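To see which of your files are approaching the 60-day limit, a find command along these lines can help; the path is a placeholder for your group's scratch60 directory.

    # list files that have not been modified in more than 50 days
    find <path-to-your-scratch60-directory> -type f -mtime +50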

Other Storage Options

If you or your group finds these quotas don't accommodate your needs, please see the off-cluster research data storage options.

Contact us at hpc@yale.edu about purchasing cluster storage.