Omega Scheduled Maintenance

Scheduled maintenance will be performed on Omega beginning Monday, April 2, 2018, at 8:00 am. Maintenance is expected to be completed by the end of that week. During this time, logins will be disabled and Omega’s storage will not be available. An email notification will be sent when the maintenance has been completed and the cluster is available again.

As the maintenance window approaches, the Slurm scheduler will not start any job whose requested wallclock time extends past the start of the maintenance period (8:00 am on April 2, 2018). If you run squeue, such jobs will show as pending with the reason “ReqNodeNotAvail”. (If your job can actually be completed in less time than you requested, you may be able to avoid this by requesting an appropriate time limit using “-t” or “--time”.) These pending jobs will automatically become eligible to run after the maintenance period, at which time they will run in normal priority order.
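
As a rough illustration of the rule above, the following Python sketch computes the longest time limit a job could request and still finish before the maintenance window opens. The function name max_time_limit and the five-minute safety margin are assumptions made for this example, not part of Slurm or of the cluster's configuration; only the maintenance start time comes from this announcement. The sketch also assumes the job starts immediately, so a job that waits in the queue would need a shorter limit.

```python
from datetime import datetime, timedelta

# Start of the maintenance window, taken from the announcement.
MAINTENANCE_START = datetime(2018, 4, 2, 8, 0)


def max_time_limit(now, margin=timedelta(minutes=5)):
    """Return the longest time limit that ends before maintenance.

    The result is a string in Slurm's D-HH:MM:SS or HH:MM:SS time format,
    or None if the window has already started (or is closer than `margin`).
    `margin` is a hypothetical safety buffer, not a Slurm setting.
    """
    remaining = MAINTENANCE_START - now - margin
    if remaining <= timedelta(0):
        return None

    total_seconds = int(remaining.total_seconds())
    days, rest = divmod(total_seconds, 86400)
    hours, rest = divmod(rest, 3600)
    minutes, seconds = divmod(rest, 60)

    if days:
        return f"{days}-{hours:02d}:{minutes:02d}:{seconds:02d}"
    return f"{hours:02d}:{minutes:02d}:{seconds:02d}"


# Example: a job submitted at 5:00 pm on Friday, March 30, 2018.
print(max_time_limit(datetime(2018, 3, 30, 17, 0)))  # prints 2-14:55:00
```

The resulting string could then be passed to “-t” or “--time” when submitting the job, provided the job can really finish within that limit.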

The primary purpose of the maintenance period is to finish moving all data from the Lustre storage system to the Loomis GPFS storage system. After the maintenance period is complete, current Omega files will be stored solely on the Loomis GPFS system. For groups that have migrated their workloads entirely to Grace or Farnam, their Omega data will be available for copying and clean-up until December 2018 at:

/gpfs/loomis/home.omega/<metagroup>/<group>
/gpfs/loomis/scratch.omega/<metagroup>/<group>

In addition to the above access from Grace, the remaining groups on Omega will still be able to access their storage on Omega at any preexisting paths until the cluster is fully decommissioned in December 2018. These groups will be contacted shortly with more details about the decommission process.

If you have questions, comments, or concerns, please contact us at hpc@yale.edu.