Milgram - Scheduled Maintenance

Scheduled maintenance will be performed on Milgram beginning Monday, December 10, 2018, at 8:00 am. Maintenance is expected to be completed by the end of day, Wednesday, December 12, 2018. During this time, logins will be disabled and Milgram’s storage will not be available. An email notification will be sent when the maintenance is complete and the cluster is available again. In addition to the system and security updates we perform during maintenance, we want you to know about the following changes:
 
Software Changes on Milgram: The software available via the modules system on Milgram is being upgraded to be more consistent with our other clusters. During the December maintenance, we will change the default module list to a new module collection. When the cluster returns, please let us know if any software you need is missing. The old installations will remain for the time being, but all new software will be installed into the new collection. The old installations can be accessed by running the following:
 
source /apps/bin/old_modules.sh
 
More information about this transition is available on our website at https://research.computing.yale.edu/support/hpc/user-manual/software-collection-upgrade
 
Scratch Changes on Milgram: The “scratch” space on Milgram will be officially renamed to “scratch60” to be consistent with the other clusters. The path “/gpfs/milgram/scratch” will become “/gpfs/milgram/scratch60”. To prevent immediate job failures, we will create a symlink at the old path that points to the new one. We will deprecate the symlink at a later date, so please update your paths after the maintenance window.
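
As one possible way to update a script, the sketch below prints a copy of a file with the old scratch path replaced by the new one. The function name update_scratch_path is our own, not a tool provided on Milgram, and the pattern is an assumption about how paths typically appear in job scripts; please review the output before replacing your original file.

```shell
# Paths taken from the announcement above.
OLD="/gpfs/milgram/scratch"
NEW="/gpfs/milgram/scratch60"

# Hypothetical helper: print the contents of file $1 with the old scratch
# path rewritten to the new one. The file itself is not modified.
update_scratch_path() {
    # Replace the old path only when it is followed by "/", by a character
    # that cannot continue a directory name, or by the end of the line.
    # This leaves paths that already say "scratch60" untouched.
    sed -E "s#${OLD}(/|[^0-9A-Za-z_.-]|\$)#${NEW}\1#g" "$1"
}
```

For example, "update_scratch_path myjob.sh > myjob.sh.new" (myjob.sh is a placeholder name) writes the updated copy to a new file that you can compare with the original.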
 
As the maintenance window approaches, the Slurm scheduler will not start any job if the job’s requested wallclock time extends past the start of the maintenance period (8:00 am on December 10, 2018). If you run squeue, such jobs will show as pending jobs with the reason “ReqNodeNotAvail”. (If your job can actually be completed in less time than you requested, you may be able to avoid this by making sure that you request the appropriate time limit using “-t” or “--time”.) Held jobs will automatically return to active status after the maintenance period, at which time they will run in normal priority order.
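
To illustrate the time limit point, here is a minimal sketch that estimates how much time remains before the maintenance start and formats it in Slurm's days-hours:minutes:seconds notation. The function names are our own, and seconds_until_maintenance assumes GNU date (for the -d option) and the cluster's local time zone.

```shell
# Hypothetical helper: format a number of seconds as a Slurm time limit
# in the form days-hours:minutes:seconds.
slurm_time() {
    printf '%d-%02d:%02d:%02d\n' \
        $(( $1 / 86400 )) $(( $1 % 86400 / 3600 )) \
        $(( $1 % 3600 / 60 )) $(( $1 % 60 ))
}

# Hypothetical helper: seconds left before the maintenance start given in
# the announcement. Assumes GNU date; the result is negative once the
# maintenance window has begun.
seconds_until_maintenance() {
    echo $(( $(date -d '2018-12-10 08:00:00' +%s) - $(date +%s) ))
}
```

For example, "slurm_time 7200" prints 0-02:00:00, which could be passed as "sbatch -t 0-02:00:00 myjob.sh" (myjob.sh is a placeholder name). A job can only start if its requested time limit is shorter than the value reported by seconds_until_maintenance.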
 
If you have questions, comments, or concerns, please contact us at hpc@yale.edu.