Computing Systems Status
Reduced Capacity on McCleary - June 1-2, 2026:
To address a cooling issue in one of the McCleary racks at West Campus, we need to temporarily take down the McCleary nodes that have names beginning with r813 (except for r813u29n09 and r813u29n11).
The work will begin at 8:00am on Monday, June 1, 2026, and will take up to two days.
This impacts the following:
- 26 of the 33 nodes in the day partition
- 14 of the 16 nodes in the week partition
- 1 of the 20 nodes in the gpu partition
- 1 of the 3 nodes in the pi_gerstein_gpu partition
- 2 of the 10 nodes in the pi_jetz partition
- 4 of the 4 nodes in the pi_ohern partition
- 2 of the 2 nodes in the pi_sestan partition
- 4 of the 4 nodes in the pi_tsang partition
Slurm jobs can be submitted as usual. However, a job that would need to run on the affected nodes will not start until after the work is complete if its requested wallclock time extends past the start of the maintenance period.
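As a rough illustration of the rule above, the following Python sketch checks whether a job's requested wallclock time would end before the maintenance start of 8:00am on June 1, 2026. The function names are hypothetical, and the check is a simplification: it assumes a job may end exactly at the start of the maintenance period, and it ignores queue wait time and everything else the Slurm scheduler takes into account.

```python
from datetime import datetime, timedelta

# Start of the maintenance period, taken from the announcement above.
MAINTENANCE_START = datetime(2026, 6, 1, 8, 0)


def fits_before_maintenance(start: datetime, time_limit: timedelta) -> bool:
    """Return True if a job starting at `start` with the requested
    wallclock limit would end at or before the maintenance start.

    Hypothetical helper: treating an end time equal to the maintenance
    start as acceptable is an assumption of this sketch.
    """
    return start + time_limit <= MAINTENANCE_START


def max_time_limit(start: datetime) -> timedelta:
    """Return the longest wallclock request that still ends before the
    maintenance start, or zero if `start` is already past it."""
    return max(MAINTENANCE_START - start, timedelta(0))


if __name__ == "__main__":
    # Example: a job that starts on Sunday, May 31, 2026 at 8:00pm.
    job_start = datetime(2026, 5, 31, 20, 0)
    print(fits_before_maintenance(job_start, timedelta(hours=12)))
    print(fits_before_maintenance(job_start, timedelta(hours=13)))
    print(max_time_limit(job_start))
```

In this example, a 12-hour request ends exactly at the maintenance start, while a 13-hour request extends past it and, on an affected node, would wait until the work is complete. Requesting a shorter wallclock time is one way to let a job run on the affected nodes before the maintenance begins.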
We realize this disrupts your work, and we apologize for the inconvenience. We will work with the vendors to complete the work as quickly as possible.
If you have any questions, comments, or concerns, please contact us at research.computing@yale.edu.
Resolved past issues:
5/11/2026: There was a power drop at the West Campus Data Center on the morning of 5/11/2026. This impacted running jobs on McCleary, Grace, Misha and Milgram. Please check your jobs to see how they were impacted.
2/13/2026: Bouchet scheduler issues have been resolved. All systems are operational.