Omega Unscheduled Outage

Thursday, July 16, 2015 - 5:15am to 7:30am

At approximately 5:15am today, a circuit breaker tripped, cutting power to about half of the nodes on the Omega cluster. Power was restored to those nodes at approximately 7:30am.

The root cause remains under investigation by Data Center Engineering.

Jobs running on the following nodes were impacted:

compute-45-2 Down  cpu 0:16   load   jobname=flvc36mrun2.pbs   user=mjr92 q=gputest
compute-45-3 Down  cpu 0:16   load   jobname=flvc36mrun2.pbs   user=mjr92 q=gputest
compute-45-4 Down  cpu 0:16   load   jobname=flvc36mrun2.pbs   user=mjr92 q=gputest
compute-46-1 Down  cpu 0:12   load   jobname=flvc36mrun2.pbs   user=mjr92 q=gputest
compute-46-2 Down  cpu 0:12   load   jobname=flvc36mrun2.pbs   user=mjr92 q=gputest
compute-46-4 Down  cpu 0:12   load  jobnum=4803721 jobname=...Se-3_b3lypecp   user=wd89 q=esi
compute-46-5 Down  cpu 0:12   load  jobnum=4803705 jobname=2_CdSe-3_pw91ecp   user=wd89 q=esi
compute-46-8 Down  cpu 0:12   load  jobnum=4803703 jobname=2_CdSe-3_m06lecp   user=wd89 q=esi
compute-47-1 Down  cpu 0:12   load   jobname=2_CdSe-3_m06lecp   user=wd89 q=esi
compute-47-3 Down  cpu 0:12   load  jobnum=4803700 jobname=3_CdSe-3_pw91ecp   user=wd89 q=esi
compute-47-4 Down  cpu 0:12   load   jobname=3_CdSe-3_pw91ecp   user=wd89 q=esi
compute-47-7 Down  cpu 0:12   load   jobname=3_CdSe-3_pw91ecp   user=wd89 q=esi
compute-48-2 Down  cpu 0:12   load  jobnum=4803491 jobname=20_41_TS_qm   user=ma583 q=esi
compute-48-4 Down  cpu 0:12   load   jobname=20_41_TS_qm   user=ma583 q=esi
compute-48-5 Down  cpu 0:12   load   jobname=20_41_TS_qm   user=ma583 q=esi
compute-48-7 Down  cpu 0:12   load   jobname=20_41_TS_qm   user=ma583 q=esi
compute-49-3 Down  cpu 0:12   load  jobnum=4803027 jobname=..._Thiophene.sh   user=br287 q=esi
compute-49-6 Down  cpu 0:12   load   jobname=CdSe270-2   user=wd89 q=esi
compute-49-8 Down  cpu 0:12   load  jobnum=4803091 jobname=..._sec_D61A____   user=ma583 q=esi
compute-50-2 Down  cpu 0:12   load   jobname=..._sec_D61A____   user=ma583 q=esi
compute-50-3 Down  cpu 0:12   load   jobname=..._sec_D61A____   user=ma583 q=esi
compute-50-4 Down  cpu 0:12   load   jobname=..._sec_D61A____   user=ma583 q=esi
compute-50-6 Down  cpu 0:12   load  jobnum=4803581 jobname=...e_Opt_BS1.pbs   user=ky254 q=esi
compute-50-7 Down  cpu 0:12   load  jobnum=4803581 jobname=...e_Opt_BS1.pbs   user=ky254 q=esi
compute-50-8 Down  cpu 0:12   load  jobnum=4803189 jobname=..._Part_Opt.pbs   user=ky254 q=esi
compute-51-1 Down  cpu 0:12   load  jobnum=4797311 jobname=CdSe270   user=wd89 q=esi
compute-51-3 Down  cpu 0:12   load  jobnum=4803326 jobname=...qc_restart.sh   user=br287 q=esi
compute-51-8 Down  cpu 0:12   load  jobnum=4797430 jobname=CdSe270-2   user=wd89 q=esi
compute-52-1 Down  cpu 0:12   load  jobnum=4797430 jobname=CdSe270-2   user=wd89 q=esi
compute-52-3 Down  cpu 0:12   load   jobname=CdSe270-2   user=wd89 q=esi
compute-53-1 Down  cpu 0:12   load  jobnum=4797430 jobname=CdSe270-2   user=wd89 q=esi
compute-53-2 Down  cpu 0:12   load  jobnum=4800703 jobname=...y2_opt_xqc.sh   user=br287 q=esi
compute-53-3 Down  cpu 0:12   load   jobname=...y2_opt_xqc.sh   user=br287 q=esi
compute-53-4 Down  cpu 0:12   load  jobnum=4803305 jobname=...A__imp_41____   user=ma583 q=esi
compute-53-5 Down  cpu 0:12   load  jobnum=4800641 jobname=corr   user=jh943 q=esi
compute-53-6 Down  cpu 0:12   load  jobnum=4797430 jobname=CdSe270-2   user=wd89 q=esi
compute-53-7 Down  cpu 0:12   load  jobnum=4800639 jobname=corr   user=jh943 q=esi
compute-53-8 Down  cpu 0:12   load   jobname=corr   user=jh943 q=esi
compute-37-2 Down  cpu 0:8   load  jobnum=4802733 jobname=...12-part52.txt   user=fk65 q=fas_normal
compute-37-3 Down  cpu 0:8   load  jobnum=4802539 jobname=...0.160-1850-dp   user=pd283 q=fas_normal
compute-37-4 Down  cpu 0:8   load  jobnum=4802504 jobname=opp.2x10x1.1   user=md599 q=fas_normal
compute-37-9 Down  cpu 0:8   load  jobnum=4801872 jobname=...hfb_dlnz_long   user=cng8 q=fas_very_long
compute-37-10 Down  cpu 0:8   load  jobnum=4788165 jobname=...cavity_009.sh   user=awc24 q=fas_very_long
qstat: Unknown Job Id Error 4799449.rocks.omega.hpc.yale.internal
compute-37-14 Down  cpu 0:8   load  jobnum=4802534 jobname=...0.270-1920-dp   user=pd283 q=fas_normal
compute-37-15 Down  cpu 0:8   load  jobnum=4802345 jobname=SiLTO.12si.0.5O   user=ak688 q=fas_normal
compute-38-6 Down  cpu 0:8   load   jobname=opp.2x12x1.2   user=md599 q=fas_normal
compute-38-7 Down  cpu 0:8   load   jobname=opp.2x12x1.2   user=md599 q=fas_normal
compute-38-8 Down  cpu 0:8   load   jobname=opp.2x12x1.2   user=md599 q=fas_normal
compute-38-9 Down  cpu 0:8   load   jobname=opp.2x12x1.2   user=md599 q=fas_normal
compute-38-10 Down  cpu 0:8   load   jobname=opp.2x12x1.2   user=md599 q=fas_normal
compute-38-11 Down  cpu 0:8   load   jobname=opp.2x12x1.2   user=md599 q=fas_normal
compute-38-13 Down  cpu 0:8   load   jobname=opp.2x12x1.2   user=md599 q=fas_normal
compute-38-14 Down  cpu 0:8   load   jobname=opp.2x12x1.2   user=md599 q=fas_normal
compute-38-15 Down  cpu 0:8   load   jobname=opp.2x12x1.2   user=md599 q=fas_normal
compute-44-3 Down  cpu 0:8   load   jobname=opp.2x12x1.2   user=md599 q=fas_normal
compute-44-4 Down  cpu 0:8   load   jobname=opp.2x12x1.2   user=md599 q=fas_normal
compute-44-8 Down  cpu 0:8   load   jobname=opp.2x12x1.2   user=md599 q=fas_normal
compute-44-10 Down  cpu 0:8   load   jobname=opp.2x12x1.2   user=md599 q=fas_normal
compute-44-12 Down  cpu 0:8   load   jobname=opp.2x12x1.2   user=md599 q=fas_normal
compute-44-13 Down  cpu 0:8   load   jobname=opp.2x12x1.2   user=md599 q=fas_normal
compute-44-14 Down  cpu 0:8   load   jobname=opp.2x12x1.2   user=md599 q=fas_normal
compute-44-16 Down  cpu 0:8   load   jobname=opp.2x12x1.2   user=md599 q=fas_normal
compute-15-7 Down  cpu 0:8   load  jobnum=4802287 jobname=...tion_31111_17   user=jmm357 q=fas_normal
compute-25-2 Down  cpu 0:8   load   jobname=...0.160-1850-dp   user=pd283 q=fas_normal
compute-25-7 Down  cpu 0:8   load   jobname=SimpleQueue   user=sd566 q=fas_normal
compute-25-8 Down  cpu 0:8   load   jobname=SimpleQueue   user=sd566 q=fas_normal
compute-25-11 Down  cpu 0:8   load   jobname=SimpleQueue   user=sd566 q=fas_normal
compute-25-12 Down  cpu 0:8   load   jobname=SimpleQueue   user=sd566 q=fas_normal
compute-25-13 Down  cpu 0:8   load   jobname=SimpleQueue   user=sd566 q=fas_normal
compute-26-7 Down  cpu 0:8   load   jobname=7-14-15-aspect19   user=krv8 q=fas_normal
compute-26-8 Down  cpu 0:8   load  jobnum=4802318 jobname=7-14-15-aspect19   user=krv8 q=fas_normal
compute-26-9 Down  cpu 0:8   load  jobnum=4802345 jobname=SiLTO.12si.0.5O   user=ak688 q=fas_normal
compute-26-15 Down  cpu 0:8   load  jobnum=4802235 jobname=H.2x2.neg.PTO   user=ak688 q=fas_normal
compute-27-1 Down  cpu 0:8   load   jobname=SimpleQueue   user=sd566 q=fas_normal
compute-27-2 Down  cpu 0:8   load   jobname=SimpleQueue   user=sd566 q=fas_normal
compute-27-3 Down  cpu 0:8   load   jobname=SimpleQueue   user=sd566 q=fas_normal
compute-27-4 Down  cpu 0:8   load   jobname=SimpleQueue   user=sd566 q=fas_normal
compute-27-5 Down  cpu 0:8   load   jobname=SimpleQueue   user=sd566 q=fas_normal
compute-27-6 Down  cpu 0:8   load   jobname=SimpleQueue   user=sd566 q=fas_normal
compute-27-8 Down  cpu 0:8   load   jobname=openX_1.0_1   user=mw564 q=fas_long
compute-27-9 Down  cpu 0:8   load   jobname=openX_1.0_1   user=mw564 q=fas_long
compute-27-11 Down  cpu 0:8   load   jobname=SimpleQueue   user=sd566 q=fas_normal
compute-27-12 Down  cpu 0:8   load   jobname=SimpleQueue   user=sd566 q=fas_normal
compute-27-14 Down  cpu 0:8   load   jobname=SimpleQueue   user=sd566 q=fas_normal
compute-27-15 Down  cpu 0:8   load   jobname=SimpleQueue   user=sd566 q=fas_normal
compute-27-16 Down  cpu 0:8   load   jobname=SimpleQueue   user=sd566 q=fas_normal
compute-28-3 Down  cpu 0:8   load   jobname=7-14-15-aspect18   user=krv8 q=fas_normal
compute-28-5 Down  cpu 0:8   load   jobname=SimpleQueue   user=sd566 q=fas_normal
compute-28-6 Down  cpu 0:8   load   jobname=SimpleQueue   user=sd566 q=fas_normal
compute-28-8 Down  cpu 0:8   load   jobname=SimpleQueue   user=sd566 q=fas_normal
compute-28-10 Down  cpu 0:8   load  jobnum=4802235 jobname=H.2x2.neg.PTO   user=ak688 q=fas_normal
compute-28-15 Down  cpu 0:8   load   jobname=openX_0.35_2   user=mw564 q=fas_long
compute-29-1 Down  cpu 0:8   load  jobnum=4800438 jobname=openX_0.35_2   user=mw564 q=fas_long
compute-29-7 Down  cpu 0:8   load   jobname=7-14-15-aspect9   user=krv8 q=fas_normal
compute-29-9 Down  cpu 0:8   load   jobname=7-14-15-aspect9   user=krv8 q=fas_normal
compute-29-10 Down  cpu 0:8   load   jobname=7-14-15-aspect9   user=krv8 q=fas_normal
compute-29-12 Down  cpu 0:8   load   jobname=SimpleQueue   user=sd566 q=fas_normal
compute-29-15 Down  cpu 0:8   load   jobname=SimpleQueue   user=sd566 q=fas_normal
compute-29-16 Down  cpu 0:8   load   jobname=SimpleQueue   user=sd566 q=fas_normal
compute-30-6 Down  cpu 0:8   load   jobname=SimpleQueue   user=sd566 q=fas_normal
compute-30-7 Down  cpu 0:8   load   jobname=SimpleQueue   user=sd566 q=fas_normal
compute-30-10 Down  cpu 0:8   load   jobname=SimpleQueue   user=sd566 q=fas_normal
compute-30-11 Down  cpu 0:8   load   jobname=SimpleQueue   user=sd566 q=fas_normal
compute-30-15 Down  cpu 0:8   load   jobname=7-14-15-aspect9   user=krv8 q=fas_normal
compute-30-16 Down  cpu 0:8   load   jobname=7-14-15-aspect9   user=krv8 q=fas_normal
compute-31-1 Down  cpu 0:8   load  jobnum=4802235 jobname=H.2x2.neg.PTO   user=ak688 q=fas_normal
compute-31-2 Down  cpu 0:8   load   jobname=H.2x2.neg.PTO   user=ak688 q=fas_normal
compute-31-4 Down  cpu 0:8   load   jobname=SimpleQueue   user=sd566 q=fas_normal
compute-31-5 Down  cpu 0:8   load  jobnum=4802379 jobname=SiSBTOLT.5mlO.2   user=ak688 q=fas_normal
compute-31-7 Down  cpu 0:8   load  jobnum=4802379 jobname=SiSBTOLT.5mlO.2   user=ak688 q=fas_normal
compute-31-9 Down  cpu 0:8   load   jobname=SiSBTOLT.5mlO.2   user=ak688 q=fas_normal
compute-31-10 Down  cpu 0:8   load   jobname=SiSBTOLT.5mlO.2   user=ak688 q=fas_normal
compute-31-11 Down  cpu 0:8   load  jobnum=4803002 jobname=...symP_CBS-APNO   user=vaccaro q=fas_normal
compute-31-12 Down  cpu 0:8   load  jobnum=4803001 jobname=...symP_CBS-APNO   user=vaccaro q=fas_normal
compute-31-14 Down  cpu 0:8   load   jobname=SimpleQueue   user=sd566 q=fas_normal
compute-31-15 Down  cpu 0:8   load   jobname=SimpleQueue   user=sd566 q=fas_normal
compute-32-4 Down  cpu 0:8   load   jobname=SimpleQueue   user=sd566 q=fas_normal
compute-32-5 Down  cpu 0:8   load   jobname=SimpleQueue   user=sd566 q=fas_normal
compute-32-6 Down  cpu 0:8   load   jobname=SimpleQueue   user=sd566 q=fas_normal
compute-32-10 Down  cpu 0:8   load   jobname=SimpleQueue   user=sd566 q=fas_normal
compute-32-11 Down  cpu 0:8   load  jobnum=4803083 jobname=plrs   user=olz3 q=fas_very_long
compute-32-12 Down  cpu 0:8   load   jobname=plrs   user=olz3 q=fas_very_long
compute-32-13 Down  cpu 0:8   load   jobname=plrs   user=olz3 q=fas_very_long
compute-32-15 Down  cpu 0:8   load  jobnum=4802235 jobname=H.2x2.neg.PTO   user=ak688 q=fas_normal
compute-32-16 Down  cpu 0:8   load  jobnum=4802235 jobname=H.2x2.neg.PTO   user=ak688 q=fas_normal
compute-33-3 Down  cpu 0:8   load   jobname=SimpleQueue   user=sd566 q=fas_normal
compute-33-4 Down  cpu 0:8   load   jobname=SimpleQueue   user=sd566 q=fas_normal
compute-33-6 Down  cpu 0:8   load   jobname=SimpleQueue   user=sd566 q=fas_normal
compute-33-7 Down  cpu 0:8   load   jobname=SimpleQueue   user=sd566 q=fas_normal
compute-33-9 Down  cpu 0:8   load   jobname=SimpleQueue   user=sd566 q=fas_normal
compute-33-10 Down  cpu 0:8   load   jobname=SimpleQueue   user=sd566 q=fas_normal
compute-33-12 Down  cpu 0:8   load   jobname=SimpleQueue   user=sd566 q=fas_normal
compute-33-13 Down  cpu 0:8   load  jobnum=4802777 jobname=...12-part92.txt   user=fk65 q=fas_normal
compute-33-14 Down  cpu 0:8   load  jobnum=4802318 jobname=7-14-15-aspect19   user=krv8 q=fas_normal
compute-33-15 Down  cpu 0:8   load  jobnum=4802318 jobname=7-14-15-aspect19   user=krv8 q=fas_normal
compute-33-16 Down  cpu 0:8   load  jobnum=4802776 jobname=...12-part91.txt   user=fk65 q=fas_normal
compute-34-1 Down  cpu 0:8   load  jobnum=4800448 jobname=openX_1.0_4   user=mw564 q=fas_long
compute-34-2 Down  cpu 0:8   load  jobnum=4800448 jobname=openX_1.0_4   user=mw564 q=fas_long
compute-34-3 Down  cpu 0:8   load  jobnum=4800448 jobname=openX_1.0_4   user=mw564 q=fas_long
compute-34-5 Down  cpu 0:8   load   jobname=7-14-15-aspect18   user=krv8 q=fas_normal
compute-34-8 Down  cpu 0:8   load  jobnum=4802534 jobname=...0.270-1920-dp   user=pd283 q=fas_normal
compute-34-9 Down  cpu 0:8   load  jobnum=4802534 jobname=...0.270-1920-dp   user=pd283 q=fas_normal
compute-34-12 Down  cpu 0:8   load  jobnum=4780106 jobname=...sd-apVDZ_freq   user=vaccaro q=fas_very_long
compute-39-13 Down  cpu 0:8   load   jobname=L500_CSF_SFoff   user=etl28 q=astro_prod
compute-40-4 Down  cpu 0:8   load  jobnum=4802043 jobname=...0_CSF_256_512   user=jbb83 q=astro_prod
compute-40-5 Down  cpu 0:8   load  jobnum=4802043 jobname=...0_CSF_256_512   user=jbb83 q=astro_prod
compute-40-6 Down  cpu 0:8   load  jobnum=4802043 jobname=...0_CSF_256_512   user=jbb83 q=astro_prod
compute-40-8 Down  cpu 0:8   load  jobnum=4802043 jobname=...0_CSF_256_512   user=jbb83 q=astro_prod
compute-40-10 Down  cpu 0:8   load  jobnum=4802043 jobname=...0_CSF_256_512   user=jbb83 q=astro_prod
compute-40-13 Down  cpu 0:8   load  jobnum=4802043 jobname=...0_CSF_256_512   user=jbb83 q=astro_prod
compute-41-6 Down  cpu 0:8   load  jobnum=4802043 jobname=...0_CSF_256_512   user=jbb83 q=astro_prod
compute-41-7 Down  cpu 0:8   load  jobnum=4802043 jobname=...0_CSF_256_512   user=jbb83 q=astro_prod
compute-41-8 Down  cpu 0:8   load  jobnum=4802043 jobname=...0_CSF_256_512   user=jbb83 q=astro_prod
compute-41-10 Down  cpu 0:8   load  jobnum=4802043 jobname=...0_CSF_256_512   user=jbb83 q=astro_prod
compute-41-11 Down  cpu 0:8   load  jobnum=4802043 jobname=...0_CSF_256_512   user=jbb83 q=astro_prod
compute-41-15 Down  cpu 0:8   load  jobnum=4802043 jobname=...0_CSF_256_512   user=jbb83 q=astro_prod
compute-41-16 Down  cpu 0:8   load  jobnum=4802043 jobname=...0_CSF_256_512   user=jbb83 q=astro_prod
compute-42-1 Down  cpu 0:8   load   jobname=...0_CSF_256_512   user=jbb83 q=astro_prod
compute-42-2 Down  cpu 0:8   load   jobname=...0_CSF_256_512   user=jbb83 q=astro_prod
compute-42-4 Down  cpu 0:8   load   jobname=L500_CSF_SFoff   user=etl28 q=astro_prod
compute-42-5 Down  cpu 0:8   load   jobname=L500_CSF_SFoff   user=etl28 q=astro_prod
compute-42-8 Down  cpu 0:8   load   jobname=L500_CSF_SFoff   user=etl28 q=astro_prod
compute-42-9 Down  cpu 0:8   load   jobname=L500_CSF_SFoff   user=etl28 q=astro_prod
compute-42-10 Down  cpu 0:8   load   jobname=L500_CSF_SFoff   user=etl28 q=astro_prod
compute-42-12 Down  cpu 0:8   load   jobname=L500_CSF_SFoff   user=etl28 q=astro_prod
compute-42-13 Down  cpu 0:8   load   jobname=L500_CSF_SFoff   user=etl28 q=astro_prod
compute-42-15 Down  cpu 0:8   load   jobname=L500_CSF_SFoff   user=etl28 q=astro_prod
compute-43-1 Down  cpu 0:8   load   jobname=L500_CSF_SFoff   user=etl28 q=astro_prod
compute-43-2 Down  cpu 0:8   load   jobname=L500_CSF_SFoff   user=etl28 q=astro_prod
compute-43-4 Down  cpu 0:8   load   jobname=L500_CSF_SFoff   user=etl28 q=astro_prod
compute-43-5 Down  cpu 0:8   load   jobname=L500_CSF_SFoff   user=etl28 q=astro_prod
compute-43-6 Down  cpu 0:8   load   jobname=L500_CSF_SFoff   user=etl28 q=astro_prod
compute-43-8 Down  cpu 0:8   load   jobname=L500_CSF_SFoff   user=etl28 q=astro_prod
compute-43-11 Down  cpu 0:8   load   jobname=L500_CSF_SFoff   user=etl28 q=astro_prod
compute-43-13 Down  cpu 0:8   load   jobname=L500_CSF_SFoff   user=etl28 q=astro_prod
compute-43-14 Down  cpu 0:8   load   jobname=L500_CSF_SFoff   user=etl28 q=astro_prod
compute-43-15 Down  cpu 0:8   load   jobname=L500_CSF_SFoff   user=etl28 q=astro_prod