[Hpc-notice] Partial HPC cluster failure - Nodes back online.

Casey Mc Laughlin cmclaughlin at fsu.edu
Wed Oct 2 16:04:28 EDT 2019


Hi RCC Users,

All of the affected nodes (see list below) are back online and operational.  Unfortunately, due to the nature of the problem, all jobs running on the affected nodes were killed.

We apologize for the inconvenience. If there is anything we can do to help, please let us know (support at rcc.fsu.edu).

List of affected racks:

  1.  M32
  2.  I29
  3.  I30
  4.  I31
  5.  I32
  6.  I35
  7.  I36

List of affected partitions:

  *   backfill
  *   backfill2
  *   changlani_q
  *   coaps18_q
  *   eoas19_q
  *   fraser_q
  *   genacc_q
  *   hongli_q
  *   ktaylor_q
  *   mecfd18_q
  *   medicine_q
  *   quicktest
  *   rcc_internal
  *   sec4m_q
  *   stagg_q
  *   stata_q
  *   stroupe_q
  *   yin19_q

Best regards,
The RCC Team

