[Hpc-notice] Slurm upgrade on HPC nodes Tuesday, February 18 - No disruption or downtime

Casey Mc Laughlin cmclaughlin at fsu.edu
Fri Feb 14 10:50:37 EST 2025


Hi HPC users,

We plan to upgrade the Slurm scheduler on the nodes in the HPC cluster this coming Tuesday, February 18, starting at 9:00 AM. The upgrade will be fully transparent to end users: you should not experience any interruption in service. Job submission will continue to work normally, and jobs that are already running will not be affected. You will not need to recompile any code.

The upgrade brings our compute nodes up to Slurm v24.05, the version the HPC controller has been running since we upgraded it this past November. The new version also includes the usual bug fixes and performance improvements.
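
For anyone who would like to confirm the version from their own session, here is a minimal sketch. It assumes the Slurm client tools are on your PATH and that `sinfo --version` prints a line such as "slurm 24.05.3". The function names are hypothetical and used only for illustration. Note that the command reports the version of the tools on the node where it runs, so run it inside a job to check a compute node.

```python
import re
import subprocess


def parse_slurm_version(text):
    """Return (major, minor) from a string such as 'slurm 24.05.3'."""
    match = re.search(r"(\d+)\.(\d+)", text)
    if match is None:
        raise ValueError(f"no version number found in {text!r}")
    return int(match.group(1)), int(match.group(2))


def is_at_least(text, minimum=(24, 5)):
    """True if the version in `text` is `minimum` or newer (default 24.05)."""
    return parse_slurm_version(text) >= minimum


def installed_version_text():
    """Run `sinfo --version` and return its output.

    Assumption: `sinfo` is available on PATH on the node where this runs.
    """
    result = subprocess.run(
        ["sinfo", "--version"], capture_output=True, text=True, check=True
    )
    return result.stdout.strip()
```

Calling `is_at_least(installed_version_text())` on a node should return True once that node is running v24.05 or newer.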

One notable improvement is to the Process Management Interface (PMI) library, the "glue" between MPI applications and the Slurm job scheduler. In the past, we have seen several cases where jobs did not terminate correctly after calling MPI_Finalize, and those cases point to the standard, very old PMI libraries on our system. This update fixes some of those issues.

If you have any questions or concerns, please let us know: support at rcc.fsu.edu.

Best regards,
The RCC Team