From cmclaughlin at fsu.edu Fri Sep 25 11:57:23 2020
From: cmclaughlin at fsu.edu (Casey Mc Laughlin)
Date: Fri, 25 Sep 2020 15:57:23 +0000
Subject: [Hpc-notice] Slurm scheduler issues

Hi HPC Users,

This morning, our Systems Team made an update to the job scheduler (Slurm) in order to fix an ongoing issue we've been having with authentication. This change affected job submissions and most jobs that were already running.

If you had any jobs that were pending or running as of this morning, we advise you to log in, check on them, and re-submit them if necessary.

Additionally, you may see an error message similar to the one below when you attempt to submit your job to the scheduler. In this case, you should wait a few moments and try to resubmit:

    srun: job 4603 queued and waiting for resources
    srun: error: Security violation, slurm message from uid 309
    srun: error: Security violation, slurm message from uid 309
    srun: error: Job allocation 4603 has been revoked

We apologize for any disruptions this may have caused to you and your research. Let us know if you need any assistance by submitting a support ticket: support at rcc.fsu.edu.

Best regards,
The RCC Team

From pvandermark at fsu.edu Sat Sep 26 09:19:10 2020
From: pvandermark at fsu.edu (Paul Van Der Mark)
Date: Sat, 26 Sep 2020 13:19:10 +0000
Subject: [Hpc-notice] Several compute nodes down

Dear RCC Users,

We are having issues with several compute nodes being down. This affects many of our partitions and when you submit jobs, you might get an error message about drained nodes. We are looking into this issue.

Best,
The RCC Team
From pvandermark at fsu.edu Sat Sep 26 16:34:34 2020
From: pvandermark at fsu.edu (Paul Van Der Mark)
Date: Sat, 26 Sep 2020 20:34:34 +0000
Subject: [Hpc-notice] Several compute nodes down

Dear RCC Users,

Most issues have been resolved and things have stabilized over the last few hours.

Best,
Paul

________________________________
From: Paul Van Der Mark
Sent: Saturday, September 26, 2020 9:19 AM
To: JESfwd-hpc-notice
Subject: Several compute nodes down

Dear RCC Users,

We are having issues with several compute nodes being down. This affects many of our partitions and when you submit jobs, you might get an error message about drained nodes. We are looking into this issue.

Best,
The RCC Team