[Hpc-notice] Archival issue

Paul Van Der Mark pvandermark at fsu.edu
Wed Oct 30 14:45:58 EDT 2019


Dear RCC users,

All archival volumes have been brought back online and the Globus
fsurcc#archival endpoint has been reactivated.

The issue arose while a failed drive was being replaced, which is a
standard operation for a RAID configuration and usually does not impact
operations. However, because of very heavy I/O on the system, the
reconstruction was repeatedly interrupted. We are looking at ways to
prevent this kind of perfect storm of events.

Please let us know if you still experience issues with the archival
storage. 

Best,
The RCC Team

On Wed, 2019-10-30 at 12:10 -0400, Paul van der Mark wrote:
> Dear RCC users,
> 
> We have pinpointed the issue with our archival system to some unusual
> IO patterns and we are trying to determine the cause of this. All ZFS
> volumes are currently unmounted and we will bring them back online in
> the coming hours. 
> 
> Best regards,
> The RCC Team
> 
> On Wed, 2019-10-30 at 09:45 -0400, Paul van der Mark wrote:
> > Dear RCC users,
> > 
> > We are currently experiencing some issues with the archival system.
> > The Globus system is fully functioning. The archival system will
> > temporarily not be available through Globus, but the Globus service
> > itself is fine.
> > 
> > Best regards,
> > The RCC Team
> > 
> > On Wed, 2019-10-30 at 13:31 +0000, Casey Mc Laughlin via Hpc-notice
> > wrote:
> > > Hi RCC Campus Partners,
> > > 
> > > We are currently experiencing an issue with our Research Archival
> > > Storage System.
> > > 
> > > > In order to stabilize the system, we are going to un-mount the
> > > > system from the export nodes and disable the endpoint in Globus.
> > > > 
> > > > We will post another notice to this list in a few hours or as
> > > > soon as this issue is resolved.
> > > 
> > > Details: https://fla.st/2otiori
> > > 
> > > Best regards,
> > > The RCC Team
> > > _______________________________________________
> > > > You received this message because you have an account with the
> > > > FSU Research Computing Center
> > > More information: http://rcc.fsu.edu/connect
> > > 
> > > ** More News: http://rcc.fsu.edu/news
> > > ** Facebook:  http://facebook.com/fsurcc
> > > ** Twitter:   http://twitter.com/fsurcc



