Downtime for Hardware Failure - ALL TCHPC Clusters
Update 2012-07-19 2pm: Service restored
Logins are available again on the lonsdale, parsons and kelvin clusters.
The home and projects partitions are available. Access to gscratch will be restored shortly.
The queues will be resumed shortly.
TCHPC Cluster Downtime
Update 2012-07-12 9.00am: We expect to have replacement hardware onsite today. The clusters (lonsdale, parsons, kelvin) will be unavailable for approximately one week. PLEASE NOTE: As part of the software stack upgrade, ALL queued jobs will be lost. Unfortunately this is unavoidable, because the upgraded software version includes changes to the job save state files.
Update 2012-07-10 3.30pm: Due to a hardware failure, the systems will now be taken offline sooner than expected. The queues are currently being drained. Access to /home is still available; however, it will have to be turned off so that the failed controller can be replaced. Further updates will be posted when we have an ETA for the replacement hardware.
This downtime window is required for essential system maintenance and upgrades on the clusters and filesystems.
All queues will be unavailable during this period.
The GPFS cluster filesystems (/gscratch) will also be unavailable during this period.
For queries, please contact: email@example.com