News and Events

Downtimes, May 19th to 23rd

Submitted by admin on Wed, 05/17/2017 - 12:33

There will be a loss of power to the Lloyd Building Data Centre on Saturday the 20th of May. This is part of planned works for the IT Services Disaster Recovery project.

This loss of power will require all equipment in the Lloyd Building Data Centre to be powered off on Friday the 19th of May. The equipment will be powered back on on Monday the 22nd of May.

We apologise for the inconvenience that this necessary work will cause.

Virtual Infrastructure Downtime, May 16th 2017

To carry out a necessary update, the storage component of the DRI / Research IT OpenNebula Virtual Machine Infrastructure will have to be stopped. This storage provides the disks for the Virtual Machines in the infrastructure, so all Virtual Machines must be powered off to facilitate the update.

This will require a 2-hour downtime between 9am and 11am on Tuesday the 16th of May, with an at-risk period until 1pm. The services impacted include:

Downtime 2017-01-27: College-wide power brown-out

Update: Mon 30th Jan

None of the core systems on UPS were affected by the outage.

HPC compute systems are now back online.

Friday 27th Jan

There appears to have been a College-wide power failure on the evening of Friday 27th January.

No further details at present.

All TCHPC clusters are offline until this is properly investigated.

Updates will be posted when further information is known.

Downtime for main clusters - UPS issues

Update 3pm, 17th Jan

The faulty UPS unit has been replaced. The systems are back online. We are monitoring the clusters for any further issues, but we expect the matter to be resolved now.

Around 9:30am the UPS backing some core equipment of the main clusters (lonsdale, parsons, kelvin) appears to have had an issue.

Currently the clusters are down because the filesystems are unavailable. We are investigating further.

Updates will be posted here.