5 February 2014
Data center incidents: the load bank response
One question comes up regularly in our discussions about the risks of insufficient testing when a data center is commissioned: what does acceptance testing with load banks concretely provide?
Part of the answer lies in one figure: 90 minutes.
This is the average time per year during which the main grid supply (20,000 V incoming feed) of a data center is down. That means at least 90 minutes a year of operation on UPS and generators. Load bank testing is precisely what makes it possible to verify that this equipment will carry the load for that length of time. In addition, several incidents and accidents have had unfortunate consequences for users; load bank tests are one way of guarding against such failures:
- Major incident at the Iliad/Online DC2 data center in Vitry-sur-Seine: on July 4, 2013, following a power cut on the ErdF grid (20,000 V), the generators failed to take over the electrical load of the servers in the data center operated by Iliad (the parent company of Free). At 12:37 p.m., Online informed its users that 3 of the DC2 generators in Vitry had failed to start following a mechanical incident. The interruption was caused by the failure, for less than an hour, of 3 of the data center's 6 generators. The incident cut power to one of the data center's electrical branches and left customers unable to access their data. The Iliad DC2 data center in Vitry-sur-Seine has 4,500 m² of computer rooms. It is now one of the benchmark data centers for many Internet professionals, with more than 500 customers hosted in 1,600 racks in production. Many services and websites were inaccessible for many minutes that afternoon, in particular LaPoste.net, Pecheur.com, DoYouBuzz, Deezer, SensCritique, CleverCloud and the JDN, among others. (Incident report: http://forum.online.net/index.php?/topic/3332-incident-coupure-salle-103-rapport-dincident/ )
- October 29 and 30, 2012: Storm Sandy caused power cuts at several data centers in New Jersey in the United States. Following the floods, the effort to keep the data center running on generators for 2 days was described as “very difficult” by Datagram CEO Alex Reppen. Load bank tests had been carried out shortly beforehand and made it possible to avoid an interruption of the data center's power supply. (Source: http://www.datacenterknowledge.com/archives/2012/12/17/the-year-in-downtime-top-10-outages-of-2012/ )
There are many other examples of incidents where load banks have shown their usefulness in preventive maintenance.
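
To put the 90-minute figure in perspective, here is a minimal Python sketch of the arithmetic behind it. It converts an annual outage duration into grid availability and estimates the energy the UPS and generators must deliver over that time. The function names and the 1,000 kW IT load are assumptions chosen for illustration only; they do not come from the incidents described above.

```python
# Illustrative arithmetic only: names and the example load are assumptions.
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes in a non-leap year


def grid_availability(outage_minutes_per_year: float) -> float:
    """Fraction of the year during which the main grid supply is available."""
    if not 0 <= outage_minutes_per_year <= MINUTES_PER_YEAR:
        raise ValueError("outage must be between 0 and one year, in minutes")
    return 1 - outage_minutes_per_year / MINUTES_PER_YEAR


def backup_energy_kwh(load_kw: float, outage_minutes: float) -> float:
    """Energy the backup chain must supply to hold a constant load."""
    if load_kw < 0 or outage_minutes < 0:
        raise ValueError("load and duration must be non-negative")
    return load_kw * outage_minutes / 60


if __name__ == "__main__":
    # 90 minutes of grid outage per year, as quoted in the article.
    print(f"Grid availability: {grid_availability(90):.5%}")
    # Hypothetical 1,000 kW load held for the full 90 minutes.
    print(f"Backup energy: {backup_energy_kwh(1000, 90):.0f} kWh")
```

A load bank test reproduces this kind of sustained load on the generators and UPS before real servers depend on them, so the calculation gives a rough lower bound for how long and how hard the test should run.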