AWS Outage June 10 2021

AWS had a networking and EC2 outage in the eu-central-1 region, which also partially impacted our Proficloud services from June 10, ~21:00 UTC until June 11, ~03:40 UTC.

AWS has resolved the issues, and Proficloud is operating normally again.

PLCnext controls may fail to reconnect and may require a reboot. This is caused by a firmware issue that is fixed in the next release (2021.6, July 2nd).

AWS's explanation, from https://status.aws.amazon.com/#EU_block:

"The root cause of this issue was a failure of a control system which disabled multiple air handlers in the affected Availability Zone. These air handlers move cool air to the servers and equipment, and when they were disabled, ambient temperatures began to rise. Servers and networking equipment in the affected Availability Zone began to power-off when unsafe temperatures were reached. Unfortunately, because this issue impacted several redundant network switches, a larger number of EC2 instances in this single Availability Zone lost network connectivity. While our operators would normally had been able to restore cooling before impact, a fire suppression system activated inside a section of the affected Availability Zone. When this system activates, the data center is evacuated and sealed, and a chemical is dispersed to remove oxygen from the air to extinguish any fire. In order to recover the impacted instances and network equipment, we needed to wait until the fire department was able to inspect the facility. After the fire department determined that there was no fire in the data center and it was safe to return, the building needed to be re-oxygenated before it was safe for engineers to enter the facility and restore the affected networking gear and servers. The fire suppression system that activated remains disabled. This system is designed to require smoke to activate and should not have discharged. This system will remain inactive until we are able to determine what triggered it improperly. In the meantime, alternate fire suppression measures are being used to protect the data center. Once cooling was restored and the servers and network equipment was re-powered, affected instances recovered quickly."
