Amazon explains this week’s AWS cloud computing service outage


On Tuesday, one of the team members executed a command that was meant to take a few of the S3 servers offline. One part of that command was entered incorrectly, possibly because of a typo. As a result, a larger number of servers than intended was taken offline, including some that supported important systems for the whole East Coast region. A full restart was required, and it took a long time because of the volume of data that had to be processed. “S3 has experienced massive growth over the last several years and the process of restarting these services and running the necessary safety checks to validate the integrity of the metadata took longer than expected,” the company explained. The entire system was back up and running only after four hours and 17 minutes. Lesson learned. “We are making several changes as a result of this operational event.” To prevent this from happening again, part of the software involved was rewritten. Amazon apologized to its customers for the event: “We will do everything we can to learn from this event and use it to improve our availability even further.”

