On Tuesday, February 28th, 2017, Amazon Web Services (AWS) had a service disruption that affected its Simple Storage Service (S3), which supports over 150,000 websites.
According to NPR and the Wall Street Journal, S&P 500 companies lost an estimated $150M during the outage. In addition, 54 of the internet’s top 100 retailers saw website performance slow by 20% or more.
Amazon explained that the outage happened while an engineer was debugging AWS’ billing system. The engineer was supposed to take a handful of servers offline to inspect them, but a small typo took far more servers offline than intended. To fix the problem, they had to force a restart of the system, which only took a few hours… just over four hours, to be exact.
In those four hours, the internet freaked out.
Some laughed at the irony…
While others seemed to be a bit more shook up…
Some even stated it “destroyed the internet”. It was scary to see how little control people seemed to have over their cloud applications. The most they could do was wait it out and try not to panic.
The truth: It can happen to anyone
Whether you are a cloud giant or a new startup, things can happen. All things considered, Amazon came back pretty quickly. A developer who has just received Series A or B funding isn’t going to recover as quickly as one of the top cloud service providers.
Be smart: Don’t put all your eggs in one basket
Experts say that smart companies won’t put all their eggs in one basket and should make sure they are doing more than just investing in the provider’s redundancy plan. This matters especially because many cloud providers, like AWS, run interdependent systems with a domino effect, where one issue leads to another.
In many cases, smaller developers can’t afford to have an outage. In fact, an outage could put them out of business entirely. If that happens, a redundancy plan isn’t going to be useful, since users won’t be able to access anything once the cloud provider goes dark.
Consider a real example. Startup Ziggeo had a major release on the day of the AWS outage, and things did not exactly go as planned for them. They have shared the lessons they learned, including re-architecting their system for more redundancy.
Take back control: Bring the power back to your hands
If your cloud application is critical, waiting around for the problem to get fixed is not something you can afford to do, especially if the service isn’t coming back. If your cloud provider were to have a disruption, would you be running smoothly, or would you be in the fetal position?
Iron Mountain can help you be prepared for when the unexpected hits. Our range of SaaSProtect® solutions gives you different options depending on how quickly you need to be up and running.