bgentry 2 hours ago
The important quote from the timeline:
Mar 01 9:41 AM PST
We want to provide some additional information on the power issue in a single Availability Zone in the ME-CENTRAL-1 Region. At around 4:30 AM PST, one of our Availability Zones (mec1-az2) was impacted by objects that struck the data center, creating sparks and fire. The fire department shut off power to the facility and generators as they worked to put out the fire. We are still awaiting permission to turn the power back on, and once we have, we will ensure we restore power and connectivity safely. It will take several hours to restore connectivity to the impacted AZ. The other AZs in the region are functioning normally.
jiggawatts 19 minutes ago
This reminds me of a visit to an Equinix data centre where the sales person was droning on and on about how incredibly reliable their power supplies were, how uninterruptible everything was, etc, etc…
Essentially, he was trying to assure us that no-no-no, we don’t need multiple zones like the public clouds, they can instead guarantee 100% uninterrupted power under all circumstances.
A bit bored and annoyed, I pointed to the giant red button conspicuously placed in the middle of a pillar and asked what it is for.
“Oh, that’s in case there’s a fire!”
“What does it do?”
“It cuts… the power… uhh… for the safety of the fire department.”
“So… if there’s a wisp of smoke in a corner somewhere, the fireys turn up, the first thing they do is… cut the power?”
“… yes.”
“Not 100% then, is it?”
Imustaskforhelp an hour ago
> we will ensure we restore power and connectivity safely
This would require human intervention, and I am a bit worried that the strike could happen again and human lives might be lost.
IIRC there have been cases in history where the same location was targeted across multiple days. Obviously, AWS might have local employees working in the region, but would the relevant team in AWS evaluate this threat itself? What if they try to bring the service back, missiles strike again, and human lives are lost? Let's just hope that this is part of the evaluation as well.
tokyobreakfast 24 minutes ago
> this would require human intervention
that's the difference between heroes and ordinary employees who bitch about having to go into the office twice a month.
same as the stories you hear of guys taking snow-cats up a mountain in a blizzard to restore phone circuits or radio transmitters gone offline.
flymasterv 15 minutes ago
Man, don’t be a “hero” trying to restore a lower ping to someone trying to buy a kindle in Jeddah.
ok_dad 11 minutes ago
What about local hospitals which may have service from that data center? There are heroes needed everywhere, all the time.
thatguy0900 11 minutes ago
I'm sure bezos will be really happy someone is being a hero for him in a war zone while he sails his newest yacht to wherever the new version of the island is.
p-o 2 hours ago
An interesting adjacent theory is how much datacenters are becoming military targets to strike as part of disrupting initial defenses. It doesn't seem that was the case in this instance, but I could see them becoming more important targets in the future.
Seems like it should be somewhat easier to bomb 50 datacenters than it would be to hack and disrupt 1000s of different services.
Again, this is just me thinking out loud on a tangent and this doesn't have much to do with this story, but I felt it was an interesting thought to share nonetheless.
swiftcoder an hour ago
The more interesting question is how many datacenters are just plonked next to a high-value military target?
For infrastructure reasons, we plonk datacenters down next to airports big enough to fly major hardware into, and near where the big oceanic cables come ashore… and for strategic reasons those are also the perfect places to place military bases
throw47578739 minutes ago
Is there actually some meaningful physical separation between military and civilian server deployments?
We seem to be really bad at separating those two. For example Starlink is basically military infrastructure now, used to guide bombs.
cherryteastain 18 minutes ago
A datacenter IS a high value military target.
roncesvalles an hour ago
Exactly. 2 is only sufficient for HA against random failures. It's not enough for HA against a determined adversary willing to use targeted force.
tbrownaw 2 hours ago
> Seems like it should be somewhat easier to nuke 50 datacenters than it would be to hack and disrupt 1000s of different services.
Previous outage news makes it sound like the cloud providers still have quite a few logical single points of failure.
Zeyka 2 hours ago
That's so interesting. Are any of the US military (or other satellite state of the US) systems running in "normal" datacenters or do they have a few protected DoD datacenters in the US?
Imustaskforhelp an hour ago
Found this relevant article: https://serverlift.com/blog/military-modular-data-centers/ (AWS Military Modular Data centers)
I do think, though, that at least from the earlier Anthropic decision, we know that Anthropic, which was used by the DoD, should be on normal AWS datacenters.
I am saying this because the DoD threatened to forcibly take Anthropic's source code if they didn't agree to egregious demands, which means that they don't have the source code.
Perhaps the DoD used Anthropic within AWS military modular DCs, but I find it extremely unlikely.
I am almost certain that even OpenAI, which bent its knee to the DoD, is still hosted on regular infrastructure, and the DoD is using these AI models on pretty sensitive tasks (during the capture of Venezuela's Maduro, Anthropic/Claude was used, IIRC, to handle some data analysis).
IMO, though, any employee from Anthropic/OpenAI might know better how these models are deployed.
roxolotl an hour ago
This is the data center version of https://xkcd.com/538/. Realistically if there is a hot war what you’re saying seems accurate.
Imustaskforhelp 2 hours ago
> Seems like it should be somewhat easier to nuke 50 datacenters than it would be to hack and disrupt 1000s of different services.
The bigger worry to me is that if someone nukes 50 datacenters all at once, or say all of Amazon's datacenters at once, then the data stored there would simply be gone, given that so many datacenters are located in Virginia, USA, IIRC, and that so many companies rely on a few datacenter providers.
The larger threat to me with the loss of data is firstly the panic within public-facing services, but also hedge funds, pension funds, or banks that might be using these datacenters; if they lose the data, it's going to cause even more public mayhem.
Some might say that off-site backups exist, but there has been at least one instance where a single Google accident led to massive issues for a $135 billion pension fund.
Relevant Kevin Faang video about it: https://www.youtube.com/watch?v=3GOAUyipnM4 [Google Accidentally Deletes $135 Billion Pension Fund, Chaos Ensues]
jcgrillo 2 hours ago
IIUC part of the reason ballistic missiles have multiple warheads is that some of them detonate high up to knock out air defenses and other electronics, allowing the rest to fall through to their targets. The last time we tried this experiment as a species was the Starfish Prime test in 1962, which caused some electrical havoc in Hawaii. These days our systems are probably more delicate and sensitive? All that is to say, in a scenario where nukes are going off, I'm not sure you'd even need to target any datacenters in particular... they're probably all toast by default.
ejdyksen 2 hours ago
Just one AZ, not the whole region:
> The other AZs in the region are functioning normally. Customers who were running their applications redundantly across the AZs are not impacted by this event.
anonu 2 hours ago
We have business in UAE. For whatever reason I defaulted to us-west-2 since these particular applications are not latency sensitive.
boxedemp 2 hours ago
Amazon usually has 3 AZs per region; it looks like there are surviving AZs, but the system didn't switch over gracefully.
I bet that was an interesting sev2 ticket!
easton 2 hours ago
It depends on the service if things move gracefully or not. The incident explains it's only EC2 (and dependent services) in that AZ, so if they try to route traffic for services hosted on EC2 to that AZ it's not working (and customers running instances in that AZ have lost access).
The other ones are not impacted. They always like to tell you to pay for more than one instance in different AZs so if this happens you don't get impacted.
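To make the multi-AZ point concrete, here is a minimal sketch of routing around a failed zone. It is not how AWS load balancers work internally; the `Backend` type, the `pick_backend` helper, the addresses, and the sibling zone IDs `mec1-az1` and `mec1-az3` are all hypothetical (only `mec1-az2` appears in the status update).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Backend:
    az: str        # Availability Zone ID the instance runs in
    address: str   # hypothetical private address


def healthy_backends(backends, down_azs):
    """Return only the backends whose AZ is not marked as down."""
    return [b for b in backends if b.az not in down_azs]


def pick_backend(backends, down_azs, request_id):
    """Spread requests round-robin over backends in healthy AZs."""
    candidates = healthy_backends(backends, down_azs)
    if not candidates:
        # A single-AZ deployment ends up here when its only zone fails.
        raise RuntimeError("no healthy backends in any AZ")
    return candidates[request_id % len(candidates)]


# Hypothetical deployment: one instance per zone.
BACKENDS = [
    Backend("mec1-az1", "10.0.1.10"),
    Backend("mec1-az2", "10.0.2.10"),
    Backend("mec1-az3", "10.0.3.10"),
]

# With mec1-az2 marked down, traffic only lands in the other two zones.
chosen = pick_backend(BACKENDS, {"mec1-az2"}, request_id=7)
```

With only one instance, in the impacted zone, `pick_backend` has nothing left to choose from, which is the case the "pay for more than one instance" advice is meant to avoid.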
Shank 2 hours ago
I wonder if this was a bad targeting job or intentional. I appreciate the transparency and optimism in the status updates though!
sb057 2 hours ago
Looking at Google Maps, there's Al Dhafra Air Base a couple of miles to the datacenter's south, an oil refinery a bit to the east, ports to the north, and a military academy to the west.
eptcyka 2 hours ago
> one of our Availability Zones (mec1-az2) was impacted by objects that struck the data center, creating sparks and fire.
God forbid we'd ever say that it was struck by a missile or a munition in an act of war.
NikolaNovak 2 hours ago
That's what I'm trying to understand too. Is this a meteor, tree, etc.? Or a human-made object, and if so, an accidental or intentional one? Further risk assessment would depend on the root cause.
rrvidqdi an hour ago
The earliest phrasing I saw internally was "Root cause is identified as a drone attack to DXB61 site". That's somewhat open to interpretation, and could also have simply been incorrect. It was scrubbed from the ticket, though, and it now merely vaguely gestures toward a "power event". The ticket I'd expect to have further detail was locked down.
hdgvhicv 2 hours ago
Maybe a missile, maybe a drone, maybe debris
Doesn’t really matter, we know Trump’s latest war is the cause
arjie an hour ago
I actually like the way they said it. I don't know if it's a different cultural tradition, but the cool steely-eyed fact-based conversation always really felt so much more inspiring:
Conrad: I got three fuel cell lights, an AC bus light, a fuel cell disconnect, AC bus overload 1 and 2, Main Bus A and B out.
Aaron: Flight, EECOM. Try SCE to Aux.
Modern culture in the movies and whatnot is that someone should be yelling "Everything's failing. Give me something, Houston. All lights are on! MAYDAY MAYDAY!" and some sort of flavour commentary like that. But reading engineering updates that go like this feels like watching maximal professionalism under fire:
> At around 4:30 AM PST, one of our Availability Zones (mec1-az2) was impacted by objects that struck the data center, creating sparks and fire. The fire department shut off power to the facility and generators as they worked to put out the fire. We are still awaiting permission to turn the power back on, and once we have, we will ensure we restore power and connectivity safely. It will take several hours to restore connectivity to the impacted AZ. The other AZs in the region are functioning normally. Customers who were running their applications redundantly across the AZs are not impacted by this event. EC2 Instance launches will continue to be impaired in the impacted AZ. We recommend that customers continue to retry any failed API requests. If immediate recovery of an affected resource (EC2 Instance, EBS Volume, RDS DB Instance, etc.) is required, we recommend restoring from your most recent backup, by launching replacement resources in one of the unaffected zones, or an alternate AWS Region. We will provide an update by 12:30 PM PST, or sooner if we have additional information to share.
This has that same mechanical tone of an ice-cold captain dealing with a proximate situation providing exactly the information they know. No flavour commentary. Amazing. I fucking love it.
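For anyone wondering what "continue to retry any failed API requests" looks like in practice, here is a rough sketch of retrying with exponential backoff and full jitter. The `retry_with_backoff` helper and its parameters are made up for illustration and are not part of any AWS SDK; the official SDKs ship their own retry logic.

```python
import random
import time


def retry_with_backoff(call, max_attempts=5, base_delay=0.5,
                       max_delay=30.0, sleep=time.sleep):
    """Call `call()` until it succeeds or `max_attempts` is reached.

    Waits a random time between 0 and an exponentially growing cap
    ("full jitter") so that many clients do not retry in lockstep.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return call()
        except Exception:
            # Illustration only: real code should catch just the
            # errors that are known to be retryable.
            if attempt == max_attempts:
                raise
            cap = min(max_delay, base_delay * 2 ** (attempt - 1))
            sleep(random.uniform(0, cap))


# Hypothetical flaky operation: fails twice, then succeeds.
attempts = {"n": 0}


def flaky_launch():
    attempts["n"] += 1
    if attempts["n"] < 3:
        raise ConnectionError("launch impaired")
    return "launched"


waits = []  # record the delays instead of actually sleeping
result = retry_with_backoff(flaky_launch, sleep=waits.append)
```

Retrying only helps for requests that can succeed somewhere; launches pinned to the impacted AZ will keep failing until power is back, which is why the update also points at the unaffected zones.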
jcgrillo an hour ago
High signal/noise ratio is extra important when things are going badly
bigyabai 2 hours ago
Potential moon bear attack; we're waiting on satellite imagery to confirm it. https://youtu.be/pvjgIxuVdo4?t=96
potatoproduct 2 hours ago
We are living in increasingly weirder times.
astrange 2 hours ago
A factory not working because of a missile strike seems pretty classic actually.
guerrilla 2 hours ago
Sure, but it's not a factory.
astrange 2 hours ago
It's a big building with a lot of capital assets inside that are the means of production for a business…
mediaman 2 hours ago
Why not? It's a physical building with lots of equipment that produces products shipped to its customers.
Its products are sequences of electrons, instead of atoms. But so are power plants. And in the context of what happens when they're hit by missiles, a factory, data center, and power plant all behave the same.
debo_ 2 hours ago
When I first learned that there were AWS Middle East regions, my first thought was "wow they are more optimistic than I am."
toast0 2 hours ago
Google Cloud also has Middle East locations, as do Azure, Oracle, and Alibaba. AFAIK, IBM Cloud does not. I think those five and AWS are the top 6 global public access clouds.
Cyph0n 2 hours ago
No, they are more aware of the customer demand for compute in the region.
alexfoo 2 hours ago
And demand for data sovereignty.
Cyph0n 2 hours ago
Absolutely, especially in the KSA.
dgxyz 2 hours ago
Not really. It's just been pretty damn quiet for years.
buttermeup 2 hours ago
[dead]
Trasmatta 2 hours ago
Is this the one in Bahrain?
the_mitsuhiko 2 hours ago
UAE. Abu Dhabi
Imustaskforhelp 2 hours ago
Has this ever happened before in the history of cloud providers because of war?
They mention that the datacenter had fires and sparks, and they mention hours of downtime, but given the situation, how does that prevent the situation from happening again? It's best for people to use safer regions than the Middle East at the moment, as missiles might target the same datacenter, seeing that some damage was caused.
Moving forward, will there be a demand (albeit small) for nuclear-bunker-esque datacenters which can withstand missiles? I know absolutely nothing about constructing underground, but can explosives not be used to create underground datacenters comparatively cheaply? One could also use revamped nuclear bunkers (although the scale of AWS datacenters might be huge, who knows).
Had some ideas which show that this idea might be interesting, https://www.nature.com/articles/s44284-026-00406-2
I am curious what safety measures are taken by Internet exchange providers or (I had to search it up) submarine cable landing stations; to me it feels like blowing these up would lead to internet downtime across a whole country or between providers.
crote a minute ago
> will there be a demand (albeit small) for nuclear-bunker-esque datacenters
Those already exist. See for example Bahnhof's "Pionen - White Mountain" data center in Stockholm, or Cyberfort's "The Bunker" a bit west of London.
toast0 an hour ago
Historically in the US, some portion of Bell installations were designed to be resistant to attack. But it comes at large expense for construction and maintenance. Underground facilities also bring increased risk of flooding.
Competition, deregulation, and a lack of attacks lead towards less robust installations to reduce costs. Geographically redundant installations help as long as not all installations are targeted, and they are valuable for operational concerns other than just attacks.
tbrownaw an hour ago
> but given the situation, How does that prevent the situation from happening again
You don't. Instead, you make sure your failover or DR setup is regularly tested and works.
SoftTalker an hour ago
Data centers are usually built to withstand local natural risks e.g. weather. All bets, SLAs, and insurance are usually off when it comes to acts of war.
general1465 2 hours ago
In Southern Europe some smaller web servers are intermittently not working, while big servers like YouTube are working fine. But I don't think it is related.