Chaos Engineering Stories

By Laurent Domb

This repository provides various public stories on how organizations approach and implement Chaos Engineering. If you have a story to share, please create a pull request!

Financial Services

Ally
Chaos Testing improving system resilience
Apollo / Yahoo
Chaos engineering in a production environment practiced by Yahoo!
Bloomberg
Chaos Engineering: A 5 year Retrospective
Chaos Engineering with PowerfulSeal
Supercharge your SRE teams with Chaos Engineering
Capital One
How Capital One performs Chaos Engineering in Production
AWS re:Invent 2022 - Benefiting from chaos engineering at Capital One (PRT026)
Continuous Chaos — Introducing Chaos Engineering into DevOps Practices
5 Steps to Getting Your App Chaos Ready
Embrace the Chaos Engineering
3 Lessons learned from implementing Chaos Engineering
DBS Bank Limited
How DBS dispelled the myths of Chaos Engineering
How DBS dispelled the myths of Chaos Engineering Video
DTCC
The Power of technology resilience
Resilience White Paper
Fidelity
Fidelity Investments: Building for mission-critical resilience
Fidelity reveals “chaos buffet” approach, says 7,700 apps now on public cloud
Goldman Sachs
Chaos Testing an Application on AWS
Beyond the Monkeys Chaos Engineering on the Cloud - Bella Wiseman & Sindhuja Durai, Goldman Sachs
Intuit
Intuit Runs Gameday Simulations to Test Resilience of Critical Business Systems and Apps at Scale
Testing reliability of our next-gen platform on Kubernetes Keiko
Automating Resiliency: How To Remain Calm In The Midst Of Chaos
Itau Brazil
Chaos Engineering and Observability
A real-world resilience evolution in the cloud framework (ARC309)
JPMC
Don’t ignore Chaos testing!
Chaos engineering @ J.P. Morgan Chase - Garima Singh & Deepak Sarda
kallisti
Kount / Equifax
Chaos Engineering at Kount
London Stock Exchange Group
London Stock Exchange Group uses chaos engineering on AWS to improve resilience
Nationwide Building Society
Automating and Scaling Chaos Engineering using AWS Fault Injection Simulator
National Australia Bank
Observability in the realm of Chaos Engineering
Pismo
How Pismo adopted Chaos Engineering
Rabobank
Unleash that Chaos
Santander Bank
The art of introducing intentional failures
Starling Bank
The Abyss of Ignorable: a Route into Chaos Testing from Starling Bank
Stripe
How Our Security Requirements Turned Us into Accidental Chaos Engineers
TD Bank
Continuous Stability Engineering - Minimizing Chaos
Continuous Stability Engineering Video
Vanguard
Cloudy with a Chance of Chaos
How Chaos Engineering works at Vanguard
ZaloPay
Chaos Engineering at ZaloPay

Health Care

Main Line Health
Main Line Health deploys chaos engineering to bolster healthcare resilience
Cerner
Applying Chaos Engineering in Healthcare: Getting Started with Sensitive Workloads
Building confidence in healthcare through chaos engineering
Cardinal Health
Security Chaos Engineering From Theory to Practice
HaloDoc
Applying Chaos Engineering @HaloDoc

Insurance

Express Scripts
Chaos and Resilience Engineering - My Journey
State Farm
Embracing Chaos

Media and Entertainment

Audible
Chaos Engineering and Scalability at Audible.com
Amazon Prime
Practice like you play: How Amazon scales resilience to new heights
From Chaos To Resilience
Building resilient systems at Amazon Prime Video
Disney + Hotstar
Disney + Hotstar and their tale with scalability
Condé Nast
How Condé Nast practices Chaos Engineering - Growing Resilience: Serving Half a Billion Users Monthly at Condé Nast
How Condé Nast Succeeds by Building a Culture that Embraces Failure
Netflix
Chaos Engineering Blog
Slack
Disasterpiece Theater: Slack’s process for approachable Chaos Engineering
Twitch
Chaos Engineering at Twitch

News

Financial Times
A Chaos test too far

Networking & Telecommunications

Cisco
Adoption of Chaos Engineering in Webex Contact Center
Orange
Validating the resiliency of Orange’s private telco.
T-Mobile
The Start of Chaos Engineering at T-Mobile
Monarch: App-level Chaos Engineering
Twilio
Chaos Engineering at Twilio
Discovering Issues with HTTP/2 via Chaos Testing
Chaos Gamedays

Cloud Providers

Amazon Web Services
AWS Lambda - Resilience-under-the-hood
How Amazon.com Search Uses Chaos Engineering to Handle Over 84K Requests Per Second
Any Day Can Be Prime Day: How Amazon.com Search Uses Chaos Engineering to Handle Over 84K Requests Per Second

Solution Providers

Honeycomb.io
Deploy on Friday? How about destroy on Friday!

Retail

Etsy
Fault Injection in Production

REI Co-op
REI’s Checkout microservice - An introduction to Chaos Engineering at REI

Target
Ɔhaos Ǝnginǝǝring @ Target - Part 1
Ɔhaos Ǝnginǝǝring @ Target - Part 2
Walmart
Chaos Engineering- Chaos Toolkit- S1E2- Pod Kill
Chaos Engineering- Chaos Toolkit- S1E3- Toxiproxy
Chaos Engineering- Chaos Toolkit- S1E4- Chaos Monkey
Charting a path to software resiliency
Resiliency Doctor — A tool to achieve resiliency in hybrid cloud application ecosystems

E-commerce

eBay
eBay explores chaos fault testing at the application level

Transport / Airlines

Alaska Airlines
When Observability is Good for Chaos
From Chaos to Optimization: Alaska Airlines’ Observability Journey
Lyft
Chaos Experimentation
Uber
Chaos Engineering @ Uber

Human Resources

Lifion by ADP
Chaos is good in tech

Social Media

LinkedIn
A Request-Level Failure Injection Framework

Food Delivery

DoorDash
Using Fault Injection Testing to Improve DoorDash Reliability
iFood
Chaos Engineering @ iFood
Just Eat
Causing Chaos

Hospitality

Expedia
Chaos Engineering at Scale
Automating Chaos Attacks
How to perform optimized regression tests
Chaos Engineering at Expedia
VRBO, Hotels.com
London Chaos and Resilience Engineering Community

Printing/Graphics/Publishing

Adobe
How we were building resilience in Adobe Experience Platform

Gaming

Interactive Entertainment Group
Secure online gaming through chaos engineering

NetEase Fuxi AI
How NetEase Fuxi AI Lab is using chaos engineering to improve testing

Riot Games
Controlled Chaos with fault injection testing

ERP

SAP
Security Chaos Engineering and Security Engineering Amid Chaos: Cloud-native Cyber Resilience
Handling chaos and complexity when building and running enterprise software systems

Fashion

Under Armour
Breaking things on purpose
Measuring Chaos

Higher Education

Pearson
How Pearson improves its resilience with AWS Fault Injection Service

Government use of Chaos Engineering

DoD
DoD Enterprise DevSecOps Community of Practice
DoD MIT Chaos Engineering

Air Force
How Chaos Engineering Transforms Cybersecurity for the Air Force
Chaos Engineering US-AirForce
Ministry of Justice UK
Cloud Platform Chaos