https://netflixtechblog.com/
All services are available in every region, so no request has to cross regions
Users are directed to a region by location rather than by latency, to maintain control over routing
Service to direct request: Zuul
Reject a request with an error up front when downstream services won't be able to handle it. The threshold is dynamic and changes with scaling.
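A minimal sketch of the up-front rejection idea, not Netflix's or Zuul's actual implementation. The class name LoadShedder and the per_instance_capacity figure are hypothetical. The admission limit is derived from the current instance count, so it moves with scaling.

```python
from dataclasses import dataclass


@dataclass
class LoadShedder:
    """Rejects requests up front once in-flight work reaches the limit."""
    per_instance_capacity: int  # hypothetical: concurrent requests one instance can take
    instances: int              # current number of downstream instances
    in_flight: int = 0

    @property
    def limit(self) -> int:
        # The limit follows the instance count, so scaling out raises it.
        return self.per_instance_capacity * self.instances

    def try_acquire(self) -> bool:
        """Return True if the request is admitted, False if it is shed."""
        if self.in_flight >= self.limit:
            return False
        self.in_flight += 1
        return True

    def release(self) -> None:
        """Call when an admitted request finishes."""
        self.in_flight = max(0, self.in_flight - 1)


shedder = LoadShedder(per_instance_capacity=2, instances=1)
results = [shedder.try_acquire() for _ in range(3)]  # third request is shed
shedder.instances = 2                                # downstream scaled out
after_scaling = shedder.try_acquire()                # admitted under the new limit
```

A shed request fails fast at the edge instead of queueing behind a service that is already saturated.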
Test abilities of new services with production load
Writes need to tell all caches to invalidate. Can a new request happen before the data is synced?
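A toy simulation of the question above, under simplifying assumptions: a single dict stands in for the replicated store (so store replication is instantaneous here), and a queue stands in for cross-region invalidation messages. The region names and the RegionCache class are hypothetical. It shows that a read arriving before the invalidation is delivered can return stale data.

```python
from collections import deque


class RegionCache:
    """Cache-aside cache for one region."""

    def __init__(self, name):
        self.name = name
        self.entries = {}

    def get(self, key, store):
        # Fill from the store on a miss, then serve from the cache.
        if key not in self.entries:
            self.entries[key] = store[key]
        return self.entries[key]

    def invalidate(self, key):
        self.entries.pop(key, None)


store = {"profile:1": "v1"}   # stand-in for the replicated data store
caches = [RegionCache("us-east"), RegionCache("eu-west")]
pending = deque()             # invalidation messages not yet delivered

# Both regions have cached the old value.
for cache in caches:
    cache.get("profile:1", store)

# A write lands in us-east: update the store, invalidate locally,
# and queue an invalidation for the other region.
store["profile:1"] = "v2"
caches[0].invalidate("profile:1")
pending.append((caches[1], "profile:1"))

stale_read = caches[1].get("profile:1", store)  # read before delivery

while pending:                                  # deliver the invalidations
    cache, key = pending.popleft()
    cache.invalidate(key)

fresh_read = caches[1].get("profile:1", store)  # read after delivery
```

In this sketch the answer is yes: the window between the write and the delivery of the invalidation is exactly where the stale read happens.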
Tools to deploy the same code to multiple regions
Multiple levels of chaos testing: service, availability zone, region, and the connection between regions. Some levels will cause total loss of service
Automatic failover
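One way to picture location-based routing combined with automatic failover, as a sketch only. The tables HOME_REGION and FAILOVER, the region names, and pick_region are all hypothetical and not taken from the blog.

```python
# Hypothetical mapping from user location to home region.
HOME_REGION = {"US": "us-east-1", "DE": "eu-west-1", "FR": "eu-west-1"}

# Hypothetical failover target for each region.
FAILOVER = {
    "us-east-1": "us-west-2",
    "us-west-2": "us-east-1",
    "eu-west-1": "us-east-1",
}

DEFAULT_REGION = "us-east-1"


def pick_region(country, healthy):
    """Route by location; follow FAILOVER while the chosen region is down."""
    region = HOME_REGION.get(country, DEFAULT_REGION)
    seen = set()
    while region not in healthy:
        if region in seen:
            # Every region on the failover chain has been tried.
            raise RuntimeError("no healthy region available")
        seen.add(region)
        region = FAILOVER[region]
    return region


all_up = {"us-east-1", "us-west-2", "eu-west-1"}
normal = pick_region("DE", all_up)
failed_over = pick_region("DE", all_up - {"eu-west-1"})
```

Because the mapping is an explicit table rather than a latency measurement, the operator decides where each user's traffic goes, both normally and during a failover.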
Even though this setup was described 9 years ago, an architecture like it is still rare.
Does the kind of data Netflix changes make this easier for them? If a customer gets stale data, it probably doesn't have a large impact.