Systems Recipes - Issue #4
Hello there 👋 We hope y'all are keeping safe. Things were a little rough and hence we had to take a break, but now we’re back with our regular dose of architecture goodies 😊
If you have read or watched anything interesting lately that you think will be a good fit for future issues or have any feedback, simply hit reply. You can also reach out on Twitter or send an email! 📧
Let’s Encrypt: an automated certificate authority to encrypt the entire web
Let’s Encrypt is on a mission to make certificate generation for websites and applications easier and much more automated. Have you wondered about the nitty-gritty of their architecture? The blog post by The Morning Paper takes us on a quick tour if you don’t want to read through the entire paper.
This paper tells the story of Let’s Encrypt, from its early beginnings in 2012/13 all the way to becoming the world’s largest HTTPS Certificate Authority (CA) today – accounting for more currently valid certificates than all other browser-trusted CAs combined. Beyond the functionality that Let’s Encrypt provides, the story stands out to me for two key ingredients. Firstly, whereas normally we trade off between security and ease-of-use, Let’s Encrypt made the web more secure through ease-of-use. Secondly, Let’s Encrypt managed to find a sustainable funding model for a combination of an open source project and free online service, as compared to the more normal pattern which sadly seems to involve running a small number of beneficent maintainers into the ground.
Exactly-once delivery is a controversial and vehemently debated topic within the distributed systems community, because it requires you to consider every possible failure case, which is non-trivial in any system of a reasonable size.
In this post, Segment outlines how they use Kafka and RocksDB to de-duplicate messages in their data pipelines and get as close to exactly-once delivery as they can.
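To make the de-duplication idea a bit more concrete, here is a minimal sketch of dropping repeated messages by ID within a time window. It is not Segment's implementation: an in-memory `OrderedDict` stands in for RocksDB, there is no Kafka involved, and the names `WindowedDeduplicator`, `accept`, and `window_seconds` are made up for illustration. It also assumes callers pass timestamps that never go backwards.

```python
from collections import OrderedDict


class WindowedDeduplicator:
    """Drops messages whose ID was already seen within the last `window_seconds`.

    Hypothetical sketch: an in-memory OrderedDict stands in for a persistent
    key-value store such as RocksDB.
    """

    def __init__(self, window_seconds: float) -> None:
        self.window_seconds = window_seconds
        # message_id -> time it was first seen. Insertion order matches time
        # order as long as `now` never decreases between calls.
        self._seen: "OrderedDict[str, float]" = OrderedDict()

    def _expire(self, now: float) -> None:
        # Evict IDs that have aged out of the window, oldest first.
        while self._seen:
            _, seen_at = next(iter(self._seen.items()))
            if now - seen_at < self.window_seconds:
                break
            self._seen.popitem(last=False)

    def accept(self, message_id: str, now: float) -> bool:
        """Return True if the message is new and should be processed."""
        self._expire(now)
        if message_id in self._seen:
            return False  # duplicate inside the window, drop it
        self._seen[message_id] = now
        return True


dedup = WindowedDeduplicator(window_seconds=60)
results = [
    dedup.accept("msg-a", now=0),   # first sighting
    dedup.accept("msg-a", now=10),  # retry inside the window
    dedup.accept("msg-b", now=20),  # different ID
    dedup.accept("msg-a", now=61),  # original entry has expired
]
```

The window is the catch: a duplicate that arrives after its ID has been evicted gets processed again, which is one reason this kind of approach gets you close to exactly-once rather than all the way there.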
Three months, 30x demand: How we scaled Google Meet during COVID-19
This is a really interesting read about how Google scaled Meet to meet the additional demand imposed by the Coronavirus crisis in a remote-first world. They kicked off by declaring a pre-emptive incident and coordinated a response to make sure there was enough capacity.
Capacity planning can sometimes be a game of guesswork. No matter what happens, you want levers to pull to avoid issues and outages. Some of the team worked on adding toggles that could be flipped in an emergency (such as downscaling video quality if bandwidth was saturated in a region).
Learn how Google Cloud ramped up to handle high demand for Google Meet in light of COVID-19
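As a rough illustration of the emergency toggles mentioned above, here is a minimal sketch of a per-region switch that lowers video quality. Everything in it is hypothetical: the names `EmergencyToggles` and `max_video_height`, the region strings, and the 720/360 values are assumptions for the example, not anything Google has described.

```python
from dataclasses import dataclass, field
from typing import Set


@dataclass
class EmergencyToggles:
    """Hypothetical set of regions where video quality is currently degraded."""

    degraded_regions: Set[str] = field(default_factory=set)

    def degrade(self, region: str) -> None:
        # Flip the toggle on for one region.
        self.degraded_regions.add(region)

    def restore(self, region: str) -> None:
        # Flip the toggle off; a no-op if the region was never degraded.
        self.degraded_regions.discard(region)


def max_video_height(region: str, toggles: EmergencyToggles,
                     normal: int = 720, degraded: int = 360) -> int:
    """Return the resolution cap for a region (values are illustrative)."""
    return degraded if region in toggles.degraded_regions else normal


toggles = EmergencyToggles()
before = max_video_height("region-1", toggles)
toggles.degrade("region-1")
during = max_video_height("region-1", toggles)
other = max_video_height("region-2", toggles)
toggles.restore("region-1")
after = max_video_height("region-1", toggles)
```

The point of a lever like this is that it is cheap to flip and cheap to undo, so it can be used during an incident without a deploy.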
Mesh: a cloud native service mesh for the rest of us
Another day, another service mesh offering. Have you gotten a chance to check out the newest kid on the block in the service mesh world? 👀