As data sets grow larger and applications demand faster access, we sometimes need to make that data readily available. Depending on the need, we may store, or cache, that data in different ways.

Today, I want to bring you, my readers, together and talk about the concept of extremely fast data access: using caches to back high-traffic APIs and message consumers/producers… to make cash. (Get it? Cache? Cash? Yeah, I did that.)

A primary reason to set up caching outside of your database is to reduce the load on your database engine. While scaling is easier than ever in the cloud, it still costs money. Even if you’re using open source databases, you’ll still pay for compute and storage. Caching helps reduce that load, saving cash right off the bat. (See how I tied that in there?)

At Capital One I lead a large group of engineers across an enterprise program that delivers customers’ digital messages. That’s a fancy way of saying email, SMS, and push notifications to the major phone platforms. We send an amazing number of messages every day on behalf of a wide range of internal applications. My guidance across the board is, “Cache. Cache all the things.” We process a lot of data, and we need access to that data to be as fast as possible. Since we have so many applications and data sources, we have different patterns and different data available depending on the process that triggers a message.

Let’s walk through a few examples of caching architectures that help in building blazing fast message bus-based code or extremely responsive APIs. But first, the key to scaling a data processing engine is to set up queues and pass data asynchronously between applications. The big distinction of the asynchronous pattern is that it lets us scale in a very different way than if things were synchronous, or ‘blocking,’ from start to finish.
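To make the queue idea concrete, here is a minimal sketch of passing messages asynchronously between a producer and a consumer through a queue instead of a direct blocking call. The function and message names are hypothetical, and this uses Python’s standard `asyncio.Queue` purely as an illustration; a real system would use a message bus like Kafka or SQS.

```python
import asyncio

async def producer(queue: asyncio.Queue) -> None:
    """Enqueue work without waiting for the consumer to process it."""
    for i in range(5):
        await queue.put(f"message-{i}")
    await queue.put(None)  # sentinel: no more work

async def consumer(queue: asyncio.Queue) -> list:
    """Drain the queue, processing messages at its own pace."""
    processed = []
    while True:
        msg = await queue.get()
        if msg is None:
            break
        processed.append(msg)
    return processed

async def main() -> list:
    queue = asyncio.Queue()
    # Producer and consumer run concurrently; neither blocks the other.
    _, results = await asyncio.gather(producer(queue), consumer(queue))
    return results

print(asyncio.run(main()))
# ['message-0', 'message-1', 'message-2', 'message-3', 'message-4']
```

Because the producer never waits on the consumer, you can scale each side independently by adding more instances on either end of the queue.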

When your code is blocking, you’ll tend to have to scale both vertically (bigger machines) and horizontally (more machines). When we go asynchronous, with a collection of tiny microservices, we’re able to scale horizontally without as much need to scale vertically. This is a natural fit for many modern patterns, including containers and serverless functions-as-a-service (FaaS, e.g. AWS Lambda).

Follow me here if you want to keep an eye on more details of scaling horizontally with microservices or functions-as-a-service!

The implementation styles below are different ways to build caches. First, we’ll build a local cache for a single application: one application uses its own in-memory cache to hold the data it needs to process. Then we’ll scale a bit further by using an external application, such as Redis, to store our cached data. This can be really useful when scaling a piece of code horizontally. Finally, we’ll look at building a strategically placed, distributed cache that benefits all consuming components.

Let’s dive in.

Local Cache

A local cache is the easiest thing you can build to speed up your application. Simply grab external data, store it locally, and then look it up later using some sort of primary key. This works for applications that are continuously running like a web application or a streaming data platform. There are multiple ways this could work.
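As a sketch of the idea, here is a minimal in-memory cache keyed by a primary key, with a simple time-to-live so stale entries expire. The class and key names are hypothetical, not from the article; a production local cache would also bound its size (e.g. LRU eviction).

```python
import time

class LocalCache:
    """A minimal in-memory cache with per-entry TTL (illustrative sketch)."""

    def __init__(self, ttl_seconds: float = 300):
        self.ttl = ttl_seconds
        self._store = {}  # key -> (value, expires_at)

    def get(self, key):
        entry = self._store.get(key)
        if entry is None:
            return None  # miss: caller falls back to the database
        value, expires_at = entry
        if time.monotonic() > expires_at:
            del self._store[key]  # expired: evict lazily on read
            return None
        return value

    def set(self, key, value):
        self._store[key] = (value, time.monotonic() + self.ttl)

# Usage: check the cache first, hit the database only on a miss.
cache = LocalCache(ttl_seconds=60)
cache.set("user:42", {"email": "someone@example.com"})
print(cache.get("user:42"))  # {'email': 'someone@example.com'}
```

Because the data lives in the process’s own memory, lookups avoid any network hop entirely, which is what makes a local cache the fastest of the options discussed here.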


My previous article, Blazing Fast Data Lookups in a Microservices World, focused on building a blacklist. We were primarily concerned with the existence of data in a set. We weren’t associating data with a key or storing data beyond checking whether or not we had a specific value. All we needed to know was whether the data existed, and based on that we could make a decision.
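That existence-only pattern boils down to set membership. Here is a tiny sketch of the question being asked; the names and sample values are made up for illustration and are not from the previous article.

```python
# Existence check: the only question is "have we seen this value before?"
blocklist = {"bad-domain.com", "spam.example.net"}

def is_blocked(domain: str) -> bool:
    """Membership test: returns True if the domain is in the blocklist."""
    return domain in blocklist

print(is_blocked("bad-domain.com"))    # True
print(is_blocked("good.example.org"))  # False
```

Note there is no value associated with each entry; presence in the set is the entire answer.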

This is a very common use case, but we can add more value in our application by solving other problems. As developers, we would typically associate that kind of ‘question’ with a boolean data type. There are many scenarios where we could use a data structure like the trie we built in the last article.
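For readers who haven’t seen the previous article, here is a minimal trie sketch answering that same boolean existence question. This is my own simplified illustration, not the previous article’s implementation.

```python
class TrieNode:
    """One node per character; terminal marks the end of a stored word."""
    __slots__ = ("children", "terminal")

    def __init__(self):
        self.children = {}
        self.terminal = False

class Trie:
    def __init__(self):
        self.root = TrieNode()

    def insert(self, word: str) -> None:
        node = self.root
        for ch in word:
            node = node.children.setdefault(ch, TrieNode())
        node.terminal = True

    def contains(self, word: str) -> bool:
        """Boolean existence check: walk the characters, fail fast on a miss."""
        node = self.root
        for ch in word:
            node = node.children.get(ch)
            if node is None:
                return False
        return node.terminal

trie = Trie()
trie.insert("cache")
print(trie.contains("cache"))  # True
print(trie.contains("cash"))   # False
```

A lookup costs O(length of the word) regardless of how many entries are stored, which is why a trie works well for large existence sets.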