Multilayer Caching in .NET - Turnerj (aka. James Turner)

Caching is a powerful tool in a programmer's toolbox but it isn't magic. It can help scale an application to a vast number of users or it can be the thing dragging down your application. Layered caching is a technique of stacking different types of cache on top of each other so that each layer plays to its own strengths.

I was first introduced to the idea of multilayered caching by Nick Craver. He wrote a great article about how Stack Overflow does caching, which has a lot of interesting insights - definitely worth checking out if you haven't already. It was his article that inspired me to create Cache Tower, my own multilayered caching solution for .NET with an emphasis on performance.

Using the example he illustrated in his post, our own computers already do multiple layers of caching:

L1/L2 CPU Cache
RAM
SSD/HDD (Pagefile)

The performance profiles of these are drastically different: the CPU caches are the fastest but also hold the least data. This is probably the first important takeaway from caching - it's not just what you cache, it's how you cache it.

There is an interesting case with Cloudflare where they put unpopular items in RAM and more popular items in their SSD storage. They use a multilayered cache system of RAM then SSD. While they have some extremely fast SSDs, it turns out that when you read and write to them at the same time, you can suffer a performance penalty. To avoid that penalty, they realised that keeping unpopular items (items never hit or hit only once) purely in RAM allowed their overall system to perform better. It may not be perfect but they got some interesting results!

Looking at caching from an application's point of view, the layers may look a bit different but the concept is still the same. We move from the fastest layers which have limited space to slower layers which have more space.

In-Memory Cache
Redis/Memcached
Database/File

While it might seem simple enough to implement yourself, there are a few considerations to keep in mind for building a scalable multilayered caching solution.

Keeping Cache Layers Up-to-Date

Scenario: You have multiple instances of an application with their own local caches (in-memory) while also having a shared cache (Redis).

As in a normal caching scenario, you want to avoid cache misses. In multilayered caching, we have two types of cache misses - close misses and complete misses. If your in-memory cache does not have the item but Redis does, this is a close cache miss. If no layer has the item, this is a complete cache miss and you have to go back to the original data source. After a close miss, you will need to propagate the cache result back to your in-memory cache to achieve maximum performance.

You could do this via a background task; however, this wouldn't scale. It would require iterating over all keys of one cache layer and comparing them to the keys in another.

To get the most benefit here, you will want to propagate the item only if you actually need it. This keeps your in-memory cache only as large as it needs to be. Because we have to fetch the item from the shared cache anyway, we can spend a few extra cycles storing it in our local in-memory cache.

The extra time spent storing it in our in-memory cache should pale in comparison to the time required for a complete cache miss.
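As a rough illustration of propagating on a close miss, here is a minimal sketch in Python (used only as a language-neutral illustration; this is not Cache Tower's API). `DictLayer`, `layered_get` and `load_user` are hypothetical names, and a plain dictionary stands in for both the in-memory cache and Redis.

```python
from typing import Any, Callable, Dict, List, Optional


class DictLayer:
    """Hypothetical cache layer backed by a dict (stand-in for in-memory or Redis)."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.items: Dict[str, Any] = {}

    def get(self, key: str) -> Optional[Any]:
        return self.items.get(key)

    def set(self, key: str, value: Any) -> None:
        self.items[key] = value


def layered_get(layers: List[DictLayer], key: str, load: Callable[[], Any]) -> Any:
    """Check layers fastest-first; on a close miss, back-fill the faster layers.

    Assumption for this sketch: None is never a valid cached value.
    """
    for index, layer in enumerate(layers):
        value = layer.get(key)
        if value is not None:
            # index > 0 means a close miss: copy the item into every faster layer.
            for faster in layers[:index]:
                faster.set(key, value)
            return value
    # Complete miss: go to the data source, then populate every layer.
    value = load()
    for layer in layers:
        layer.set(key, value)
    return value


local = DictLayer("in-memory")
shared = DictLayer("redis-stand-in")
shared.set("user:1", "Ada")  # pretend another instance already cached this

loads: List[int] = []


def load_user() -> str:
    loads.append(1)  # record each trip to the (pretend) data source
    return "from-source"


result = layered_get([local, shared], "user:1", load_user)
```

The item is copied into the local layer only because it was requested, so the in-memory cache never grows beyond what this instance actually uses.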

Managing Evictions

Scenario: You have an in-memory cache and a filesystem cache for a single application instance

Depending on the in-memory caching solution, you might already have an auto-eviction system. This can be found, for example, in Microsoft's MemoryCache. Unlike something like Redis though, a file-based cache is both extremely slow and has no built-in way to auto-evict expired items.

While your code may consider expired cache items as "missed", it's important to actually evict the expired records as they may be taking up precious space in memory, on disk or in a database. It seems pretty straightforward: loop over the items known to be in the cache and evict any expired records.

It's important to consider that some cache layer technologies may have optimizations that allow bulk eviction of records instead of individual evictions. For example, a database cache layer would likely be able to find all expired items at once and remove them with a single "delete" operation.

This bulk eviction "cleanup" is a good candidate for a background task - something that runs in only a few instances and starts the cleanup at regular intervals.
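To make the bulk eviction idea concrete, here is a small sketch in Python using an in-memory SQLite database as a stand-in for a database cache layer. The table name `cache_items`, its columns and the helper functions are assumptions made up for this example. The `now` parameter exists only so the behaviour is easy to follow without waiting for real time to pass.

```python
import sqlite3
import time
from typing import Optional


def create_cache_table(conn: sqlite3.Connection) -> None:
    # Hypothetical schema: one row per cache item with an absolute expiry time.
    conn.execute(
        "CREATE TABLE IF NOT EXISTS cache_items ("
        "key TEXT PRIMARY KEY, value TEXT NOT NULL, expires_at REAL NOT NULL)"
    )


def set_item(conn: sqlite3.Connection, key: str, value: str,
             ttl_seconds: float, now: Optional[float] = None) -> None:
    now = time.time() if now is None else now
    conn.execute(
        "INSERT OR REPLACE INTO cache_items (key, value, expires_at) VALUES (?, ?, ?)",
        (key, value, now + ttl_seconds),
    )


def evict_expired(conn: sqlite3.Connection, now: Optional[float] = None) -> int:
    """Remove every expired row with a single DELETE; returns the rows removed."""
    now = time.time() if now is None else now
    cursor = conn.execute("DELETE FROM cache_items WHERE expires_at <= ?", (now,))
    conn.commit()
    return cursor.rowcount


conn = sqlite3.connect(":memory:")
create_cache_table(conn)
set_item(conn, "a", "1", ttl_seconds=10, now=1000.0)
set_item(conn, "b", "2", ttl_seconds=10, now=1000.0)
set_item(conn, "c", "3", ttl_seconds=500, now=1000.0)

# 100 seconds later, "a" and "b" have expired but "c" has not.
removed = evict_expired(conn, now=1100.0)
remaining = [row[0] for row in conn.execute("SELECT key FROM cache_items ORDER BY key")]
```

A background task would call something like `evict_expired` on a timer, rather than each request checking and deleting items one at a time.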

Background Refreshing (Stale vs Expired Cache Items)

Background refreshing isn't exclusive to a multilayer cache solution; however, it can be invaluable for maximising performance in one. The important part of background refreshing is working out the best time to refresh. Refreshing too early may put unnecessary strain on the data source, while refreshing too late may leave the data overly stale.

How the refresh is triggered is important too - you don't want to do this on a schedule as the cache may be overly eager, refreshing items that nobody is requesting. Like propagating between cache layers, you want to perform this only if the cache item is actively being hit.

To keep throughput up, we need to simultaneously return our "stale" cache item while triggering a refresh to update our data. This update of data needs to hit every cache layer too so other application instances can benefit from the refreshed data.
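One way to picture the stale-versus-expired distinction is the following Python sketch. It is a single-layer illustration under stated assumptions, not how Cache Tower implements it: `RefreshingCache`, `Entry`, `stale_after` and `expire_after` are hypothetical names, and a real multilayer version would also write the refreshed value to every layer.

```python
import threading
import time
from dataclasses import dataclass
from typing import Any, Callable, Dict, Set


@dataclass
class Entry:
    value: Any
    stale_at: float    # after this, serve the value but refresh in the background
    expires_at: float  # after this, the value must not be served at all


class RefreshingCache:
    def __init__(self, stale_after: float, expire_after: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._stale_after = stale_after
        self._expire_after = expire_after
        self._clock = clock
        self._entries: Dict[str, Entry] = {}
        self._refreshing: Set[str] = set()
        self._lock = threading.Lock()

    def _store(self, key: str, value: Any) -> None:
        now = self._clock()
        self._entries[key] = Entry(value, now + self._stale_after, now + self._expire_after)

    def get(self, key: str, load: Callable[[], Any]) -> Any:
        now = self._clock()
        with self._lock:
            entry = self._entries.get(key)
            if entry is not None and now < entry.expires_at:
                # Refresh only because the item was hit, and only once at a time.
                if now >= entry.stale_at and key not in self._refreshing:
                    self._refreshing.add(key)
                    threading.Thread(
                        target=self._refresh, args=(key, load), daemon=True
                    ).start()
                return entry.value  # possibly stale, but returned immediately
        # Missing or expired: the caller has to wait for the data source.
        # (Concurrent misses may each call load(); a lock would avoid that.)
        value = load()
        with self._lock:
            self._store(key, value)
        return value

    def _refresh(self, key: str, load: Callable[[], Any]) -> None:
        try:
            value = load()
            with self._lock:
                self._store(key, value)
        finally:
            with self._lock:
                self._refreshing.discard(key)
```

With `stale_after=10` and `expire_after=60`, a hit at 20 seconds returns the old value straight away while a background thread loads the new one; a hit after 60 seconds blocks on the data source instead.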

Distributed Locking

Scenario: You have multiple instances of an application with their own local caches (in-memory) while also having a shared cache (Redis).

If you're looking at a multilayered caching solution, you are likely running multiple instances of your application. If "Web Server 1" is already attempting to update Redis then "Web Server 2" doesn't need to waste any time doing the same. This is important to factor in, especially if retrieving the original data is an expensive operation.

Distributed locking helps alleviate this; however, there is a catch - you don't want multiple requests on the same server checking the distributed cache every time for a lock. If the same server already has a lock, you will want to track that locally in-memory so the lock-check is faster.
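The local tracking described above could look something like this Python sketch. Everything here is hypothetical: `FakeSharedLockStore` is an in-process stand-in for a shared store such as Redis (it only counts round trips), and `LockedRefresher` is one possible shape for the idea, not Cache Tower's implementation.

```python
import threading
from typing import Any, Callable, Dict, Set


class FakeSharedLockStore:
    """Stand-in for a shared lock store such as Redis; counts round trips."""

    def __init__(self) -> None:
        self._held: Set[str] = set()
        self._guard = threading.Lock()
        self.round_trips = 0

    def try_acquire(self, key: str) -> bool:
        with self._guard:
            self.round_trips += 1
            if key in self._held:
                return False
            self._held.add(key)
            return True

    def release(self, key: str) -> None:
        with self._guard:
            self.round_trips += 1
            self._held.discard(key)


class _InFlight:
    def __init__(self) -> None:
        self.done = threading.Event()
        self.value: Any = None
        self.waiters = 0  # local requests waiting on this refresh


class LockedRefresher:
    def __init__(self, store: FakeSharedLockStore) -> None:
        self._store = store
        self._in_flight: Dict[str, _InFlight] = {}
        self._guard = threading.Lock()

    def refresh(self, key: str, load: Callable[[], Any]) -> Any:
        with self._guard:
            flight = self._in_flight.get(key)
            owner = flight is None
            if owner:
                flight = _InFlight()
                self._in_flight[key] = flight
            else:
                flight.waiters += 1
        if not owner:
            # This server is already refreshing the key: wait in-memory,
            # without touching the shared store at all.
            flight.done.wait()
            return flight.value
        try:
            if self._store.try_acquire(key):
                try:
                    flight.value = load()
                finally:
                    self._store.release(key)
            # Otherwise another server holds the lock. This sketch returns
            # None; a real version would wait and read the shared cache.
            return flight.value
        finally:
            with self._guard:
                del self._in_flight[key]
            flight.done.set()
```

If five requests on one server ask for the same key at once, only the first talks to the shared store (one acquire, one release); the other four wait on a local event and reuse its result.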

Summary

Layered caching can provide the best of multiple different cache types. You can get the performance of an in-memory cache with the larger cache sizes of a Redis instance, database or file system. It won't automatically solve every caching performance problem but in the right scenarios, it can be an extremely useful tool.

I hope these tips can help you out with your own caching solution. If you don't want to roll your own, check out my library Cache Tower which supports these things and more.