C# Fun with Caching

When was the last time you had some fun playing around with code? I hope it wasn’t too long ago. Unfortunately, there are a lot of professional developers who have lost the fun and excitement somewhere along the way. Can you remember how much fun it all was when you started out? We shouldn’t lose that - despite all the deadlines, overflowing ticket queues and all the other stuff we have to face in our professional lives.

So, in this post I want to play around with code and have some fun. Maybe there’s even something we can learn ;-)

Motivation

Let’s start with the motivation for this post. Suppose we have the following code (I ignore exception handling to improve readability):

public static async IAsyncEnumerable<string> FetchResources(
    IAsyncEnumerable<Uri> resourceUris)
{
    await foreach (var resourceUri in resourceUris)
    {
        yield return await FetchResource(resourceUri);
    }
}

public static async Task<string> FetchResource(Uri resourceUri)
{
    using var client = new HttpClient();
    var response = await client.GetAsync(resourceUri);
    return await response.Content.ReadAsStringAsync();
}

FetchResources takes the URIs from the async stream resourceUris, calls FetchResource for each one and returns the result as an item of another async stream. Suppose FetchResource takes a significant amount of time to execute and there might be duplicate URIs in resourceUris. Now, to improve performance, we want to cache the results of FetchResource - like your web browser caches pages. That’s the goal of this post: implement a simple cache.

I want to focus on the cache and not get distracted by async streams and HTTP requests. That’s why I’ll simplify the example. For the rest of the post, we’ll work with the following code - but keep the initial version in mind:

public static void PrintSquares(IEnumerable<int> numbers)
{
    foreach (var number in numbers)
    {
        var square = CalculateSquare(number);
        Console.WriteLine($"{number} * {number} = {square}");
    }
}

public static int CalculateSquare(int number) => number * number;

FetchResources becomes PrintSquares and FetchResource becomes CalculateSquare. Now, the goal is to cache the results of CalculateSquare. How would you do it?

Direct Approach

KISS, right? Add a static dictionary and use it in CalculateSquare:

public static readonly Dictionary<int, int> squares = new Dictionary<int, int>();
public static int CalculateSquare(int number)
{
    if (!squares.TryGetValue(number, out var square))
    {
        square = number * number;
        squares.Add(number, square);
    }
    return square;
}

Okay … okay … you could do that. But I wouldn’t recommend it. Or, to put it differently: don’t do that!

First, introducing static state is a terrible idea. If you keep doing that, your system will soon be unmanageable. Who controls the cache? Eventually, you’ll have to remove items from the cache to limit memory usage - keep in mind, CalculateSquare is only a placeholder for a more complex method. And by the way, if you use a static Dictionary, CalculateSquare is no longer thread-safe. You would need to use a ConcurrentDictionary or implement locking.
Second, CalculateSquare does two things now. Calculate the square number and cache the result. But methods should do only one thing. In this case, the actual calculation and the caching are two separate things. In a real-life scenario, we want to have control over the caching strategy independent of the executed method. Do we want to cache at all? How long do we want to keep the cached results? Do we want to have separate caches for separate concerns? You don’t have that control if you do the caching inside the method that should be cached.
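
To make the thread-safety remark a bit more concrete, here’s a minimal sketch of the same static cache built on ConcurrentDictionary. The class name ThreadSafeSquares and the CachedCount property are made up for this illustration. Keep in mind that this only addresses thread safety; every other objection against static state still applies.

```csharp
using System;
using System.Collections.Concurrent;
using System.Linq;
using System.Threading.Tasks;

// Hypothetical class name, used only for this sketch.
public static class ThreadSafeSquares
{
    // ConcurrentDictionary supports concurrent reads and writes without explicit locks.
    static readonly ConcurrentDictionary<int, int> squares =
        new ConcurrentDictionary<int, int>();

    public static int CalculateSquare(int number)
    {
        // GetOrAdd returns the cached value or stores the one produced by the factory.
        // Under contention the factory may run more than once for the same key,
        // but only one result ends up in the dictionary.
        return squares.GetOrAdd(number, n => n * n);
    }

    // Number of distinct inputs cached so far.
    public static int CachedCount => squares.Count;

    public static void Main()
    {
        // Call the method from several threads with many duplicate inputs.
        Parallel.ForEach(Enumerable.Range(0, 1000), i => { CalculateSquare(i % 10); });

        Console.WriteLine(CalculateSquare(7));
        Console.WriteLine(CachedCount);
    }
}
```

If the factory is expensive and must run at most once per key, GetOrAdd alone is not enough; you would have to add locking or store Lazy<T> values instead.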

So, let’s discard the direct approach and go back to the clean and simple version of CalculateSquare:

public static int CalculateSquare(int number) => number * number;

In that form, CalculateSquare truly does only one thing.

The Caller Caches

When we don’t want CalculateSquare to do the caching, why not move the cache one level higher and cache at the level of PrintSquares? That would look like this:

public static void PrintSquares(IEnumerable<int> numbers)
{
    var squares = new Dictionary<int, int>();

    foreach (var number in numbers)
    {
        if (!squares.TryGetValue(number, out var square))
        {
            square = CalculateSquare(number);
            squares.Add(number, square);
        }
        Console.WriteLine($"{number} * {number} = {square}");
    }
}

That’s better than the direct approach. At least we don’t have a static Dictionary. But, again, we don’t have control over the cache. There’s a new cache instance whenever you call PrintSquares. Now that’s easy to fix:

public static void PrintSquares(IEnumerable<int> numbers,
    Dictionary<int, int> squares)
{
    foreach (var number in numbers)
    {
        if (!squares.TryGetValue(number, out var square))
        {
            square = CalculateSquare(number);
            squares.Add(number, square);
        }
        Console.WriteLine($"{number} * {number} = {square}");
    }
}

Okay, now it’s up to the caller whether he or she passes an empty or a populated cache. That’s an improvement.
You might ask, how much further up do we play the same game and make squares a method parameter? Simple answer: As far up as necessary. Eventually, someone has to initialize squares, but I don’t think it’s the responsibility of PrintSquares.
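
Here’s a small sketch of what that looks like one level up. The class name CallerOwnedCacheDemo and the sample numbers are assumptions for illustration; PrintSquares and CalculateSquare are the versions shown above. The caller creates the dictionary once and reuses it across two calls:

```csharp
using System;
using System.Collections.Generic;

// Hypothetical class name, used only for this sketch.
public static class CallerOwnedCacheDemo
{
    public static int CalculateSquare(int number) => number * number;

    // Same method as above: the cache is a parameter, not a local variable.
    public static void PrintSquares(IEnumerable<int> numbers,
        Dictionary<int, int> squares)
    {
        foreach (var number in numbers)
        {
            if (!squares.TryGetValue(number, out var square))
            {
                square = CalculateSquare(number);
                squares.Add(number, square);
            }
            Console.WriteLine($"{number} * {number} = {square}");
        }
    }

    public static void Main()
    {
        // The caller creates the cache once and decides how long it lives.
        var squares = new Dictionary<int, int>();

        PrintSquares(new[] { 1, 2, 2 }, squares);
        PrintSquares(new[] { 2, 3 }, squares); // 2 is served from the cache

        Console.WriteLine($"Cached entries: {squares.Count}");
    }
}
```
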
However, PrintSquares is still doing two things - print the square numbers and cache the results. Let’s separate those two things.

A Cache Object

Maybe your first intuition when faced with the challenge of implementing caching was to create a cache class. That’s what I’m going to do now.

SquareCache

I want to begin with the obvious approach and define a SquareCache class:

public class SquareCache
{
    readonly Dictionary<int, int> squares = new Dictionary<int, int>();

    public int CalculateSquare(int number)
    {
        if (!squares.TryGetValue(number, out var square))
        {
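            // Qualify the call so it reaches the static CalculateSquare shown earlier.
            // An unqualified call would bind to this instance method and recurse forever.
            // Assumption: the static method lives in a class named Program.
            square = Program.CalculateSquare(number);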
            squares.Add(number, square);
        }
        return square;
    }
}

Now, we’ve extracted the code for caching and put it into a separate class. Let’s see how we can use SquareCache:

public static void PrintSquares(IEnumerable<int> numbers, SquareCache squareCache)
{
    foreach (var number in numbers)
    {
        var square = squareCache.CalculateSquare(number);
        Console.WriteLine($"{number} * {number} = {square}");
    }
}

PrintSquares looks good now. The caller has full control over the cache we use by passing the cache object squareCache and we got rid of the caching code inside of PrintSquares. I’m satisfied with how PrintSquares turned out - I don’t see the need for any further improvement.

But what about SquareCache? If all you want to cache is CalculateSquare, it’s alright. But what if you want to cache something else as well? Why not separate the caching code from the method we want to cache?

Inheritance

Let’s start with a classic object-oriented approach. An abstract base class and a derived concrete class:

public abstract class Cache<TKey, TValue> where TKey : notnull
{
    readonly Dictionary<TKey, TValue> entries = new Dictionary<TKey, TValue>();

    public TValue GetValue(TKey key)
    {
        if (!entries.TryGetValue(key, out TValue value))
        {
            value = Execute(key);
            entries.Add(key, value);
        }
        return value;
    }

    protected abstract TValue Execute(TKey key);
}

public class SquareCache : Cache<int, int>
{
    protected override int Execute(int number) => CalculateSquare(number);
}

That’s a common way to model something in an OO language. However, it’s a bit clumsy. You have to derive a new concrete class every time you want to cache something else. If you don’t have many different things to cache anyway, that might be ok.
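
To get a feeling for that overhead, here’s a sketch of a second cache built on the same base class. WordLengthCache and its Executions counter are hypothetical; they only serve to show that every new kind of cached operation needs its own subclass.

```csharp
using System;
using System.Collections.Generic;

public abstract class Cache<TKey, TValue> where TKey : notnull
{
    readonly Dictionary<TKey, TValue> entries = new Dictionary<TKey, TValue>();

    public TValue GetValue(TKey key)
    {
        if (!entries.TryGetValue(key, out var value))
        {
            value = Execute(key);
            entries.Add(key, value);
        }
        return value;
    }

    protected abstract TValue Execute(TKey key);
}

// Hypothetical second cache: one more subclass for one more cached operation.
public class WordLengthCache : Cache<string, int>
{
    // Counts how often the "expensive" operation really ran.
    public int Executions { get; private set; }

    protected override int Execute(string word)
    {
        Executions++;
        return word.Length;
    }
}

public static class InheritanceDemo
{
    public static void Main()
    {
        var cache = new WordLengthCache();

        Console.WriteLine(cache.GetValue("cache"));
        Console.WriteLine(cache.GetValue("cache")); // served from the cache
        Console.WriteLine(cache.GetValue("fun"));

        Console.WriteLine($"Executions: {cache.Executions}");
    }
}
```
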
Let’s have a look at how we would use the new version of SquareCache:

public static void PrintSquares(IEnumerable<int> numbers, SquareCache squareCache)
{
    foreach (var number in numbers)
    {
        var square = squareCache.GetValue(number);
        Console.WriteLine($"{number} * {number} = {square}");
    }
}

PrintSquares still looks good. But I’m not that satisfied with SquareCache. Let’s see what we can do to the cache implementation.

Composition

We can get rid of the class hierarchy if we use composition. That means we have to define an appropriate interface and pass an object that implements that interface as a constructor parameter - that pattern is known as constructor dependency injection.

For our example, I define the interface Executor - great name, right? - and add a constructor requiring an Executor instance to the Cache class:

public interface Executor<TKey, TValue>
{
    TValue Execute(TKey key);
}

public class Cache<TKey, TValue> where TKey : notnull
{
    readonly Executor<TKey, TValue> Executor;
    readonly Dictionary<TKey, TValue> entries = new Dictionary<TKey, TValue>();

    public Cache(Executor<TKey, TValue> executor)
    {
        Executor = executor;
    }

    public TValue GetValue(TKey key)
    {
        if (!entries.TryGetValue(key, out TValue value))
        {
            value = Executor.Execute(key);
            entries.Add(key, value);
        }
        return value;
    }
}

The nice thing about this approach is that you can implement the Executor interface wherever it fits best. You’re not forced into a class hierarchy.
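
As a sketch of what “wherever it fits best” can mean: a class that already has a base class of its own can still act as an Executor. Shape, SquareShape and InterfaceDemo are hypothetical names introduced only for this example; Executor and Cache are the types defined above (with the field renamed to lower case).

```csharp
using System;
using System.Collections.Generic;

public interface Executor<TKey, TValue>
{
    TValue Execute(TKey key);
}

public class Cache<TKey, TValue> where TKey : notnull
{
    readonly Executor<TKey, TValue> executor;
    readonly Dictionary<TKey, TValue> entries = new Dictionary<TKey, TValue>();

    public Cache(Executor<TKey, TValue> executor)
    {
        this.executor = executor;
    }

    public TValue GetValue(TKey key)
    {
        if (!entries.TryGetValue(key, out var value))
        {
            value = executor.Execute(key);
            entries.Add(key, value);
        }
        return value;
    }
}

// Hypothetical base class that the type below already derives from.
public abstract class Shape
{
    public abstract string Name { get; }
}

// Already has a base class, yet can still plug into the cache through the interface.
public class SquareShape : Shape, Executor<int, int>
{
    public override string Name => "square";

    // Area of a square with the given side length.
    public int Execute(int side) => side * side;
}

public static class InterfaceDemo
{
    public static void Main()
    {
        var shape = new SquareShape();
        var areaCache = new Cache<int, int>(shape);

        Console.WriteLine($"{shape.Name}: {areaCache.GetValue(4)}");
        Console.WriteLine($"{shape.Name}: {areaCache.GetValue(4)}"); // served from the cache
    }
}
```

With the inheritance version, SquareShape could not have derived from both Shape and Cache<int, int>, because C# allows only one base class.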

First, let’s look at PrintSquares again. Now, we don’t have an explicit SquareCache class. PrintSquares has to deal with a more general Cache<int, int> parameter:

public static void PrintSquares(IEnumerable<int> numbers, Cache<int, int> squareCache)
{
    foreach (var number in numbers)
    {
        var square = squareCache.GetValue(number);
        Console.WriteLine($"{number} * {number} = {square}");
    }
}

In some situations, that might be a drawback. However, it’s simply a consequence of the gained flexibility: you don’t know at compile time how squareCache will be composed; that happens at run time. That’s why it’s called object composition - in contrast to class inheritance.

For our example, I simply have a new class called SquareExecutor that implements Executor:

public class SquareExecutor : Executor<int, int>
{
    public int Execute(int number) => CalculateSquare(number);
}

Now, you have to compose the cache before you can use it:

var squareCache = new Cache<int, int>(new SquareExecutor());
PrintSquares(numbers, squareCache);
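
Because the cache is composed at run time, you can also slip other objects in between. The following sketch wraps SquareExecutor in a hypothetical CountingExecutor (my own invention for this example, as is CompositionDemo) to check that duplicates really are served from the cache. SquareExecutor computes the square inline here so that the snippet is self-contained.

```csharp
using System;
using System.Collections.Generic;

public interface Executor<TKey, TValue>
{
    TValue Execute(TKey key);
}

public class Cache<TKey, TValue> where TKey : notnull
{
    readonly Executor<TKey, TValue> executor;
    readonly Dictionary<TKey, TValue> entries = new Dictionary<TKey, TValue>();

    public Cache(Executor<TKey, TValue> executor)
    {
        this.executor = executor;
    }

    public TValue GetValue(TKey key)
    {
        if (!entries.TryGetValue(key, out var value))
        {
            value = executor.Execute(key);
            entries.Add(key, value);
        }
        return value;
    }
}

public class SquareExecutor : Executor<int, int>
{
    public int Execute(int number) => number * number;
}

// Hypothetical wrapper: counts how often the wrapped executor actually runs.
public class CountingExecutor<TKey, TValue> : Executor<TKey, TValue>
{
    readonly Executor<TKey, TValue> inner;

    public int Executions { get; private set; }

    public CountingExecutor(Executor<TKey, TValue> inner)
    {
        this.inner = inner;
    }

    public TValue Execute(TKey key)
    {
        Executions++;
        return inner.Execute(key);
    }
}

public static class CompositionDemo
{
    public static void Main()
    {
        // Compose at run time: cache -> counter -> actual calculation.
        var counter = new CountingExecutor<int, int>(new SquareExecutor());
        var squareCache = new Cache<int, int>(counter);

        foreach (var number in new[] { 1, 2, 2, 3, 1 })
        {
            Console.WriteLine($"{number} * {number} = {squareCache.GetValue(number)}");
        }

        // Five lookups, but only three distinct numbers had to be calculated.
        Console.WriteLine($"Executions: {counter.Executions}");
    }
}
```

Neither Cache nor SquareExecutor had to change for this; only the composition did.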

We’ll talk about class inheritance vs. object composition in many more posts to come. For now, I want to share with you a quote from Design Patterns by the “Gang of Four”:

Favor object composition over class inheritance.

That’s it for today’s post. I can’t think of any more ways to implement a cache in an OO language. Do you know some more? But C# isn’t a pure OO language; there’s an ever-growing functional influence. So … in the next post we’ll have some more fun with caching. But then, we’ll make things more functional.

Did you have some fun? I’m amazed by how many different things you can talk about with such a small example. As professionals, should we invest more of our time in talking about code?