More Fun with Caching

In the last post we had some fun playing around with an object-oriented implementation of a simple cache. Today, I want to have some more fun building a cache - but this time, we’ll make things more functional.

This post is a sequel to the last post, you might want to go back and read the motivation if you haven’t already.

Higher-Order Functions

A higher-order function is a function that accepts a function as a parameter or returns a function as result.

Alright, but how can we use that for our cache implementation? Do you recall the object composition implementation? We had the following interface:

public interface Executor<TKey, TValue>
{
    TValue Execute(TKey key);
}

That’s a strange interface, don’t you think? It contains only one method and both the interface and the method have almost the identical name. What’s the purpose of that interface? We need some way to make sure that the object someone’s passing to our cache constructor has a public method called ‘Execute’. In fact, we’re only interested in the method. The interface is a mere necessity - there’s no other way to express it in a pure object-oriented language.

But C# isn’t a pure object-oriented language - there’s a lot of functional influence.

Delegates

C# supports higher-order functions since its early days. Functions that can be passed around like objects are called delegates. Here’s how we can implement the cache using a delegate instead of an interface:

public delegate TValue Execute<TKey, TValue>(TKey key);

public class Cache<TKey, TValue> where TKey : notnull
{
    readonly Execute<TKey, TValue> Executor;
    readonly Dictionary<TKey, TValue> entries = new Dictionary<TKey, TValue>();

    public Cache(Execute<TKey, TValue> executor)
    {
        Executor = executor;
    }

    public TValue GetValue(TKey key)
    {
        if (!entries.TryGetValue(key, out TValue value))
        {
            value = Executor(key);
            entries.Add(key, value);
        }
        return value;
    }
}

There’s no need for an interface anymore. The constructor parameter executor is a delegate now. That simplifies the usage of the cache. We just pass the method we want to be cached when we are composing the cache instance:

var squareCache = new Cache<int, int>(CalculateSquare);
PrintSquares(numbers, squareCache);

Having an explicit delegate definition is a good thing in certain scenarios. You can assign a meaningful name to the delegate itself and to every parameter of the delegate. However, in other scenarios there isn’t a point in having an explicit delegate. Let’s take the delegate from our example. Execute isn’t a particularly meaningful name, and key isn’t either. In such situations, it’s better to use one of the many general-purpose delegates of the framework.

Here’s the cache implementation with a general-purpose Func delegate:

public class Cache<TKey, TValue> where TKey : notnull
{
    readonly Func<TKey, TValue> Executor;
    readonly Dictionary<TKey, TValue> entries = new Dictionary<TKey, TValue>();

    public Cache(Func<TKey, TValue> executor)
    {
        Executor = executor;
    }

    public TValue GetValue(TKey key)
    {
        if (!entries.TryGetValue(key, out TValue value))
        {
            value = Executor(key);
            entries.Add(key, value);
        }
        return value;
    }
}

When there’s no benefit in using an explicit delegate, you should use a general-purpose delegate. The usage of the cache is the same as with the explicit delegate.

That’s it. We’ve seen many different ways to implement a cache, but that’s ultimately how I would write a simple cache in C#. What do you think about it? How would you implement a cache?

F#

When we’re talking about higher-order functions already, why not take it one step further and talk about a functional cache implementation? From here on, I switch to F#.

Cache Class

First, let’s translate the final version of the C# implementation to F# and see how it looks:

type Cache<'key, 'value when 'key : equality> (executor) =
    let entries = Dictionary<'key, 'value> ()
    
    member _.GetValue key =
        match key |> entries.TryGetValue with
        | true, value -> value
        | _ ->
            let value = key |> executor
            (key, value) |> entries.Add
            value

As you can see, it’s absolutely possible to write classes in F#. Just like C# isn’t a pure object-oriented language, F# isn’t a pure functional language.

A quick remark: F# isn’t made to write classes, nevertheless, the F# version is more compact than the C# version. That’s something you experience all the time when you compare F# code to C# code: the C# code is more verbose. One reason is the better type inference of F#. In a lot of cases, the compiler can figure out the correct type itself, you don’t have to sprinkle type information all over your code like you have to with C#.

Cache Method

Although it’s possible to write a cache class in F#, it’s not very functional. Maybe we should start from scratch. Here’s calculateSquare in F# (I add the method signatures in comments for clarity):

//  int -> int
let calculateSquare number = number * number

How do we cache that? In functional programming, we don’t like side effects. In the cache class, entries gets modified every time there’s a cache miss. That’s a side effect. If we want no side effects, we can’t modify the cache. We have to return a new cache based on the old cache but including the new cached item. A cache method might look like this:

//  ('a -> 'b) -> Map<'a, 'b> -> 'a -> Map<'a, 'b>
let cache executor entries key =
    if entries |> Map.containsKey key then
        entries
    else
        entries |> Map.add key (key |> executor)

Here, the cache is just a Map. When we call cache, we pass the current cache as the parameter entries and the result of cache is the new cache. If there’s already a cache entry in entries for key, cache simply returns entries. Else, cache invokes executor and adds the result to entries - the result of Map.add is new Map that includes all the old entries plus the new entry. That new Map is returned.

Quick question: Is cache pure? That depends on executor. If executor is pure, cache is pure. If executor is impure, cache is impure.

Let’s have a look at how you can use the cache method:

//  Map<int, int> -> (seq<int> -> Map<int, int>)
let calculateSquares squares =
    Seq.fold (cache calculateSquare) squares

//  Map<int, int> -> seq<int> -> unit
let printSquares squares numbers =
    calculateSquares squares numbers
    |> Map.iter (fun number square -> printf "%d * %d = %d" number number square)

If you don’t know F#, you might notice the lack of type information. As I already mentioned, the F# type inference can figure out the right types on its own in a lot of cases.

But what’s the advantage of this approach?
First, it’s very compact and direct. Every other approach is way more verbose.
Second, it’s easy to test. The cache is nothing but a simple data structure - it’s a Map. If you want to test the code with an empty cache, pass an empty Map. If you want to test the code with a populated cache, set up a Map accordingly and use it as parameter squares.

How can you test your code with the cache class approach? How do you populate a cache object? There are basically two ways.
First, you can populate the cache by calling GetValue and thereby populate the cache as a side effect. However, if the cached action takes a relevant time to execute, that increases the time your test takes to run significantly.
Second, you can use a mock object instead of the real cache object. That’s how you usually do it. But mocking an object comes with a not to be underestimated overhead. You likely have to define an additional interface ICache and change your code to reference ICache instead of Cache. Then, you can use the real cache object in production and a mocked cache for testing. But we’ll talk more about testing in another post.

So … when the functional approach has such desirable properties, you might ask: Can we do the same with C#? Yes, but … have a look:

public static int CalculateSquare(int number) => number * number;

public static ImmutableDictionary<TKey, TValue> Cache<TKey, TValue>(
    Func<TKey, TValue> executor, ImmutableDictionary<TKey, TValue> entries,
    TKey key) =>
        entries.ContainsKey(key) ?
            entries :
            entries.Add(key, executor(key));

public static ImmutableDictionary<int, int> CalculateSquares(
    ImmutableDictionary<int, int> squares, IEnumerable<int> numbers) =>
        numbers.Aggregate(squares,
            (squares, number) => Cache(CalculateSquare, squares, number));

public static void PrintSquares(ImmutableDictionary<int, int> squares,
    IEnumerable<int> numbers)
{
    foreach (var (number, square) in CalculateSquares(squares, numbers))
    {
        Console.WriteLine($"{number} * {number} = {square}");
    }
}

Here’s the F# code again for an easier comparison with the C# code:

let calculateSquare number = number * number

let cache executor entries key =
    if entries |> Map.containsKey key then
        entries
    else
        entries |> Map.add key (key |> executor)

let calculateSquares squares =
    Seq.fold (cache calculateSquare) squares

let printSquares squares numbers =
    calculateSquares squares numbers
    |> Map.iter (fun number square -> printf "%d * %d = %d" number number square)

Which one do you prefer?

As you can see, it’s absolutely possible to do the same with C#, but the code is way more verbose. To be honest, I find the C# code to be very difficult to read. If you want to write functional code, I strongly suggest using a functional language.

What do you prefer? The object-oriented or the functional approach? Or would you do it completely different?