C# LINQ from beginner to expert - Part 3

Quick Recap

Back in Part 2 we looked at the often misunderstood SelectMany() operation, known as flatMap or bind to those from a functional programming background. You saw that it is a powerful building block for expressing many different kinds of operations, but using it to its full potential takes practice and some lateral thinking. That is something you will gain over time with use.

In this part I will knock off the other important building block operation along with a few operations built from it. That will leave us in a good place to start using LINQ to build up some complex operations in the next part.

Aggregate (Eager)

Aggregate() applies an operation across the elements of a collection to produce a single result, otherwise known as reduce or fold. The result type depends on the operation being applied, so it could even be a new collection.

This operation is not part of the LINQ query syntax, so all the examples use method syntax with lambdas.

// Get the sum of a collection of integers
var ints = new[] {1, 1, 2, 3, 6, 9};
var sumStyle1 = ints.Aggregate((total, value) => total + value);
var sumStyle2 = ints.Aggregate(0, (total, value) => total + value);

// Concatenate strings with spaces between
var strings = new[] {"Listen", "to", "Death", "Grips"};

// The bad way, ever growing concat
var sentenceConcat = strings.Aggregate((current, value) => $"{current} {value}");

// The better way, using string builder (needs using System.Text;)
var sentence = strings
    .Aggregate(
        new StringBuilder(),
        (builder, next) =>
        {
            if (builder.Length > 0)
            {
                builder.Append(" ");
            }
            builder.Append(next);
            return builder; // hand the accumulator on to the next step
        })
    .ToString();

So what is going on here? Aggregate() takes a function that combines a running total with the next value to produce a new running total, and applies it to each element in the collection in turn. Where the initial running total comes from depends on which of the three overloads is used:

public static T Aggregate<T>(
    this IEnumerable<T> source, 
        Func<T, T, T> func);

In this version the first element of the source collection is taken as the seed value. The aggregation function is then called for each remaining element in turn, passing in the seed or the previous result along with that element.

For a list 1, 2, 3, 4 and a sum operation the function calls are
func(1, 2) = 1 + 2 = 3
func(3, 3) = 3 + 3 = 6
func(6, 4) = 6 + 4 = 10
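
The call sequence above can be checked with a small sketch that records each invocation of the aggregation function. The `calls` list and the variable names are only there for illustration. The second half shows a consequence of taking the seed from the first element: with an empty source there is no seed to take, so this overload throws an InvalidOperationException.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Illustrative log of every call made to the aggregation function.
var calls = new List<string>();

var total = new[] { 1, 2, 3, 4 }.Aggregate((running, value) =>
{
    var result = running + value;
    calls.Add($"func({running}, {value}) = {result}");
    return result;
});

foreach (var call in calls)
{
    Console.WriteLine(call);
}
// func(1, 2) = 3
// func(3, 3) = 6
// func(6, 4) = 10

// No seed and no elements: there is nothing to return, so the call throws.
var threwOnEmpty = false;
try
{
    Array.Empty<int>().Aggregate((running, value) => running + value);
}
catch (InvalidOperationException)
{
    threwOnEmpty = true;
}

Console.WriteLine(threwOnEmpty); // True
```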

public static TR Aggregate<T, TR>(
    this IEnumerable<T> source,
    TR seed, Func<TR, T, TR> func);

In this version the seed value is supplied, so the aggregation function is simply applied to every element in the source. The seed type does not need to be the same as the type of the elements in the source collection. This can be seen in the string builder sample above: the collection contains strings, yet the result is a populated string builder.
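
As a rough sketch of the seeded overload producing something other than the element type, the following builds a list of running totals from a collection of ints. The running-totals task is my own choice of illustration, not something from the earlier parts. It also shows that, with a seed supplied, an empty source simply gives the seed back.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

var ints = new[] { 1, 1, 2, 3, 6, 9 };

// T is int, TR is List<int>: the seed decides the result type.
var runningTotals = ints.Aggregate(
    new List<int>(),
    (totals, value) =>
    {
        // Add the current value to the last running total (0 for the first element).
        var previous = totals.Count > 0 ? totals[totals.Count - 1] : 0;
        totals.Add(previous + value);
        return totals;
    });

Console.WriteLine(string.Join(", ", runningTotals)); // 1, 2, 4, 7, 13, 22

// With a seed, an empty source is fine: the seed comes back untouched.
var emptySum = Array.Empty<int>().Aggregate(0, (sum, value) => sum + value);
Console.WriteLine(emptySum); // 0
```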

public static TR Aggregate<T, TA, TR>(
    this IEnumerable<T> source,
    TA seed, Func<TA, T, TA> func,
    Func<TA, TR> projection);

Personally I have not really found much use for the last function signature for Aggregate(). This is the same as the second function signature but with a final step that takes the aggregated value and performs some form of projection on it to convert to another value. That is a mouthful, so here are examples.

// This results in a string builder
var sentenceBuilder = strings
    .Aggregate(
        new StringBuilder(),
        (builder, next) =>
        {
            if (builder.Length > 0)
            {
                builder.Append(" ");
            }
            builder.Append(next);
            return builder; // the lambda must return the accumulator
        });

// This does a final step to call ToString() on the builder
var sentence = strings
    .Aggregate(
        new StringBuilder(),
        (builder, next) =>
        {
            if (builder.Length > 0)
            {
                builder.Append(" ");
            }
            builder.Append(next);
            return builder; // the lambda must return the accumulator
        },
        builder => builder.ToString());
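
One place where the projection step can earn its keep is when the accumulator carries more state than the final answer needs. The sketch below is an illustration of my own: it tracks a sum and a count in a tuple, then projects that pair down to an average. In real code the built-in Average() is the simpler choice.

```csharp
using System;
using System.Linq;

var ints = new[] { 1, 1, 2, 3, 6, 9 };

var average = ints.Aggregate(
    // Seed: nothing summed and nothing counted yet.
    (Sum: 0, Count: 0),
    // Step: fold the next value into the running sum and bump the count.
    (acc, value) => (Sum: acc.Sum + value, Count: acc.Count + 1),
    // Projection: turn the (sum, count) pair into the final result.
    acc => acc.Count == 0 ? 0.0 : (double)acc.Sum / acc.Count);

Console.WriteLine(average); // roughly 3.67
```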

The result does not always need to be a sum. The following picks the highest value in a collection:

var ints = new[] {1, 1, 2, 3, 6, 9};
var max = ints.Aggregate(
    (currentMax, value) => (currentMax < value) ? value : currentMax);

I am sure by now you can see why Aggregate() is an eager operation: it has to walk the full collection to produce the result, so there is no way for it to be lazy. OK, it is lazy in Haskell, but then everything is lazy in Haskell :)
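
To make the eager versus lazy difference visible, here is a small sketch. `Numbers()` is a hypothetical helper that logs each element as it is pulled from the source. Building a Select() query pulls nothing, while calling Aggregate() walks the whole source on the spot.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

var log = new List<string>();

// Hypothetical helper: an iterator that records each element as it is pulled.
IEnumerable<int> Numbers()
{
    foreach (var n in new[] { 1, 2, 3 })
    {
        log.Add($"yield {n}");
        yield return n;
    }
}

// Select is lazy: building the query pulls nothing from the source.
var doubled = Numbers().Select(n => n * 2);
var pulledAfterSelect = log.Count;
Console.WriteLine(pulledAfterSelect); // 0

// Aggregate is eager: the call itself enumerates every element.
var sum = doubled.Aggregate(0, (total, value) => total + value);
var pulledAfterAggregate = log.Count;
Console.WriteLine(pulledAfterAggregate); // 3
Console.WriteLine(sum); // 12
```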

So Aggregate() provides a means to reduce a collection to a result of some type. It can be verbose, but LINQ solves that by providing a few of the common operations in a more succinct form.

Sum/Min/Max (Eager)

I have decided to cover all of these operations in one go. They all work on the same principle and can really be thought of as wrappers around Aggregate(). Each is single-minded in its task and hence far cleaner to use.

var ints = new[] {1, 1, 2, 3, 6, 9};
var total = ints.Sum(); 
var min = ints.Min(); 
var max = ints.Max(); 
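
To back up the "wrappers around Aggregate()" idea, this sketch writes each operation by hand with Aggregate() and compares it to the built-in. It shows that they are equivalent in concept, and makes no claim about how the library implements them internally.

```csharp
using System;
using System.Linq;

var ints = new[] { 1, 1, 2, 3, 6, 9 };

// Hand-rolled versions built on Aggregate().
var sumViaAggregate = ints.Aggregate(0, (total, value) => total + value);
var minViaAggregate = ints.Aggregate((min, value) => value < min ? value : min);
var maxViaAggregate = ints.Aggregate((max, value) => value > max ? value : max);

// Each matches the dedicated operation.
Console.WriteLine(sumViaAggregate == ints.Sum()); // True
Console.WriteLine(minViaAggregate == ints.Min()); // True
Console.WriteLine(maxViaAggregate == ints.Max()); // True
```

The seed choice carries over too: Sum() of an empty int collection is 0, while Min() and Max() of an empty int collection throw, just as the seedless Aggregate() does.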

There are two flavors of each of these operations. I will show them for Sum() of int, but the same pair exists for all three operations for each numeric type, including the nullable numeric types:

public static int Sum(this IEnumerable<int> source);

public static int Sum<T>(
    this IEnumerable<T> source, Func<T, int> mapper);

The version with the mapper is the same as the following operation:

var total = people.Select(person => person.Age).Sum();

// You just fold it to this
var total = people.Sum(person => person.Age);
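
Here is a runnable sketch of the same idea. The `people` data is made up for illustration, and tuples stand in for whatever person type you actually have. Min() and Max() take a mapper in exactly the same way.

```csharp
using System;
using System.Linq;

// Hypothetical sample data; the names and ages are invented for this example.
var people = new[]
{
    (Name: "Ada", Age: 36),
    (Name: "Linus", Age: 28),
    (Name: "Grace", Age: 45),
};

// The long way: map first, then sum.
var totalViaSelect = people.Select(person => person.Age).Sum();

// The short way: let the operation do the mapping.
var total = people.Sum(person => person.Age);
var youngest = people.Min(person => person.Age);
var oldest = people.Max(person => person.Age);

Console.WriteLine(total == totalViaSelect); // True
Console.WriteLine($"{total}, {youngest}, {oldest}"); // 109, 28, 45
```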

As they are all essentially Aggregate(), they must be eager: they have to walk the entire collection.

In Part 4 we will start to combine these operations to solve some complex tasks.

Until then, happy coding

Woz
