DIY – Useful LINQ Extensions

LINQ (Language Integrated Query) is one of the most powerful features of modern .NET. Powered by generics, lambda expressions, method chaining, extension methods, and deferred execution, it allows us to write extremely concise code when dealing with collections.

In this post we will look at some useful LINQ extensions I have written over the years to make my work with LINQ even easier and quicker, and to help me simplify my code.

Advantages and Limitations of LINQ

Naturally, LINQ is not applicable to every single problem. However, in many cases it can simplify otherwise complicated code.

Additionally, it can serve as a common base line to make code more readable. Every developer – including your future self – who is familiar with LINQ has an edge on reading and understanding code that uses it. If we instead reimplement shared functionality time and time again – and probably give it different names, and different method signatures – someone reading the code may have to invest a lot more effort into understanding it. Of course, the same can be argued for any kind of abstraction or other sharing and reuse of code.

Despite these positive aspects, LINQ is not perfect. There are a number of things that seem like they should be possible, but are not. Or they are only possible using complicated code whose intent is not immediately clear.

Some of these cases are what we will talk about below.

Yield()

Lambda expressions are another great feature of C#. They really are what makes the beauty of LINQ possible in the first place. Unfortunately, there are still some things they cannot do.

One such example is using the yield keyword, which allows us to implicitly return an enumerable object without having to go through the trouble of creating a collection.

In many cases we can use multiple yield statements to return entire sequences of objects. This is something that we cannot easily reproduce in a lambda expression, however.
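As a minimal sketch of what this looks like in a regular method (the names IteratorDemo and FirstThreeSquares are made up for this illustration, and the top-level statements assume C# 9 or later):

```csharp
using System;
using System.Collections.Generic;

// The iterator body runs lazily, one element per loop step.
foreach (var n in IteratorDemo.FirstThreeSquares())
    Console.WriteLine(n); // prints 1, then 4, then 9

static class IteratorDemo
{
    // Each 'yield return' hands one element to the caller; the compiler
    // generates the enumerable object for us. The same body would be
    // rejected inside a lambda expression or anonymous method.
    public static IEnumerable<int> FirstThreeSquares()
    {
        yield return 1;
        yield return 4;
        yield return 9;
    }
}
```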

In other cases, however, we only want to return a sequence containing a single object – be it because a method call requires this, because we want to continue treating the object as a sequence, or because we want to combine it with other sequences (these are really all just different ways of saying the same thing).

Consider the following contrived example (and ignore that a plain Select would be enough in this case):

var singletonSequence = input.Take(1)
    .SelectMany(obj =>
        {
            var transformed = aMethod(obj);
            // return singleton sequence of 'transformed' here
        }
    );

There are many ways we could finish the lambda expression.

For example, we could return new[] { transformed }; – that is, an array with the object as its single element. While this works fine, it is a somewhat ugly solution, and it may not always communicate the intent correctly.

Alternatively, we could replace the entire lambda expression by another method:

IEnumerable<TransformedType> aMethodEnumerated(OriginalType obj)
{
    yield return aMethod(obj);
}
var singletonSequence = input.Take(1)
    .SelectMany(aMethodEnumerated);

This results in nicer – and shorter – code. However, unless we name the method very clearly, it might be more confusing than returning an array.

Unless the method can be named in an extremely clear way to express the semantic intent, I would consider this a bad solution – especially if the body of the lambda expression is more complicated than a single method call.

What if we try to combine these two ways of doing the same thing somehow?

Can we express the semantics of ‘turn this into a sequence’?

The answer is trivially yes. We can simply write a method doing exactly this:

IEnumerable<T> TurnIntoEnumerable<T>(T obj)
{
    yield return obj;
}
var singletonSequence = input.Take(1)
    .SelectMany(obj =>
        {
            var transformed = aMethod(obj);
            // wrap the transformed object, not the original one
            return TurnIntoEnumerable(transformed);
        }
    );

And while we are at it, let us rename the method to Yield, to show how it emulates the actual yield keyword. We can also turn it into an extension method, which I think results in much nicer syntax:

public static IEnumerable<T> Yield<T>(this T obj)
{
    yield return obj;
}
var singletonSequence = input.Take(1)
    .SelectMany(obj =>
        {
            var transformed = aMethod(obj);
            // wrap the transformed object, not the original one
            return transformed.Yield();
        }
    );

Note how we already used generics, instead of specific types like before. This allows us to turn any object into a singleton sequence whenever we need.
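For reference, here is a self-contained sketch of Yield() in use. It assumes an int array as input and a simple doubling as a stand-in for aMethod; it combines Take(1) with SelectMany so that the result really is a flat one-element sequence. The class name YieldExtensions is made up for this example.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

var input = new[] { 3, 5, 8 };

// Take(1) keeps at most the first element; SelectMany flattens the
// one-element sequences produced by the lambda.
var singletonSequence = input.Take(1)
    .SelectMany(obj =>
    {
        var transformed = obj * 2; // stand-in for aMethod(obj)
        return transformed.Yield();
    });

Console.WriteLine(string.Join(", ", singletonSequence)); // prints "6"

static class YieldExtensions
{
    // Wraps any object in a sequence containing just that object.
    public static IEnumerable<T> Yield<T>(this T obj)
    {
        yield return obj;
    }
}
```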

Another advantage of this method – besides its very specific function – is that its implementation is completely hidden from the user. If we preferred, we could return a one-element array instead without changing any calling code.

The last advantage I would like to point out (and this is really something that is possible with a number of generic methods): instead of letting the compiler infer the type for our generic method call, we can specify it. In doing so, we can specify not only the actual type of the object – the compiler would have done so anyway – but also any base type, which results in a sequence of that type. This can be useful in a number of circumstances.
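One situation where this can help, sketched with the hypothetical types Animal, Cat and Dog: concatenating a single Cat with a sequence of Dog objects. With the inferred type, Yield() gives a sequence of Cat, and Concat has no common element type to pick; specifying the base type makes the intent explicit.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

var cat = new Cat();
var dogs = new List<Dog> { new Dog() };

IEnumerable<Cat> cats = cat.Yield();               // T inferred as Cat
IEnumerable<Animal> animals = cat.Yield<Animal>(); // T specified as the base type

// Both sides can now be treated as sequences of Animal.
IEnumerable<Animal> pets = cat.Yield<Animal>().Concat(dogs);
Console.WriteLine(pets.Count()); // prints 2

class Animal { }
class Cat : Animal { }
class Dog : Animal { }

static class YieldExtensions
{
    public static IEnumerable<T> Yield<T>(this T obj)
    {
        yield return obj;
    }
}
```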

Append(), Prepend()

While we are dealing with enumerating individual elements, another thing that I have found myself wanting to do several times is appending or prepending single elements to sequences for later enumeration.

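This is not really possible using LINQ directly, at least not in the versions that lack Enumerable.Append and Enumerable.Prepend. Without those, it only supports concatenating two sequences.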

Of course, we just wrote a method to turn an object into a sequence, so we could simply implement such methods as:

public static IEnumerable<T> Append<T>
    (this IEnumerable<T> sequence, T item)
{
    return sequence.Concat(item.Yield());
}
public static IEnumerable<T> Prepend<T>
    (this IEnumerable<T> sequence, T item)
{
    return item.Yield().Concat(sequence);
}
var sequenceAndOne = sequence.Append(one);
var oneAndSequence = sequence.Prepend(one);

However, while somewhat elegant, this also seems somewhat wasteful. There really is no need for the call to Yield() and the implicit creation of two enumerable objects – one for Yield() and one for Concat().

Instead, we can easily implement the concatenation ourselves:

public static IEnumerable<T> Append<T>
    (this IEnumerable<T> sequence, T item)
{
    foreach (var t in sequence)
        yield return t;
    yield return item;
}
public static IEnumerable<T> Prepend<T>
    (this IEnumerable<T> sequence, T item)
{
    yield return item;
    foreach (var t in sequence)
        yield return t;
}

Unfortunately, there is no yield foreach statement that we could use to hide the foreach loop. Under the circumstances, this is the best we can do.
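A runnable sketch of both methods follows. Note one assumption: newer versions of System.Linq ship their own Enumerable.Append and Enumerable.Prepend, which can make calls to identically named extensions ambiguous, so this example uses the made-up names AppendItem and PrependItem (and the made-up class name SingleItemExtensions) to sidestep that.

```csharp
using System;
using System.Collections.Generic;

var sequence = new List<int> { 1, 2, 3 };

var sequenceAndOne = sequence.AppendItem(4);
var oneAndSequence = sequence.PrependItem(0);

Console.WriteLine(string.Join(", ", sequenceAndOne)); // prints "1, 2, 3, 4"
Console.WriteLine(string.Join(", ", oneAndSequence)); // prints "0, 1, 2, 3"

static class SingleItemExtensions
{
    // Enumerates the whole sequence first, then the extra item.
    public static IEnumerable<T> AppendItem<T>(this IEnumerable<T> sequence, T item)
    {
        foreach (var t in sequence)
            yield return t;
        yield return item;
    }

    // Enumerates the extra item first, then the whole sequence.
    public static IEnumerable<T> PrependItem<T>(this IEnumerable<T> sequence, T item)
    {
        yield return item;
        foreach (var t in sequence)
            yield return t;
    }
}
```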

Note that we may want to check the sequence parameter for null and throw an ArgumentNullException in that case, if we want to more closely mimic the behaviour of actual LINQ methods:

public static IEnumerable<T> Append<T>
    (this IEnumerable<T> sequence, T item)
{
    if (sequence == null)
        throw new ArgumentNullException("sequence");
    foreach (var t in sequence)
        yield return t;
    yield return item;
}

Since we return the enumerable using the yield keyword, this method automatically uses deferred execution, which can come in very handy, especially if we do not mean to enumerate the entire list.

However, this also means that the parameter is only checked for null once we start enumerating, and not when the LINQ query is constructed in our code.
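The delayed check can be observed with a small experiment. This is only a sketch: AppendItem and DeferredCheckExtensions are made-up names (chosen so the call cannot clash with any Append method the framework may provide), and the null input is deliberate.

```csharp
using System;
using System.Collections.Generic;

IEnumerable<int> missing = null;

// No exception here: the iterator body, including the null check,
// has not started running yet.
var query = missing.AppendItem(1);
Console.WriteLine("query constructed");

try
{
    foreach (var item in query) { }
}
catch (ArgumentNullException e)
{
    // The check only fires once enumeration begins.
    Console.WriteLine("thrown on enumeration: " + e.ParamName);
    // prints "thrown on enumeration: sequence"
}

static class DeferredCheckExtensions
{
    public static IEnumerable<T> AppendItem<T>(this IEnumerable<T> sequence, T item)
    {
        if (sequence == null)
            throw new ArgumentNullException(nameof(sequence));
        foreach (var t in sequence)
            yield return t;
        yield return item;
    }
}
```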

This may not be a big issue, but if we want to make sure that the check is executed right away, we can do so by splitting the methods in two like this:

public static IEnumerable<T> Append<T>
    (this IEnumerable<T> sequence, T item)
{
    if (sequence == null)
        throw new ArgumentNullException("sequence");
    return appendDeferred(sequence, item);
}
private static IEnumerable<T> appendDeferred<T>
    (IEnumerable<T> sequence, T item)
{
    foreach (var t in sequence)
        yield return t;
    yield return item;
}

Now the public method will always check the parameter right away, but the actual enumeration will still be deferred. This may not be the nicest pattern, and it also requires another method call, but it is right now the only way to achieve this behaviour.
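In C# 7 and later, the helper can also be written as a local function, which keeps the two halves in one place. The following is a sketch of that variant rather than a replacement for the code above; AppendItem and EagerCheckExtensions are made-up names used to avoid clashing with any Append method the framework may provide.

```csharp
using System;
using System.Collections.Generic;

IEnumerable<int> missing = null;

try
{
    // The outer method is not an iterator, so the check runs right here.
    missing.AppendItem(1);
}
catch (ArgumentNullException e)
{
    Console.WriteLine("thrown at construction: " + e.ParamName);
    // prints "thrown at construction: sequence"
}

static class EagerCheckExtensions
{
    public static IEnumerable<T> AppendItem<T>(this IEnumerable<T> sequence, T item)
    {
        if (sequence == null)
            throw new ArgumentNullException(nameof(sequence));
        return Iterate();

        // Only this local function is an iterator, so only the
        // enumeration itself is deferred.
        IEnumerable<T> Iterate()
        {
            foreach (var t in sequence)
                yield return t;
            yield return item;
        }
    }
}
```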

Conclusion

In this post we explored two pieces of functionality that LINQ itself is missing, and implemented them ourselves. We end up with three LINQ-style extension methods that can easily be used together with the already existing ones.

I hope this gave you some ideas for how to extend LINQ – or any similar framework – according to your needs.

There are still a number of other extension methods of the same style I have lying around, but those will have to wait until another post.

Enjoy the pixels!

Reference: DIY – Useful LINQ Extensions from our NCG partner Paul Scharf at the GameDev<T> blog.

Paul Scharf

Paul is a self-publishing game developer. He believes that C# will play an ever growing role in the future of his industry. Next to working on a variety of projects he writes weekly technical blog posts on C#, graphics, and game development in general.
