An Extensive Examination of LINQ: Extending LINQ - Adding Query Operators
By Scott Mitchell
A Multipart Series on LINQ 

This article is one in a series of articles on LINQ, which was introduced with .NET version 3.5.

Introduction
As discussed in earlier installments of this article series - most notably in An Introduction to LINQ and The Standard Query Operators - one of LINQ's primary components is its set of standard query operators. A query operator is a method that operates on a sequence of data and performs some task based on that data. Query operators are implemented as extension methods on types that implement the IEnumerable<T> interface. Some of the standard query operators that we've explored throughout the articles in this series include Count, Average, First, Skip, Take, Where, and OrderBy, among others.
While these standard query operators provide a great deal of functionality, there may be situations where they fall short. The good news is that it's quite easy to create your own query operators. Underneath the covers, query operators are just methods that extend types that implement IEnumerable<T> and iterate over the sequence performing some task, such as computing the total number of items in the sequence, computing the average, filtering the results, or ordering them. This article examines how to extend LINQ's functionality by creating your own extension methods. Read on to learn more!
A Quick Primer on Query Methods
In a prior article, The Ins and Outs of Query Operators, we looked at the underlying functionality of query methods. (If you have not yet read that article, I strongly suggest reading it before continuing on with this one.) Recall that, in a nutshell, query methods iterate over a sequence of data; specifically, a query operator iterates over a sequence that implements IEnumerable<T>, which includes arrays, lists, stacks, queues, dictionaries, and many other types. ADO.NET-related classes like DataRowCollection, which implement only the non-generic IEnumerable interface, can take part as well once converted with the Cast<T> operator or the DataTable's AsEnumerable method.
Query operators can be classified as to what type of value they return. For instance, some standard query operators return a scalar value based on some aggregating function (such as determining the maximum value in the sequence), while others return a single element from the sequence or a new sequence altogether. All query operators can be classified into one of the following categories:
- Aggregate operators: aggregate operators return a scalar value based on some aggregating operation. For instance, the standard query operators Sum and Average are examples of aggregate operators because they return a scalar value (a number) based on an aggregate operation (summing or averaging some value in the underlying sequence).
- Single element operators: query operators that return precisely one element from the underlying sequence. First and Single are examples of standard query operators that return a single element.
- Sequence operators: sequence operators return a sequence. The returned sequence may be a subset of the underlying sequence, some modification of the underlying sequence, or an entirely new one. The Where standard query operator is an example of a sequence operator, as it returns the elements in the underlying sequence that match a particular filter criterion.
- Grouping operators: grouping operators return a group of sequences from a single underlying sequence. GroupBy is an example of a grouping standard query operator.
Creating Standard Deviation and Variance Aggregate Query Operators
When writing a SQL query you can utilize a number of aggregate functions, such as AVG, COUNT, MIN, MAX, and SUM. The standard query operators include similar aggregate operators; however, SQL offers a number of statistical aggregate functions not found in the standard query operators, including:
- STDEVP - computes the standard deviation of all values in a specified expression, and
- VARP - computes the variance of all values in a specified expression.
The variance of a sequence is computed using the following steps; the standard deviation is the square root of the variance:
1. Compute the average value of the sequence.
2. For each element in the sequence, determine the difference between the number and the average computed in step (1).
3. Square each of the differences determined in step (2) and sum these numbers.
4. Divide the result in step (3) by the number of elements in the sequence.
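The steps above can be written compactly as a formula. The symbols here are assumptions introduced for illustration: N is the number of elements, x_i is the i-th element, mu is the average, and sigma is the standard deviation.

```latex
\mu = \frac{1}{N}\sum_{i=1}^{N} x_i, \qquad
\sigma^2 = \frac{1}{N}\sum_{i=1}^{N} \left(x_i - \mu\right)^2, \qquad
\sigma = \sqrt{\sigma^2}
```

As a quick worked example, take the values 2, 4, 4, 4, 5, 5, 7, 9. The average is 40 / 8 = 5. The squared differences are 9, 1, 1, 1, 0, 0, 4, and 16, which sum to 32. Dividing by 8 gives a variance of 4, so the standard deviation is 2.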
The following extension methods compute the variance and standard deviation for a sequence of decimal values. (I've included just the C# version of these query operators here in this article; download the code available at the end of this article to see the Visual Basic version of these query operators.)
public static double Variance(this IEnumerable<decimal> source)

As you can see, these two methods are implemented as extension methods on a type that implements the IEnumerable<decimal> interface. (To compute the standard deviation and variance on sequences of integers, doubles, and other numeric types you'd need to create additional Variance and StdDeviation methods that apply to types of IEnumerable<int>, IEnumerable<float>, and so on.) The Variance method starts by computing the average using the Average standard query operator. Next, it enumerates the elements in the underlying sequence (source) and, for each element, squares the difference between the element and the average value. These squares are summed into a variable named runningSum, which is then divided by the number of elements in the sequence. This is the variance.
The StdDeviation method computes the variance of the sequence using the just-defined Variance method and returns its square root.
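For readers without the download at hand, here is a minimal sketch of two operators that follow the description above. It is an approximation rather than the demo's actual code: the class name StatisticsSketch, the null check, and the decision to buffer the sequence into a list (so it is enumerated only once) are my own assumptions.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical container class; the name is an assumption for this sketch.
public static class StatisticsSketch
{
    // Population variance: the average of the squared differences from the mean.
    public static double Variance(this IEnumerable<decimal> source)
    {
        if (source == null) throw new ArgumentNullException("source");

        // Assumption: buffer the values so the sequence is enumerated only once.
        List<decimal> values = source.ToList();

        // Average throws InvalidOperationException for an empty sequence.
        decimal average = values.Average();

        decimal runningSum = 0m;
        foreach (decimal value in values)
        {
            decimal difference = value - average;
            runningSum += difference * difference;
        }

        return (double)(runningSum / values.Count);
    }

    // Standard deviation: the square root of the variance.
    public static double StdDeviation(this IEnumerable<decimal> source)
    {
        return Math.Sqrt(source.Variance());
    }
}
```

With this class in scope, a call such as new List<decimal> { 2m, 4m, 4m, 4m, 5m, 5m, 7m, 9m }.Variance() returns 4, and StdDeviation() on the same list returns 2.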
With these query operators complete, they can now be used on any sequence of decimals, much like the aggregate standard query operators. The download includes a demo that allows the user to type a comma-delimited list of numbers into a textbox. These numbers are split apart and fed into a list of decimal values named sequenceOfDecimals. The Variance and StdDeviation query operators are then used to display these statistical metrics.
The code snippet below shows how the Variance and StdDeviation query operators can be called like any other query operator on an object that implements IEnumerable<decimal>. The screen shot below the code snippet shows the output when viewed through a browser.
// Display the standard deviation and variance of the sequence of decimals entered by the user

Returning Elements of a Sequence in Random Order
Given a sequence of elements how would you return, say, three random elements? Or all of the elements but in a random order? There is no standard query operator for randomizing a sequence, so let's create one!
As noted in Techniques for Randomly Reordering an Array, it's very easy to write a shuffle algorithm that does not generate truly random shuffles. A naive implementation can easily overweight certain permutations. One fairly simple algorithm that avoids such pitfalls is the Fisher-Yates shuffle. The Fisher-Yates shuffle randomly reorders the elements in an array in such a way that the various permutations are equally likely.
A thorough description of the Fisher-Yates shuffle is beyond the scope of this article; refer to Techniques for Randomly Reordering an Array for more information. When preparing for this article I came across a code sample on StackOverflow by user LukeH that implements the Fisher-Yates shuffle as a query operator. Here is the code (which you can also find in the downloadable demo):
public static IEnumerable<T> Shuffle<T>(this IEnumerable<T> source)
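For readers without the download at hand, the following is a minimal sketch of a Fisher-Yates shuffle written as a query operator. It is my own illustration of the technique, not LukeH's code. The class name ShuffleSketch, the shared Random instance, and the split into a public method plus a private iterator are assumptions.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical container class; the name is an assumption for this sketch.
public static class ShuffleSketch
{
    // Assumption: one shared generator. System.Random is not thread-safe,
    // so this sketch is intended for single-threaded use only.
    private static readonly Random Rng = new Random();

    public static IEnumerable<T> Shuffle<T>(this IEnumerable<T> source)
    {
        if (source == null) throw new ArgumentNullException("source");
        return ShuffleIterator(source);
    }

    private static IEnumerable<T> ShuffleIterator<T>(IEnumerable<T> source)
    {
        // Copy the sequence so the caller's data is never modified.
        T[] buffer = source.ToArray();

        // Walk from the end: pick a random slot among the elements not yet
        // returned (indexes 0..i), yield it, then move the last remaining
        // element into the vacated slot.
        for (int i = buffer.Length - 1; i >= 0; i--)
        {
            int j = Rng.Next(i + 1); // 0 <= j <= i
            yield return buffer[j];
            buffer[j] = buffer[i];
        }
    }
}
```

Because the method is an iterator, the shuffle runs lazily: a call such as names.Shuffle().Take(3) performs only three random picks after the initial copy.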

Another approach for getting the elements of a sequence in random order is to use the OrderBy standard query operator, sorting on a new GUID. The code for this is much simpler, as it relies on an existing standard query operator:
public static IEnumerable<T> Shuffle2<T>(this IEnumerable<T> source)
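A minimal sketch of the GUID-based approach might look like the following. The class name GuidShuffleSketch and the null check are assumptions. Bear in mind that GUIDs are designed for uniqueness rather than randomness, so treat the uniformity of the resulting order as an assumption, not a guarantee.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical container class; the name is an assumption for this sketch.
public static class GuidShuffleSketch
{
    public static IEnumerable<T> Shuffle2<T>(this IEnumerable<T> source)
    {
        if (source == null) throw new ArgumentNullException("source");

        // Each element gets a freshly generated GUID as its sort key;
        // ordering by those keys scrambles the original order.
        return source.OrderBy(item => Guid.NewGuid());
    }
}
```

Unlike the Fisher-Yates version, this approach has to sort the entire sequence before the first element can be returned.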

Either one of these two methods will do the trick. The demo available for download includes both.
To get one or more random elements from a sequence using either one of these shuffle query operators, you would write code like the following:
// Get a single random element...

In fact, you could very easily create another query method that returns a single random element. You would create an extension method on IEnumerable<T> that returns an object of type T, and could do so with one line of code: return collection.Shuffle().First(); (The demo available for download includes such a query operator named Random.)
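As a rough sketch of that idea, the class below pairs a one-line Random operator with a stand-in Shuffle so that it compiles by itself; drop the stand-in if a Shuffle operator is already in scope. The class name RandomElementSketch is an assumption, and the demo's own Random operator may differ.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical container class; the name is an assumption for this sketch.
public static class RandomElementSketch
{
    // Stand-in shuffle (GUID-based) so the sketch is self-contained.
    public static IEnumerable<T> Shuffle<T>(this IEnumerable<T> source)
    {
        return source.OrderBy(item => Guid.NewGuid());
    }

    // Returns one randomly chosen element.
    // First throws InvalidOperationException when the sequence is empty.
    public static T Random<T>(this IEnumerable<T> collection)
    {
        return collection.Shuffle().First();
    }
}
```

With this in scope, words.Random() returns a single random element, while words.Shuffle().Take(3) returns three random elements.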
To see the Shuffle query operator in action, check out the Shuffle.aspx demo, which is part of the download at the end of this article.
Much like the variance and standard deviation demo, this one prompts the user for input, in this case a sentence. The words in the sentence are then read into an array and the Shuffle query operator is used to randomize their order. Finally, this randomized ordering is displayed.
Conclusion
LINQ offers a variety of standard query operators that can be used to compute aggregate values, retrieve a single element, retrieve a sequence of elements, or even generate groups from a sequence. While these standard query operators may seem mystically powerful, there is no magic going on. Query operators are simply extension methods on types that implement IEnumerable<T>. As we saw in this article, you can create your own query operators with just a dash of code and a sprinkle of imagination.
Happy Programming!
Further Readings: