Published: Wednesday, April 1, 2009

An Extensive Examination of LINQ: The Standard Query Operators

By Scott Mitchell


A Multipart Series on LINQ
This article is one in a series of articles on LINQ, which was introduced with .NET version 3.5.

  • An Introduction to LINQ - provides an overview of the purpose of LINQ, its design goals, and core components.
  • Extension Methods, Implicitly Typed Variables, and Object Initializers - looks at three language enhancements to VB and C# that, in part, allow for LINQ's unique syntax and functionality.
  • Lambda Expressions and Anonymous Types - explores two more language enhancements to VB and C# that permit LINQ's unique syntax and functionality.
  • The Ins and Outs of Query Operators - learn how query operators provide a universal approach to querying and modifying enumerable collections of data.
  • The Standard Query Operators - explore LINQ's standard query operators, a suite of built-in query operators for working with enumerable data.
  • Using the Query Syntax - learn how to write and use C# and Visual Basic's new query syntax, which lets you write LINQ queries using SQL-like syntax.
  • Grouping and Joining Data - examines the standard query operators and query syntax used to group and join data.
  • Introducing LINQ to XML - provides an overview of working with XML data using the LINQ to XML API.
  • Querying and Searching XML Documents Using LINQ to XML - examines querying and filtering XML documents using the LINQ to XML API.
  • Extending LINQ - Adding Query Operators - shows how to extend the functionality of LINQ by adding your own query operators.

    Introduction


    Query operators are methods that work with a sequence of data and perform some task based on the data. They are created as extension methods on the IEnumerable<T> interface, which is the interface implemented by classes that hold enumerable data. For example, arrays and the collection classes in the System.Collections.Generic namespace all implement IEnumerable<T>. (The older, non-generic collections in the System.Collections namespace implement only the non-generic IEnumerable interface; the Cast<T> and OfType<T> operators convert them into an IEnumerable<T>.) In The Ins and Outs of Query Operators we looked at how to create your own query operator that, once created, can be applied to any enumerable object.

    While it is possible to create your own query operators, the good news is that the .NET Framework already ships with a bevy of useful query operators. These query operators are referred to as the standard query operators and are one of the primary pieces of LINQ. The standard query operators include functionality for aggregating sequences of data, concatenating two sequences, converting sequences from one type to another, and splicing out a particular element from the enumeration. There are also standard query operators for generating new sequences, grouping and joining sequences, ordering the elements in sequences, filtering the data in a sequence, and partitioning the sequence.

    All together, there are more than 40 standard query operators. This article explores some of the more germane ones, giving examples of the standard query operator in use and examining its underlying source code. There are also several demos included in the download available at the end of the article. Read on to learn more!


    Standard Query Operator Overview and Classifications


    The standard query operators are a set of query operators that ship with the .NET Framework. Specifically, the standard query operators are defined in the Enumerable class, which is found in the System.Linq namespace. The standard query operators are extension methods on the IEnumerable<T> interface.
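
    Because the standard query operators are ordinary static methods of the Enumerable class, an extension-style call and a direct static call do the same thing. The following minimal sketch illustrates this with an integer array; the class and method names (ExtensionCallDemo, CountAsExtension, CountAsStatic) are made up for this illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class ExtensionCallDemo
{
    // Calls Count using extension method syntax.
    public static int CountAsExtension(IEnumerable<int> source)
    {
        return source.Count();
    }

    // Calls the very same method directly on the Enumerable class.
    public static int CountAsStatic(IEnumerable<int> source)
    {
        return Enumerable.Count(source);
    }

    public static void Main()
    {
        int[] numbers = { 1, 1, 2, 3, 5, 8 };

        Console.WriteLine(CountAsExtension(numbers)); // 6
        Console.WriteLine(CountAsStatic(numbers));    // 6
    }
}
```

    The extension syntax is only a compiler convenience; it is what lets several operators be chained from left to right in a readable way.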

    Each standard query operator is classified as performing a particular type of operation. In previous installments we looked at the Count standard query operator, and talked about the Sum standard query operator. These two operators are examples of aggregate operators, as they take a sequence of data - a list of integers, let's say - and aggregate the data, returning some scalar value (the total number of integers or the sum of said integers in the case of Count and Sum).

    The standard query operators can be classified according to the following types of operations performed:

    • Aggregation operators
    • Concatenation operators
    • Conversion operators
    • Element operators
    • Equality operators
    • Generation operators
    • Grouping operators
    • Joining operators
    • Ordering operators
    • Partitioning operators
    • Projection operators
    • Quantifier operators
    • Restriction operators
    • Set operators

    This article explores some of the more interesting and useful standard query operators. For a complete list of the standard query operators check out HookedOnLinq.com's list of Standard Query Operators. For a nicely-formatted representation that makes a great companion on your desk or corkboard, check out LINQ's Standard Query Operators (PDF) compiled by Milan Negovan over at ASP.NET Resources.

    Summing, Averaging, Counting, and Finding Maximum and Minimum Elements


    The .NET Framework includes a number of aggregate standard query operators. These operators examine a sequence of data and compute a scalar value. For instance, the Count operator, which we've seen in previous installments, returns the total number of elements in the sequence. Other aggregate operators include Average, Max, Min, and Sum. A simple example follows, which shows using many of these operators on the Fibonacci class that we created in the preceding installment.

    // C# - Create a Fibonacci object holding the first 10 Fibonacci numbers
    Fibonacci fib = new Fibonacci(10);

    var count = fib.Count();
    var avg = fib.Average();
    var sum = fib.Sum();
    var minValue = fib.Min();
    var maxValue = fib.Max();


    ' VB - Create a Fibonacci object holding the first 10 Fibonacci numbers
    Dim fib As New Fibonacci(10)

    Dim count = fib.Count()
    Dim avg = fib.Average()
    Dim sum = fib.Sum()
    Dim minValue = fib.Min()
    Dim maxValue = fib.Max()

    Keep in mind that the Count, Average, Sum, Min, and Max methods used above are not part of the Fibonacci class. Rather, they are extension methods on the IEnumerable<T> interface, which the Fibonacci class implements. Furthermore, notice how I used implicit variable typing when reading back the values from these operators (var count = fib.Count() and Dim count = fib.Count(), for example). I could have used explicit typing - int count = fib.Count() and Dim count As Integer = fib.Count - but it's good to get used to implicit typing as this pattern is commonly used with more intricate LINQ queries.
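
    Since these operators belong to IEnumerable<T> rather than to the Fibonacci class, the same calls work on any sequence of numbers. The sketch below is a stand-in that assumes nothing about the Fibonacci class from the series: it simply stores the first ten Fibonacci numbers in an array (AggregateDemo and FirstTen are illustrative names).

```csharp
using System;
using System.Linq;

public static class AggregateDemo
{
    // The first 10 Fibonacci numbers, stored in a plain array
    // (a stand-in for the series' Fibonacci class).
    public static readonly int[] FirstTen = { 1, 1, 2, 3, 5, 8, 13, 21, 34, 55 };

    public static void Main()
    {
        Console.WriteLine(FirstTen.Count()); // 10
        Console.WriteLine(FirstTen.Sum());   // 143
        Console.WriteLine(FirstTen.Min());   // 1
        Console.WriteLine(FirstTen.Max());   // 55

        // Average returns a double for a sequence of int values,
        // here the sum (143) divided by the count (10).
        double avg = FirstTen.Average();
        Console.WriteLine(avg);
    }
}
```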

    The source code for the aggregation operators is pretty straightforward. For example, the Enumerable class defines two overloads of the Count operator. The first works on an object that implements IEnumerable<T>, and returns an integer value. Its abbreviated code follows. (Note: I've simplified the method to make it more readable. I used Reflector to view the source code in the .NET Framework.)

    // C#
    public static int Count<T>(this IEnumerable<T> source)
    {
       int tally = 0;

       // Dispose of the enumerator once the loop finishes
       using (IEnumerator<T> enumerator = source.GetEnumerator())
       {
          while (enumerator.MoveNext())
             tally++;
       }

       return tally;
    }


    ' VB
    <Extension()> _
    Public Function Count(Of T)(ByVal source As IEnumerable(Of T)) As Integer
       Dim tally As Integer = 0

       ' Dispose of the enumerator once the loop finishes
       Using enumerator As IEnumerator(Of T) = source.GetEnumerator()
          While enumerator.MoveNext()
             tally += 1
          End While
       End Using

       Return tally
    End Function

    In the examples above, the IEnumerable<T> object named source that appears to be passed into the method is actually the object the extension method is being applied to. The Count method simply enumerates the elements in source, tallies how many iterations it performs, and returns this value. That's it!

    The other Count overload accepts a function as input, which you can use to filter what elements get counted. For example, to instruct the Count operator to only count odd numbers you could do something like: var count = fib.Count(n => n % 2 == 1) or Dim count = fib.Count(Function(n) n Mod 2 = 1).
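
    To see the filtering overload of Count at work without the Fibonacci class, here is a small sketch over an array holding the first ten Fibonacci numbers. The CountPredicateDemo class and its two helper methods are hypothetical names used only for this example.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class CountPredicateDemo
{
    // Counts only the elements for which the predicate returns true.
    public static int CountOdd(IEnumerable<int> source)
    {
        return source.Count(n => n % 2 == 1);
    }

    public static int CountEven(IEnumerable<int> source)
    {
        return source.Count(n => n % 2 == 0);
    }

    public static void Main()
    {
        int[] fib = { 1, 1, 2, 3, 5, 8, 13, 21, 34, 55 };

        Console.WriteLine(CountOdd(fib));  // 7  (1, 1, 3, 5, 13, 21, 55)
        Console.WriteLine(CountEven(fib)); // 3  (2, 8, 34)
    }
}
```

    One caveat: in C# the expression n % 2 == 1 is false for negative odd numbers (for example, -3 % 2 is -1), so n % 2 != 0 is the safer test when a sequence may contain negative values.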

    The aggregate operators are examples of greedy query operators. As we discussed in The Ins and Outs of Query Operators, LINQ operators are either lazy or greedy. A lazy query operator is one that is not evaluated until the elements of the sequence are enumerated. The sequence can be enumerated either by a foreach loop or by the application of a greedy query operator. Point being, when a greedy query operator is applied to a sequence the value computed by the greedy operator is generated immediately. The source code snippet above shows how the Count method immediately enumerates its source. This is why it is considered a greedy operator.
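
    The difference between lazy and greedy operators can be observed by counting how often a filter method runs. In the sketch below (LazyVersusGreedyDemo, IsOdd, and PredicateCalls are names invented for this illustration) the lazy Where operator does no work when the query is defined; the work happens only when the greedy Count operator enumerates it.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class LazyVersusGreedyDemo
{
    // Number of times the filter method has run so far.
    public static int PredicateCalls;

    public static bool IsOdd(int n)
    {
        PredicateCalls++;
        return n % 2 == 1;
    }

    // Returns { calls after defining the query, Count result, calls after Count }.
    public static int[] Run()
    {
        int[] numbers = { 1, 1, 2, 3, 5, 8 };
        PredicateCalls = 0;

        // Where is lazy: defining the query does not run the filter.
        IEnumerable<int> oddNumbers = numbers.Where(n => IsOdd(n));
        int callsAfterDefinition = PredicateCalls;

        // Count is greedy: it enumerates the query immediately.
        int howMany = oddNumbers.Count();

        return new[] { callsAfterDefinition, howMany, PredicateCalls };
    }

    public static void Main()
    {
        int[] results = Run();
        Console.WriteLine(results[0]); // 0
        Console.WriteLine(results[1]); // 4
        Console.WriteLine(results[2]); // 6
    }
}
```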

    The Count method can work with an enumerable object of any type. Other operators limit the types they can be applied to. For example, the Average operator can only be applied to numeric sequences. This restriction is imposed by having a variety of overloads defined in the Enumerable class for the Average method. Rather than having a single method that applies to objects of IEnumerable<T>, there are overloads for Average like:

    • Average(this IEnumerable<int> source)
    • Average(this IEnumerable<decimal> source)
    • Average(this IEnumerable<double> source)
    • And so on...

    There is a more general overload of the Average operator, but in order to use it you must supply a method that returns the numerical value for the element that will be used in the average calculation. This overload is useful if you have a collection of objects that contain a numeric value you want to average. For example, imagine that we have a list of Employee objects, where each Employee instance has a Salary property. The following pseudo code would compute the average salary:

    // C# - compute the average Employee salary
    List<Employee> emps = ...;
    var averageSalary = emps.Average( p => p.Salary );


    ' VB - compute the average Employee salary
    Dim emps As List(Of Employee) = ...
    Dim averageSalary = emps.Average( Function(p) p.Salary )

    The above code assumes that there's some process that returns a populated list of Employee objects. The average salary is then computed. Because the Employee object itself cannot be averaged (as it's not a numeric type) we need to pass a method into the Average operator that provides the value to average for each Employee object, in this case the value of each Employee object's Salary property. The net result is that we compute the average salary of all employees in the emps list.
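
    For readers who want to run the idea end to end, here is a self-contained sketch. The Employee class shown is an assumption made for this example (the article does not define one); it has just a Name and a decimal Salary, and the list is populated by hand.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical Employee type, defined only for this example.
public class Employee
{
    public string Name { get; set; }
    public decimal Salary { get; set; }
}

public static class AverageSalaryDemo
{
    // The lambda tells Average which numeric value to use for each element.
    public static decimal AverageSalary(IEnumerable<Employee> emps)
    {
        return emps.Average(e => e.Salary);
    }

    public static void Main()
    {
        var emps = new List<Employee>
        {
            new Employee { Name = "Ann", Salary = 40000m },
            new Employee { Name = "Bob", Salary = 50000m },
            new Employee { Name = "Cho", Salary = 60000m }
        };

        Console.WriteLine(AverageSalary(emps)); // 50000
    }
}
```

    Note that Average throws an InvalidOperationException when the sequence is empty and the selected value is a non-nullable type, so it is worth checking for an empty list first if that can happen.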

    Conversion Operators


    The .NET Framework includes a handful of operators for sequence conversion. The ToList and ToArray operators convert an enumerable object of type T into a List<T> or an array of type T, respectively. These two methods are most often used to force a lazy query operator to evaluate. In the previous installment we talked about how a lazy query operator is not evaluated until the source elements are enumerated. To force immediate execution of the query operators you can use ToList or ToArray.

    Consider the example from the previous installment. In the code below we have a Fibonacci object, fib, that is initialized with 10 elements. A query, oddFibs, is defined that works with the odd numbers. However, before the query is enumerated the Grow method is called, which doubles the number of elements in fib. When oddFibs is enumerated in the foreach loop the output contains the odd numbers of the first 20 Fibonacci numbers, and not the first 10.

    // Create a Fibonacci object with the first 10 Fibonacci numbers
    Fibonacci fib = new Fibonacci(10);

    // Create a query operator that works with the odd numbers in fib
    var oddFibs = fib.Where(x => x % 2 == 1);

    // Double the size of fib
    fib.Grow();

    // Output the values returned by oddFibs
    foreach(int fibValue in oddFibs)
       output fibValue...

    To force the oddFibs query to evaluate immediately (rather than waiting for it to be enumerated) you could use the ToList or ToArray operator like so:

    // Create a Fibonacci object with the first 10 Fibonacci numbers
    Fibonacci fib = new Fibonacci(10);

    // Create a query operator that works with the odd numbers in fib
    var oddFibs = fib.Where(x => x % 2 == 1).ToList();

    // Double the size of fib
    fib.Grow();

    // Output the values returned by oddFibs
    foreach(int fibValue in oddFibs)
       output fibValue...

    The foreach loop in the above code outputs only the odd numbers among the first 10 Fibonacci numbers. The ToList call evaluates the query immediately and copies its results into a list of integers, and at that point fib still holds just 10 numbers; the later call to Grow has no effect on that list. Keep in mind that oddFibs is a different type in each example. In the first example, oddFibs is of type IEnumerable<int>. In the second example, the ToList operator converts the IEnumerable<int> sequence returned by the Where operator into a List<int>.
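
    The same behavior can be reproduced with a plain List<int> in place of the Fibonacci class. In this sketch (SnapshotDemo and Run are illustrative names, and adding two numbers to the list stands in for the Grow method) the lazy query sees the numbers added later, while the ToList snapshot does not.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class SnapshotDemo
{
    // Returns { odd count seen by the lazy query, odd count in the snapshot }.
    public static int[] Run()
    {
        var numbers = new List<int> { 1, 1, 2, 3, 5 };

        // Lazy: nothing is evaluated yet.
        IEnumerable<int> lazyOdds = numbers.Where(n => n % 2 == 1);

        // ToList forces evaluation now and copies the results.
        List<int> snapshotOdds = numbers.Where(n => n % 2 == 1).ToList();

        // Grow the source after both queries have been defined.
        numbers.Add(13);
        numbers.Add(21);

        return new[] { lazyOdds.Count(), snapshotOdds.Count };
    }

    public static void Main()
    {
        int[] results = Run();
        Console.WriteLine(results[0]); // 6 - the lazy query sees 13 and 21
        Console.WriteLine(results[1]); // 4 - the snapshot was taken earlier
    }
}
```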

    Element Operators


    The element standard query operators retrieve a particular element from a sequence. The simplest operators in this class are First and Last, which return the starting and ending elements in the sequence, respectively. The following code snippet uses these two operators to retrieve the smallest and largest values in the Fibonacci collection. (Note that First and Last do not necessarily return the smallest and largest valued elements in a sequence; they do so for the Fibonacci sequence because the Fibonacci numbers are monotonically increasing.)

    // C# - Create a Fibonacci object holding the first 10 Fibonacci numbers
    Fibonacci fib = new Fibonacci(10);

    var smallestFib = fib.First();
    var largestFib = fib.Last();


    'VB - Create a Fibonacci object holding the first 10 Fibonacci numbers
    Dim fib As New Fibonacci(10)

    Dim smallestFib = fib.First()
    Dim largestFib = fib.Last()

    Use the ElementAt operator to get the element at a particular location in the enumeration, where the enumeration is indexed starting at zero. The following snippet verifies that the sum of the third and fourth Fibonacci numbers equals the fifth.

    // C# - Ensure that the third and fourth Fibonacci numbers sum up to the fifth number
    var thirdFib = fib.ElementAt(2);
    var fourthFib = fib.ElementAt(3);
    var fifthFib = fib.ElementAt(4);

    if (thirdFib + fourthFib != fifthFib)
       throw new ApplicationException("Leonardo Fibonacci is spinning in his grave right now!!");


    ' VB - Ensure that the third and fourth Fibonacci numbers sum up to the fifth number
    Dim thirdFib = fib.ElementAt(2)
    Dim fourthFib = fib.ElementAt(3)
    Dim fifthFib = fib.ElementAt(4)

    If thirdFib + fourthFib <> fifthFib Then
       Throw New ApplicationException("Leonardo Fibonacci is spinning in his grave right now!!")
    End If
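
    One practical point when using the element operators is what happens when the requested element does not exist. The sketch below is not taken from the article's download; it uses an int array and shows that First throws on an empty sequence, whereas the FirstOrDefault and ElementAtOrDefault operators (also defined in the Enumerable class) return the type's default value instead. ElementOperatorDemo and FirstThrows are names invented for this example.

```csharp
using System;
using System.Linq;

public static class ElementOperatorDemo
{
    // Returns true if First() throws for the given sequence.
    public static bool FirstThrows(int[] source)
    {
        try
        {
            source.First();
            return false;
        }
        catch (InvalidOperationException)
        {
            return true;
        }
    }

    public static void Main()
    {
        int[] fib = { 1, 1, 2, 3, 5 };
        int[] empty = new int[0];

        Console.WriteLine(fib.First());               // 1
        Console.WriteLine(fib.Last());                // 5
        Console.WriteLine(fib.ElementAt(2));          // 2 (the index is zero-based)

        Console.WriteLine(FirstThrows(empty));        // True
        Console.WriteLine(empty.FirstOrDefault());    // 0
        Console.WriteLine(fib.ElementAtOrDefault(9)); // 0
    }
}
```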

    Ordering Operators


    The .NET Framework includes query operators for ordering enumerations. The OrderBy operator orders an enumeration in ascending order; OrderByDescending orders an enumeration in descending order. When ordering an enumeration you must provide a method as an input parameter to the operator that specifies the field by which the elements in the sequence are to be ordered. For example, if you have a list of Employee objects and you want to order them by salary in ascending order, you could use code like the following:

    // C# - order the employees by salary, from lowest to highest
    List<Employee> emps = ...;
    var leastToMostExpensive = emps.OrderBy( p => p.Salary );


    ' VB - order the employees by salary, from lowest to highest
    Dim emps As List(Of Employee) = ...
    Dim leastToMostExpensive = emps.OrderBy( Function(p) p.Salary )

    The method passed into the OrderBy operator indicates that each Employee object should be ordered by the Salary property.

    If you are ordering a sequence of simple values by the values themselves (such as ordering a list of integers or an array of strings) you still need to pass in a method indicating the value to order on, but the format would look like x => x or Function(x) x. For example, to order a Fibonacci object in descending order you'd do:

    // C# - Create a Fibonacci object holding the first 10 Fibonacci numbers
    Fibonacci fib = new Fibonacci(10);

    // Order in descending order
    var bigToSmall = fib.OrderByDescending(x => x);


    ' VB - Create a Fibonacci object holding the first 10 Fibonacci numbers
    Dim fib As New Fibonacci(10)

    ' Order in descending order
    Dim bigToSmall = fib.OrderByDescending(Function(x) x)

    The ordering operators include an overload where you can pass in a comparer, an object implementing IComparer<T> that, given two key values, specifies how the two relate - whether they are equal and, if not, which one comes before the other. If provided, this comparer is used by the ordering operators. You must provide a comparer if the type of the field you are ordering by does not have a built-in comparison, that is, if it does not implement IComparable. (Types like integers, strings, and dates already have comparisons defined in the .NET Framework.)
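
    As a rough illustration of the comparer overload, the following sketch orders an array of names two ways: with a hand-written comparer and with one of the framework's built-in string comparers. LengthComparer and ComparerDemo are hypothetical names; the comparer assumes the strings are not null.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical comparer: shorter strings sort before longer ones.
public class LengthComparer : IComparer<string>
{
    public int Compare(string x, string y)
    {
        return x.Length.CompareTo(y.Length);
    }
}

public static class ComparerDemo
{
    // Orders by the string itself, using the custom comparer.
    public static string[] ByLength(IEnumerable<string> names)
    {
        return names.OrderBy(n => n, new LengthComparer()).ToArray();
    }

    // Orders by the string itself, ignoring letter case.
    public static string[] IgnoringCase(IEnumerable<string> names)
    {
        return names.OrderBy(n => n, StringComparer.OrdinalIgnoreCase).ToArray();
    }

    public static void Main()
    {
        string[] names = { "bob", "Charlotte", "Al", "dave" };

        Console.WriteLine(string.Join(", ", ByLength(names)));     // Al, bob, dave, Charlotte
        Console.WriteLine(string.Join(", ", IgnoringCase(names))); // Al, bob, Charlotte, dave
    }
}
```

    In a case this simple, names.OrderBy(n => n.Length) would give the same result as the custom comparer; a comparer earns its keep when the ordering rule cannot be expressed as a single sortable key.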

    Partitioning Operators


    Previous installments looked at the Where operator, which enables a developer to specify a condition and filter out all elements from a sequence that do not meet that condition. We'll look at the Where operator momentarily, but before we do let's first focus on the partitioning operators. The partitioning operators divide the sequence into two partitions with a "left partition" and a "right partition." The two simplest partitioning operators are Skip and Take, which skip over the first n elements or take the first n elements. The following code snippet shows how to use Skip to skip over the first three Fibonacci numbers.

    // C# - Create a Fibonacci object holding the first 10 Fibonacci numbers
    Fibonacci fib = new Fibonacci(10);

    // Ignore the first three numbers
    var fibWithFirstThreeRemoved = fib.Skip(3);


    'VB - Create a Fibonacci object holding the first 10 Fibonacci numbers
    Dim fib As New Fibonacci(10)

    'Ignore the first three numbers
    Dim fibWithFirstThreeRemoved = fib.Skip(3)

    The fibWithFirstThreeRemoved enumeration (currently) contains the 4th, 5th, 6th, 7th, 8th, 9th, and 10th Fibonacci numbers.
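
    A common use of Skip and Take together is paging through a sequence. The sketch below is one possible way to do it, not something from the article's download; PagingDemo, GetPage, pageIndex, and pageSize are names chosen for this example.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class PagingDemo
{
    // Returns one page of results; pageIndex is zero-based.
    public static List<int> GetPage(IEnumerable<int> source, int pageIndex, int pageSize)
    {
        return source.Skip(pageIndex * pageSize).Take(pageSize).ToList();
    }

    // Helper that formats a page as a comma-separated string.
    public static string Format(IEnumerable<int> page)
    {
        return string.Join(", ", page.Select(n => n.ToString()).ToArray());
    }

    public static void Main()
    {
        // The numbers 1 through 10.
        IEnumerable<int> numbers = Enumerable.Range(1, 10);

        Console.WriteLine(Format(GetPage(numbers, 0, 3))); // 1, 2, 3
        Console.WriteLine(Format(GetPage(numbers, 2, 3))); // 7, 8, 9
        Console.WriteLine(Format(GetPage(numbers, 3, 3))); // 10
    }
}
```

    Take does not complain when fewer elements remain than were requested, which is why the last page simply comes back shorter.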

    The SkipWhile and TakeWhile operators skip or take elements from the start of the sequence for as long as some condition remains true. We could replace the above Skip(3) operator with the SkipWhile operator like so:

    // C# - Skip over the Fibonacci numbers while the current number is less than or equal to 2
    var fibWithFirstThreeRemoved = fib.SkipWhile(n => n <= 2);


    ' VB - Skip over the Fibonacci numbers while the current number is less than or equal to 2
    Dim fibWithFirstThreeRemoved = fib.SkipWhile(Function(n) n <= 2)

    Keep in mind that the n in the lambda expression is the current Fibonacci number being evaluated and does not have any bearing on the index of the element in the sequence. The first four Fibonacci numbers are 1, 1, 2, and 3. The SkipWhile operator evaluates each element from the beginning and skips over it if the method evaluates to True. Therefore, it skips over the first three elements - 1, 1, and 2 - but not the fourth - 3 - because the first three are less than or equal to 2, but the fourth one is not. Once the condition fails, SkipWhile stops testing and returns every remaining element.
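
    The way SkipWhile and TakeWhile stop at the first failing element is easiest to see on a sequence that is not sorted. The following sketch (WhileDemo and Format are illustrative names) compares them with Where, which tests every element.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class WhileDemo
{
    // Helper that formats a sequence as a comma-separated string.
    public static string Format(IEnumerable<int> values)
    {
        return string.Join(", ", values.Select(n => n.ToString()).ToArray());
    }

    public static void Main()
    {
        // Deliberately unsorted: a small value (1) appears after the 3.
        int[] numbers = { 1, 1, 2, 3, 1, 8 };

        // Skips 1, 1, 2; stops testing at 3 and returns everything after.
        Console.WriteLine(Format(numbers.SkipWhile(n => n <= 2))); // 3, 1, 8

        // Takes 1, 1, 2; stops for good at 3.
        Console.WriteLine(Format(numbers.TakeWhile(n => n <= 2))); // 1, 1, 2

        // Where tests every element, so the later 1 is filtered out.
        Console.WriteLine(Format(numbers.Where(n => n > 2)));      // 3, 8
    }
}
```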

    Restriction (Filtering) Operators


    The standard query operators include a single restriction (or filtering) operator: Where. The Where operator accepts a method as its input that specifies the condition for inclusion. When enumerated, the operator applies the condition to each element in its source; if the condition holds, the element is included in the result set, otherwise it is filtered out.

    The following snippet starts by getting the list of files in the current folder. It then uses the Where operator along with the Count and Sum operators to glean information about the amount of space taken up by the files and by certain types of files. (For more information on how to programmatically work with the file system from an ASP.NET page, be sure to consult the System.IO namespace FAQs over on ASPFAQs.com.)

    // C# - Get the list of files in the current folder
    DirectoryInfo dirInfo = new DirectoryInfo(Path.GetDirectoryName(Request.PhysicalPath));
    FileInfo[] fInfo = dirInfo.GetFiles();

    var fileCount = fInfo.Count();
    var totalDiskSpace = fInfo.Sum(f => f.Length);
    var aspxFiles = fInfo.Where(f => Path.GetExtension(f.Name.ToLower()) == ".aspx");
    var aspxFileCount = aspxFiles.Count();
    var aspxFileDiskSpace = aspxFiles.Sum(f => f.Length);


    ' VB - Get the list of files in the current folder
    Dim dirInfo As New DirectoryInfo(Path.GetDirectoryName(Request.PhysicalPath))
    Dim fInfo() As FileInfo = dirInfo.GetFiles()

    Dim fileCount = fInfo.Count()
    Dim totalDiskSpace = fInfo.Sum(Function(f) f.Length)
    Dim aspxFiles = fInfo.Where(Function(f) Path.GetExtension(f.Name.ToLower()) = ".aspx")
    Dim aspxFileCount = aspxFiles.Count()
    Dim aspxFileDiskSpace = aspxFiles.Sum(Function(f) f.Length)

    The above code starts by retrieving information about all of the files in the folder that the currently executing ASP.NET page resides in. It then uses the Count and Sum operators to get the number of files and the total file size. Note that the Sum call is passed a selector method. The elements of the fInfo sequence are FileInfo objects. One of the properties of the FileInfo object is Length, which returns the size of the file in bytes. Therefore, we call the Sum operator and supply a method that returns the field to sum, namely Length.

    Next, the Where operator is used to get only those files that have the extension ".aspx". The Count and Sum operators are applied to this query to get the count and total file size of the .aspx pages in the folder.
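
    To experiment with the same Where, Count, and Sum combination outside of an ASP.NET page, the sketch below swaps the file system for an in-memory list. The FileEntry class and the sample file names and sizes are made up for this example; only Path.GetExtension and the query operators come from the .NET Framework.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;

// Hypothetical stand-in for FileInfo: just a name and a size in bytes.
public class FileEntry
{
    public string Name { get; set; }
    public long Length { get; set; }
}

public static class FileStatsDemo
{
    // Made-up sample data used instead of a real folder.
    public static List<FileEntry> Sample()
    {
        return new List<FileEntry>
        {
            new FileEntry { Name = "Default.aspx", Length = 1200 },
            new FileEntry { Name = "About.ASPX",   Length = 800 },
            new FileEntry { Name = "Web.config",   Length = 500 },
            new FileEntry { Name = "logo.png",     Length = 3000 }
        };
    }

    // Filters the files down to .aspx pages, ignoring letter case.
    public static IEnumerable<FileEntry> AspxFiles(IEnumerable<FileEntry> files)
    {
        return files.Where(f => string.Equals(
            Path.GetExtension(f.Name), ".aspx", StringComparison.OrdinalIgnoreCase));
    }

    public static void Main()
    {
        List<FileEntry> files = Sample();
        IEnumerable<FileEntry> aspxFiles = AspxFiles(files);

        Console.WriteLine(files.Count());                // 4
        Console.WriteLine(files.Sum(f => f.Length));     // 5500
        Console.WriteLine(aspxFiles.Count());            // 2
        Console.WriteLine(aspxFiles.Sum(f => f.Length)); // 2000
    }
}
```

    Because aspxFiles is a lazy query, the filter runs twice here, once for Count and once for Sum. Appending ToList to the Where call would evaluate it a single time.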

    Conclusion


    LINQ includes a host of standard query operators, which are built-in operators that perform some calculation or modification to a sequence. The standard query operators can be broken down into various types, such as aggregation, conversion, element, grouping, joining, projection, and restriction types, among others. This article looked at a variety of standard query operators and showed them in action. The download available at the end of this article includes a handful of demos.

    The standard query operator examples in this article (and in the download) use the extension method syntax, such as: SequenceObject.Operator, or fib.Count(). An Introduction to LINQ noted that LINQ has a unique query syntax that allows you to use query operators in a SQL-like syntax. The next installment will explore LINQ's query syntax, which is what enables developers to write SQL-like queries in C# and Visual Basic syntax.

    Happy Programming!



    Attachments:


  • Download the code associated with this article series

    Further Reading:


  • Enumerable Class (technical docs)
  • LINQ's Standard Query Operators
  • The Standard LINQ Operators
  • 101 LINQ Samples
  • Standard Query Operators with LINQ
  • LINQ's Standard Query Operators (PDF)