Published: Wednesday, February 11, 2009

An Extensive Examination of LINQ: An Introduction to LINQ

By Scott Mitchell


A Multipart Series on LINQ
This article is one in a series of articles on LINQ, which was introduced with .NET version 3.5.

  • An Introduction to LINQ - provides an overview of the purpose of LINQ, its design goals, and core components.
  • Extension Methods, Implicitly Typed Variables, and Object Initializers - looks at three language enhancements to VB and C# that, in part, allow for LINQ's unique syntax and functionality.
  • Lambda Expressions and Anonymous Types - explores two more language enhancements to VB and C# that permit LINQ's unique syntax and functionality.
  • The Ins and Outs of Query Operators - learn how query operators provide a universal approach to querying and modifying enumerable collections of data.
  • The Standard Query Operators - explore LINQ's standard query operators, a suite of built-in query operators for working with enumerable data.
  • Using the Query Syntax - learn how to write and use C# and Visual Basic's new query syntax, which lets you write LINQ queries using SQL-like syntax.
  • Grouping and Joining Data - examines the standard query operators and query syntax used to group and join data.
  • Introducing LINQ to XML - provides an overview of working with XML data using the LINQ to XML API.
  • Querying and Searching XML Documents Using LINQ to XML - examines querying and filtering XML documents using the LINQ to XML API.
  • Extending LINQ - Adding Query Operators - shows how to extend the functionality of LINQ by adding your own query operators.

    Introduction


    LINQ, or Language INtegrated Query, is a set of classes added to the .NET Framework 3.5 along with language enhancements added to C# 3.0 and Visual Basic 9, the versions of these languages that ship with Visual Studio 2008. LINQ adds a rich, standardized query syntax as a first-class citizen in .NET programming languages, allowing developers to interact with any type of data.

    Consider a typical data-driven application. There may be times when you are working with a database, displaying records or editing, inserting, and deleting data. Certain parts of the application may require retrieving certain elements from an XML file, or constructing an XML file based on user input. Or perhaps you have a collection of objects returned from a business object that you now want to work with by sorting them, computing the average value of a particular numeric property, and displaying only those objects that meet specified criteria. Prior to LINQ, working with each data source required writing a different style of code. Moreover, working with external resources like databases, XML files, and the like typically involved communicating with that external resource in some syntax specific to that resource. To retrieve data from a database you need to send it a string that contains the SQL query to execute; likewise, working with a subset of the elements in an XML document involves specifying an XPath expression in the form of a string. The idea is that using LINQ you can work with disparate data sources using a similar style, without having to know a separate syntax for communicating with each data source (e.g., SQL or XPath) and without having to resort to passing opaque strings to external resources.

    This article is the first in a series of articles that explores the goals of LINQ, its underpinnings, its syntax, and LINQ providers like LINQ to Objects, LINQ to XML, LINQ to SQL, and so forth. This inaugural article offers an overview of LINQ, looks at some simple examples of using the LINQ classes and syntax, and examines the core LINQ classes in the .NET Framework. Read on to learn more!


    The Case for LINQ


    Many applications use an external resource in some form or another, the most common one being a database. Because of the physical and logical separation between the runtime executing a program and the external resource, there are bound to be a number of extra steps that the developer working with the resource has to perform. What's more, the information passed to the external resource and the information received from it usually must undergo some transformation. The extra work involved in communicating with an external resource is best seen through an example. Imagine that you are working on a data-driven web application and are in the midst of building the Data Access Layer (DAL), working on a routine that sends a query to the database, populates the results into a collection of business objects, and returns this collection. The code for this method might look like the following:

    // C#
    public List<Employee> GetEmployeesInDepartment(int departmentId)
    {
       // Connect to database
       SqlConnection myConnection = new SqlConnection(connectionString);
       myConnection.Open();
       
       // Issue query
       const string sql = "SELECT * FROM Employees WHERE DepartmentID = @DepartmentID";
       SqlCommand myCommand = new SqlCommand(sql, myConnection);
       myCommand.Parameters.AddWithValue("@DepartmentID", departmentId);
       
       // Get reader back
       SqlDataReader myReader = myCommand.ExecuteReader();
       
       // Populate list of Employee objects
       List<Employee> emps = new List<Employee>();
       while (myReader.Read())
       {
          Employee emp = Employee.Populate(myReader);
          emps.Add(emp);
       }
       
       myReader.Close();   // Clean up
       myConnection.Close();

       return emps;   // Return employees
    }


    ' VB
    Public Function GetEmployeesInDepartment(ByVal departmentId As Integer) As List(Of Employee)
       ' Connect to database
       Dim myConnection As New SqlConnection(connectionString)
       myConnection.Open()
       
       ' Issue query
       Const sql As String = "SELECT * FROM Employees WHERE DepartmentID = @DepartmentID"
       Dim myCommand As New SqlCommand(sql, myConnection)
       myCommand.Parameters.AddWithValue("@DepartmentID", departmentId)
       
       ' Get reader back
       Dim myReader As SqlDataReader = myCommand.ExecuteReader()
       
       ' Populate list of Employee objects
       Dim emps As New List(Of Employee)
       While myReader.Read()
          Dim emp As Employee = Employee.Populate(myReader)
          emps.Add(emp)
       End While
       
       myReader.Close()   ' Clean up
       myConnection.Close()

       Return emps   ' Return employees
    End Function

    In order to send a query to the database we must first establish a connection to the database. We then must encode the logic - the SQL query, its parameters, and the parameters' values - into strings that are supplied to the SqlCommand object. And because these inputs are encoded into opaque data (strings, for instance), there is no compile-time error checking and very limited debugging support. For example, if there's a typo in the SELECT query causing the Employees table name to be misspelled, this typo won't surface until runtime, when the page is visited. (And typos are easy to make seeing as there's no IntelliSense support.) Ideally, Visual Studio would display an error message alerting us to this incorrect table name when building the application. Another mismatch between the programming language and the database is that the data returned by the database is made accessible to us through the SqlDataReader, but not as the strongly-typed objects we'd like. To get this data into strongly-typed objects we must write code ourselves that enumerates the database results and populates each record into a corresponding object.

    LINQ was designed to address the issues illustrated by the example above. LINQ aims to offer a unified syntax for working with data, be it data from a database, an XML file, or a collection of objects. With LINQ you don't need to know the intricacies of SQL, the ins and outs of XPath, or various ways to work with a collection of objects. All you need to be familiar with is LINQ's classes and the associated language enhancements centered around LINQ. This leads into another design goal of LINQ: to add first-class constructs to C# and Visual Basic that allow for SQL SELECT-like syntax for querying any data source. In other words, LINQ aims to move the SELECT statement out of an opaque string and into keywords in the language, a move that allows for type safety, IntelliSense support, compile-time error checking, and enhanced debugging scenarios.
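
    To make that goal concrete, here is a minimal sketch of the department filter from the earlier method, written as a LINQ query over an in-memory collection instead of a SQL string. The Employee class, its property names, and the sample data are assumptions made purely for illustration; running such a query against a real database would additionally require a LINQ provider such as LINQ to SQL.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical business object; the property names are assumed for this sketch
public class Employee
{
    public string Name { get; set; }
    public int DepartmentID { get; set; }
}

public static class EmployeeQueryDemo
{
    // The filter is expressed with language keywords rather than an opaque string,
    // so a misspelled property name is reported by the compiler
    public static List<Employee> GetEmployeesInDepartment(IEnumerable<Employee> employees, int departmentId)
    {
        return (from emp in employees
                where emp.DepartmentID == departmentId
                select emp).ToList();
    }

    public static void Main()
    {
        // Sample data, made up for illustration
        List<Employee> all = new List<Employee>
        {
            new Employee { Name = "Ann", DepartmentID = 1 },
            new Employee { Name = "Bob", DepartmentID = 2 },
            new Employee { Name = "Cho", DepartmentID = 1 }
        };

        // Prints Ann and Cho, the two employees in department 1
        foreach (Employee emp in GetEmployeesInDepartment(all, 1))
            Console.WriteLine(emp.Name);
    }
}
```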

    How Is LINQ Implemented?


    LINQ was introduced in the .NET Framework 3.5 (the Visual Studio 2008 cycle) and is composed of three main components:
    • Standard Query Operators - a set of extension methods in the .NET Framework that can be used to work with any collection of objects that implements the IEnumerable<T> interface. A class that implements the IEnumerable<T> interface must provide an enumerator for iterating over a collection of a specific type (T). All arrays inherently implement IEnumerable<T>, as do most of the built-in collection objects like List<T>, Dictionary<K,T>, and so forth. Using these operators you can: filter the results; perform aggregate operations like sum, min, max, and average; join two collections based on matching keys; order the results; group the results; determine the total number of elements in the collection; and so forth.

      Specifically, these extension methods are defined in the System.Core.dll assembly in the System.Linq namespace. We'll look at an example of using the standard query operators later on in this article.

    • Language Extensions - to make the standard query operators easier to use, and to offer a more SQL-like syntax in C# and Visual Basic, Microsoft added a number of new extensions to C# 3.0 and Visual Basic 9. These extensions include implicitly typed variables, anonymous types, object initializers, and lambda expressions; each extension will be explored in detail in future installments. What's important to understand is that these extensions are simply syntactic sugar. Behind the scenes, the compiler converts the syntax made possible by these enhancements into calls to the standard query operators. We'll look at an example of using the language extensions later on in this article.

    • LINQ Providers - it is possible to create a class known as a LINQ Provider that takes a LINQ query, examines it, and dynamically generates a method that executes an equivalent query against a specific data source. The .NET Framework ships with four LINQ Providers: LINQ to Objects, which executes a LINQ query against a collection of objects; LINQ to XML, for querying XML documents; LINQ to SQL, which allows LINQ queries to operate against a Microsoft SQL Server database; and LINQ to DataSets, which executes LINQ queries against ADO.NET DataSets.

      In addition to these four providers there are other LINQ Providers available. Microsoft has created a LINQ Provider that operates against its Entity Framework, for example, as well as one that operates against ADO.NET Data Services. And many open-source projects and third-party companies that offer some sort of data store or middle-tier library for working with data have a LINQ Provider so that LINQ queries can be executed against their data store or middle-tier implementation. For example, there's a LINQ Provider for NHibernate, an open-source Object/Relational Mapping (O/RM) tool.
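
    As a quick illustration of the first component, the sketch below applies a few standard query operators to a List<T> and a Dictionary<TKey, TValue>, both of which implement IEnumerable<T>. The class name, method name, and sample values are invented for this example.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class OperatorsDemo
{
    // Returns the names of the items with a positive quantity, sorted by name.
    // A dictionary enumerates KeyValuePair<TKey, TValue> elements.
    public static List<string> InStock(Dictionary<string, int> stock)
    {
        return stock.Where(kv => kv.Value > 0)
                    .OrderBy(kv => kv.Key)
                    .Select(kv => kv.Key)
                    .ToList();
    }

    public static void Main()
    {
        // Aggregate operators on a List<int>
        List<int> scores = new List<int> { 70, 85, 90, 55 };
        Console.WriteLine(scores.Max());                 // 90
        Console.WriteLine(scores.Sum());                 // 300
        Console.WriteLine(scores.Count(s => s >= 70));   // 3

        // Filtering and ordering a Dictionary<string, int>
        Dictionary<string, int> stock = new Dictionary<string, int>();
        stock.Add("plums", 7);
        stock.Add("pears", 0);
        stock.Add("apples", 12);

        // Prints apples, then plums
        foreach (string name in InStock(stock))
            Console.WriteLine(name);
    }
}
```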

    Using the Standard Query Operators Against a Collection of Objects


    The standard query operators are implemented as a number of extension methods on the IEnumerable<T> interface. That means that if we have an object that implements IEnumerable<T> at our disposal we can use the variety of standard query operators to work with that collection. As noted earlier, all arrays in .NET implement IEnumerable<T>. Therefore, let's take an array and practice using these standard query operators.

    Let's start with a simple example. (All of the LINQ examples examined in this tutorial are provided in both C# and VB code and are available for download at the end of this article.) The following code creates an array that contains the first nine Fibonacci numbers. Two of the standard query operators are then used: Count() and Average(), which return the number of elements in the collection and the average value of the elements in the collection, respectively. These values are then displayed in a Label.

    // C#: Create an array of integers
    int[] fibNum = {1, 1, 2, 3, 5, 8, 13, 21, 34};

    // Use the Count Standard Query Operator to determine how many elements are in the collection
    int totalNumberOfElements = fibNum.Count();

    // Use the Average Standard Query Operator to determine the average value
    double averageValue = fibNum.Average();

    // Output the values...
    Results.Text = String.Format("The first {0} elements of Fibonacci sequence have an average value of {1:N2}!", totalNumberOfElements, averageValue);


    ' VB: Create an array of integers
    Dim fibNum() As Integer = {1, 1, 2, 3, 5, 8, 13, 21, 34}

    'Use the Count Standard Query Operator to determine how many elements are in the collection
    Dim totalNumberOfElements As Integer = fibNum.Count()

    'Use the Average Standard Query Operator to determine the average value
    Dim averageValue As Double = fibNum.Average()

    'Output the values...
    Results.Text = String.Format("The first {0} elements of Fibonacci sequence have an average value of {1:N2}!", totalNumberOfElements, averageValue)

    The page, when visited through a browser, displays the output: "The first 9 elements of Fibonacci sequence have an average value of 9.78!"

    As evidenced by the example above, the Count() and Average() standard query operators do not require any input parameters. Other operators, such as the Where operator, require an input; in the case of the Where operator you must supply information as to how the data is to be filtered. But what sort of input parameter would a Where operator require? Imagine if you were writing a function that was supposed to filter a collection of objects based on a filtering condition supplied through an input parameter to the function. How would you write that function given that you don't even know what type of collection of objects you are going to be filtering in the first place!?

    In short, the Where operator must accept a function as input. The Where operator will then call the passed-in function, passing it each element in the collection of objects, and asking that function, "Should this item be included in the results?" Similarly, many other standard query operators require that a function be passed in as an input parameter. For example, the OrderBy operator must be passed a function that indicates what field each object in the collection is to be sorted by. It's a complex concept to wrap your head around at first.
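
    One way to picture this is to sketch a filtering method of our own. The Filter extension method below is a hypothetical stand-in written for illustration, not the Framework's actual implementation of Where; it shows how accepting a function lets a single method filter a collection of any type.

```csharp
using System;
using System.Collections.Generic;

public static class FilterSketch
{
    // Hypothetical stand-in for Where: it works for any element type T because
    // the caller supplies the test as a function that takes a T and returns a bool
    public static IEnumerable<T> Filter<T>(this IEnumerable<T> source, Func<T, bool> keep)
    {
        foreach (T item in source)
        {
            if (keep(item))          // Ask the passed-in function about each element
                yield return item;   // Only the elements it accepts are returned
        }
    }

    // An ordinary named method that can be passed in as the filtering function
    public static bool IsOdd(int num)
    {
        return num % 2 == 1;
    }

    public static void Main()
    {
        int[] fibNum = { 1, 1, 2, 3, 5, 8, 13, 21, 34 };

        // Prints: 1 1 3 5 13 21
        foreach (int num in fibNum.Filter(IsOdd))
            Console.Write(num + " ");
    }
}
```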

    .NET has long allowed developers to reference functions as a variable of sorts and to pass that variable to methods. Doing so involves creating the function as you normally would and then creating a delegate that references the function. This delegate can then be passed around and used to invoke the function it points to. Some of the language extensions added in C# 3.0 and Visual Basic 9 were added specifically to make it possible to tersely create a function so that it can be called or passed into another method in just one line of code. (As noted earlier, we'll explore these language extensions in greater detail in future installments.) You can see this new syntax in action in the following example, which uses the Where operator to filter the first nine Fibonacci numbers to compute the average of only the odd numbers.

    // C#: Create an array of integers
    int[] fibNum = { 1, 1, 2, 3, 5, 8, 13, 21, 34 };

    // Use the Count Standard Query Operator to determine how many elements are in the collection
    int totalNumberOfElements = fibNum.Count();

    // Use the Where operator to get the odd Fibonacci numbers and average those
    double averageValue = fibNum.Where(num => num % 2 == 1).Average();

    // Output the values...
    Results.Text = String.Format("Of the first {0} elements of Fibonacci sequence the odd numbers have an average value of {1:N2}!", totalNumberOfElements, averageValue);


    'VB: Create an array of integers
    Dim fibNum() As Integer = {1, 1, 2, 3, 5, 8, 13, 21, 34}

    'Use the Count Standard Query Operator to determine how many elements are in the collection
    Dim totalNumberOfElements As Integer = fibNum.Count()

    'Use the Where operator to get the odd Fibonacci numbers and average those
    Dim averageValue As Double = fibNum.Where(Function(num) num Mod 2 = 1).Average()

    'Output the values...
    Results.Text = String.Format("Of the first {0} elements of Fibonacci sequence the odd numbers have an average value of {1:N2}!", totalNumberOfElements, averageValue)

    The output of the code, when viewed through a browser, is: "Of the first 9 elements of Fibonacci sequence the odd numbers have an average value of 7.33!"

    Most of the code is the same as the first example, but on the line where the average is computed I've added a call to the Where operator. The Where operator expects a function that takes as input an element of the collection being filtered and returns a Boolean indicating whether that element belongs in the return set. In short, the function passed into the Where operator returns true if the number being enumerated MOD 2 equals 1. The operation X MOD Y returns the remainder of X / Y, so for the non-negative numbers used here X MOD 2 returns 0 if X is even and 1 if X is odd.

    Also note how the standard query operators can be chained together, one after the other. I have: fibNum.Where(function).Average(), which first applies the Where operator to the fibNum array and then takes the resulting filtered set and applies the Average operator to that.
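
    Chains can be longer than two operators. The following sketch (the class and method names are made up for this example) filters the same array down to its even numbers, sorts them from largest to smallest with OrderByDescending, and then totals them.

```csharp
using System;
using System.Linq;

public static class ChainingDemo
{
    // Where feeds its filtered results to OrderByDescending, whose sorted
    // results are then copied into a new array by ToArray
    public static int[] EvensDescending(int[] numbers)
    {
        return numbers.Where(num => num % 2 == 0)
                      .OrderByDescending(num => num)
                      .ToArray();
    }

    public static void Main()
    {
        int[] fibNum = { 1, 1, 2, 3, 5, 8, 13, 21, 34 };
        int[] evens = EvensDescending(fibNum);

        // Prints: 34 8 2
        foreach (int num in evens)
            Console.Write(num + " ");

        Console.WriteLine();
        Console.WriteLine(evens.Sum());   // 44
    }
}
```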

    Using the Language Extensions to Write More SQL-Like Queries


    In addition to the standard query operators, Microsoft added a host of language extensions to C# and Visual Basic to allow for a more SQL-like syntax in working with LINQ. We'll explore this new syntax in a future article. For now I want to just show the syntax so you can see and appreciate how LINQ does truly offer SQL-like syntax in C# and Visual Basic. The following example is the same as the last one - it takes the first nine Fibonacci numbers and computes the average of the odd numbers - but does so using the language extensions rather than the extension methods.

    // C#: Create an array of integers
    int[] fibNum = { 1, 1, 2, 3, 5, 8, 13, 21, 34 };

    // Use the Count Standard Query Operator to determine how many elements are in the collection
    int totalNumberOfElements = fibNum.Count();

    // Use the query syntax to get the odd Fibonacci numbers and average those
    double averageValue =
       (from num in fibNum
        where num % 2 == 1
        select num).Average();

    // Output the values...
    Results.Text = String.Format("Of the first {0} elements of Fibonacci sequence the odd numbers have an average value of {1:N2}!", totalNumberOfElements, averageValue);


    'VB: Create an array of integers
    Dim fibNum() As Integer = {1, 1, 2, 3, 5, 8, 13, 21, 34}

    'Use the Count Standard Query Operator to determine how many elements are in the collection
    Dim totalNumberOfElements As Integer = fibNum.Count()

    'Compute the average of the odd Fibonacci numbers
    Dim averageValue As Double = _
          Aggregate num In fibNum _
          Where num Mod 2 = 1 _
          Into Average()


    'Output the values...
    Results.Text = String.Format("Of the first {0} elements of Fibonacci sequence the odd numbers have an average value of {1:N2}!", totalNumberOfElements, averageValue)

    These language extensions offer full IntelliSense support, compile-time type and syntax checking, and debugging capabilities. As you can probably surmise by inspecting the code, the From statement enumerates a specified collection of objects (fibNum, in this case) and can have other operators applied to it, such as Where and Select and Average.
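
    Because the query syntax is compiled into calls to the standard query operators, a query can be written either way with the same result. The sketch below writes one query in both forms and compares the output; the class name, method names, and the particular filter and projection are chosen only for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class QueryTranslationDemo
{
    // Query syntax: where, orderby, and select are language keywords
    public static IEnumerable<int> WithQuerySyntax(int[] numbers)
    {
        return from num in numbers
               where num > 5
               orderby num descending
               select num * 2;
    }

    // The same query written directly against the standard query operators
    public static IEnumerable<int> WithMethodSyntax(int[] numbers)
    {
        return numbers.Where(num => num > 5)
                      .OrderByDescending(num => num)
                      .Select(num => num * 2);
    }

    public static void Main()
    {
        int[] fibNum = { 1, 1, 2, 3, 5, 8, 13, 21, 34 };

        // Prints: 68 42 26 16
        foreach (int num in WithQuerySyntax(fibNum))
            Console.Write(num + " ");

        Console.WriteLine();

        // Both forms yield the same sequence, so this prints True
        Console.WriteLine(WithQuerySyntax(fibNum).SequenceEqual(WithMethodSyntax(fibNum)));
    }
}
```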

    Conclusion


    This article is the first in a series of articles on LINQ, its syntax, its uses, and LINQ Providers, like LINQ to XML and LINQ to SQL. In this installment we looked at the motivation for LINQ and its key design goals. We then talked about the three main cornerstones of LINQ - the standard query operators, the language extensions, and LINQ providers. This was followed by a look at some simple examples using the standard query operators and language extensions.

    We've only begun to scratch the surface of LINQ, and I'm sure that this article has raised many more questions than it has answered. In the upcoming tutorials we will explore the C# and Visual Basic language enhancements that make LINQ possible: extension methods, anonymous types, lambda expressions, and so forth. Once we have a solid understanding of these concepts we'll be ready to delve into how to use LINQ to query databases, XML files, and other data stores.

    Happy Programming!



    Attachments:


  • Download the code associated with this article series

    Further Reading:


  • LINQ: .NET Language-Integrated Query
  • LINQ Introduction
  • LINQ in Action (a book by Fabrice Marguerie, Steve Eichert, and Jim Wooley)