When you think ASP, think...
Recent Articles
All Articles
ASP.NET Articles
ASPFAQs.com
Message Board
Related Web Technologies
User Tips!
Coding Tips

Sections:
Sample Chapters
Commonly Asked Message Board Questions
JavaScript Tutorials
MSDN Communities Hub
Official Docs
Security
Stump the SQL Guru!
XML Info
Information:
Feedback
Author an Article
Technology Jobs
ASP ASP.NET ASP FAQs Message Board Feedback ASP Jobs
Print this page.
Published: Wednesday, October 27, 2004

The Power of SQL CASE Statements

By Scott Mitchell


Introduction
As developers creating data-driven Web applications, we've all written more than our fair share of SQL statements. If I had a nickel for every SELECT statement or UPDATE statement I've written in my six years as a Web developer, I'd be a rich man. Despite the familiarity with SQL that many Web developers have, one of the lesser known SQL tools is the CASE statement. The CASE statement is a wily little fellow that can be added to both the SELECT portion of a SELECT statement as well as the ORDER BY portion. With CASE you can easily group data into various ranges, you can beautify the results of a SQL query, and can allow for dynamic sorting of the query's results. CASE statements can also be used to give subtitles to ROLLUP and CUBE queries, and can be used in computed columns to boot.

In this article we'll take a look at the powerful CASE statement, shining light on its utility and usefulness in everyday data-driven Web sites. Read on to learn more!

- continued -

'

CASE Basics
All programming languages contain conditional statements to allow program flow to be altered based on the result of some conditional statement, the simplest form of a conditional statement typically being an If statement. One downside of If statements is that if there are a number of different actions that must be taken based on some variable value, the number of If statements can explode, leading to bloated, unreadable code. For example, imagine that you wanted to do some different action based on the hour of the day. If a different action was performed for each hour of the day, you would end up having 24 If/ElseIf statements like so:

If Hour(Now()) = 0 Then
  'Do whatever needs to be done at midnight
ElseIf Hour(Now()) = 1 Then
  'Do whatever needs to be done at 1:00 am
ElseIf Hour(Now()) = 2 Then
  ...
...
End If

The multiple If statements detract from the readability of the code. Also, the maintainability of the code suffers. Imagine that you needed to change the code so that task x was run at both 3:00 am and 9:00 am. This would require picking your way through the If statements and adding additional conditional statements where needed.

Many programming languages provide case statements: Visual Basic offers a Select Case, C#/C/C++/Java the switch statement. These case statements allow for the myriad of If statements to be compressed into a more readable block of code. An example Select Case statement can be seen below:

Select Case Hour(Now())
  Case 0
    'Do whatever needs to be done at midnight
  Case 1
    'Do whatever needs to be done at 1:00 am
  Case 2
    ...
  ...
End Select

Case Statements are More Performant
In addition to improved readability, case statements also provided an opportunity for optimization. For a more detailed discussion on case statement optimizations check out Switch Statement Performance Consideration for a more thorough discussion.

Not to be outdone, T-SQL contains a CASE statement as well. The T-SQL CASE statement has two flavors:

  1. A simple case statement, whose syntax and semantics closely mirror those case statements found in other programming languages. The generic form for the simple case statement is:

    CASE expression
      WHEN value1 THEN result1
      WHEN value2 THEN result2
      ...
      WHEN valueN THEN resultN
      
      [
        ELSE elseResult
      ]
    END
    

  2. A "searched CASE form" that instead of specifying a single expression, contains a series of Boolean expressions in each of the WHEN clauses:

    CASE
      WHEN booleanExpression1 THEN result1
      WHEN booleanExpression2 THEN result2
      ...
      WHEN booleanExpressionN THEN resultN
      
      [
        ELSE elseResult
      ]
    END
    

Now that we've looked at the generic syntax for T-SQL CASE statements, let's turn our attention to examining how CASE can be used in real-world applications.

Prettying Up Your Output
The simplest use of CASE is for tweaking the output. For example, imagine you have a table with employee information that contains a char(1) field called Gender, with a value of M for males and F for females. You might want to have the output be Male and Female instead of M and F. This can easily be accomplished with CASE as the following example illustrates:

SELECT
  FirstName, LastName,
  Salary, DOB,
  CASE Gender
    WHEN 'M' THEN 'Male'
    WHEN 'F' THEN 'Female'
  END
FROM Employees

For more examples of enhancing the output with CASE be sure to read The Case for CASE.

Formatting Summation Columns
The SQL syntax offers a number of aggregate functions, such as AVG(), SUM(), MAX(), and so forth, allowing for averages, sums, maxes, minimums, and so on to be grouped by particular field values. Returning to the employees example, if each employee worked for a department specified by a Department field, and if each employee had a Salary field, you could display the average salary and total salary per department using a query like:

SELECT
  Department, AVG(Salary) as AvgSalary
FROM Employees
GROUP BY Department

The output of this query might look like:


Department       AvgSalary
Sales            78,500.00
Marketing        81,250.00
IT               55,000.50
Executive        91,900.75
You might also want to know the average salary for the entire company, across all departments. SQL provides the ROLLUP keyword to summarize aggregated data. For example, adjusting the query by adding WITH ROLLUP, as in:

SELECT
  Department, AVG(Salary) as AvgSalary
FROM Employees
GROUP BY Department
WITH ROLLUP

Might generate the following output:


Department       AvgSalary
Sales            78,500.00
Marketing        81,250.00
IT               55,000.50
Executive        91,900.75
NULL             76,662.81
The problem here is that the last row that provides the summary has a NULL value for the Department field. Ideally, this would have a value like Company Average. Well, SQL provides a GROUPING(columnName) function that returns a 1 when the specified column name is for a summarized row. Using this inside a CASE statement, you could achieve the desired output like so:

SELECT
  CASE 
    WHEN GROUPING(Department) = 1 THEN 'Company Average'
    ELSE Department
  END AS Department,
  AVG(Salary) as AvgSalary
FROM Employees
GROUP BY Department
WITH ROLLUP

For more on this technique, including how to display summarized data within a DataGrid, be sure to read Dave Long's article Summarizing Data with ROLLUP.

Used to Dynamically Sort Query Results
Ideally when creating data-driven Web applications, all SQL statements are encapsulated within stored procedures. When using an ASP.NET DataGrid, you can easily bind the results of a stored procedure to the DataGrid. However, if you want to implement sorting within the DataGrid, you need some means to dynamically order the stored procedure's results. There are a couple of techniques to accomplish this. A common one is to read the stored procedure results into a DataTable, and then use the DataTable's Sort property to sort the results by the specified column. If you want to stay away from DataTables and use just the more efficient DataReaders, you might pine for some means to pass in the column to sort by into the stored procedure. Unfortunately, the following stored procedure is not valid.

CREATE PROCEDURE getEmployees ( @OrderByClause varchar(100) ) AS

SELECT EmployeeID, FirstName, LastName, SSN, Salary
FROM Employees
ORDER BY @OrderByClause

What you can do, however, is use a CASE statement in the ORDER BY clause to specify what column to sort the results by based upon the passed in @OrderByClause parameter. The syntax looks like:

CREATE PROCEDURE getEmployees ( @ColumnName varchar(100) ) AS

SELECT EmployeeID, FirstName, LastName, SSN, Salary
FROM Employees
ORDER BY
  CASE WHEN @ColumnName='LastName' THEN LastName
       WHEN @ColumnName='Salary' THEN CONVERT(varchar(50), Salary)
       WHEN @ColumnName='SSN' THEN SSN
  END

Note that the input parameter is a varchar(100) and can specify one of the column names to sort by. The CASE statement in the ORDER BY clause determines what column to sort by based upon the value passed in.

There are a couple caveats when using this approach for dynamic ordering. First, dynamic ordering doesn't work for mixed data types. That is, if you have fields of differing data types that you want to order by - such as varchars and decimals, as is the case with LastName and Salary above - then in the CASE statement you need to use CONVERT to convert the mismatched data type to the same data type as the others. Note how in the second WHEN line in the example above, Salary is converted to a varchar(50). You'll have to do this with datetime, int and other non-character data types as well.

For a more in-depth look at this technique for dynamically sorting the results of stored procedures, be sure to read: Dynamic ORDER BY Statements in Stored Procedures.

Grouping Results Into Ranges
The CASE statement can also be used to group data into ranges. I recently helped a client create a system for sending invoices, and he needed a report that showed the total amount of money a customer was invoiced in the past 30 days, in the past 30-60 days, in the past 60-90 days, and 90+ days past. This was easily accomplished with a series of CASE statements within SUM() functions like so:

SELECT
  SUM(CASE 
         WHEN DateDiff(d, InvoiceDate, getdate()) BETWEEN 0 AND 29 THEN InvoiceAmount 
      END) as Current,
  SUM(CASE 
         WHEN DateDiff(d, InvoiceDate, getdate()) BETWEEN 30 AND 59 THEN InvoiceAmount 
      END) as ThirtyToSixty,
  SUM(CASE 
         WHEN DateDiff(d, InvoiceDate, getdate()) BETWEEN 60 AND 89 THEN InvoiceAmount 
      END) as SixtyToNinty,
  SUM(CASE 
         WHEN DateDiff(d, InvoiceDate, getdate()) >= 90 THEN InvoiceAmount 
      END) as NintyPlus
FROM Invoices      

Using CASE in Computed Columns
With SQL Server you can create computed columns that are comprised of data from other columns in the table. For example, in a table storing medical information for patients, there might be Weight and Height fields, with a computed column for the BMI. (BMI, or Body Mass Index, is a ratio of a person's height and weight, and is a figure that is commonly used to ascertain if a patient is underweight, normal weight, overweight, or obese.)

When creating a computed column you can use a CASE statement in the computation. For example, imagine that you had a table with information about classes offered by a local college. This table might have two int fields that recorded the number of students and professors participating in the class (call these fields Students and Professors). Now, you might want to add a computed field that returned the ratio of students to professors, providing the formula Students / Professors. The problem here is that if there is a class with no professors, the computed column will barf since division by 0 is not allowed. To surmount this problem, you could use a computed column formula with a CASE statement that returned NULL (or some predefined value, such as 0) if Professors equaled 0. That is, you'd have your formula become:

CASE
  WHEN Faculty = 0 THEN NULL
  ELSE Students / Faculty
END

For more information on this technique, check out this blog entry: Computed Columns and Divide By Zero Errors.

Conclusion
In this article we looked at SQL's CASE statement, a useful little bugger that, sadly, doesn't get the press he deserves. CASE statements can be used to format the output of your SQL queries, to group data into various ranges, to allow for dynamic ordering of SQL results in a stored procedure, used as conditionals in computed columns, and used to format summation columns.

Happy Programming!

  • By Scott Mitchell

  • Software Developer / Programmer - Distributed Systems (NYC)
    Next Step Systems
    US-NY-New York

    Justtechjobs.com Post A Job | Post A Resume


    ASP.NET [1.x] [2.0] | ASPMessageboard.com | ASPFAQs.com | Advertise | Feedback | Author an Article