When you think ASP, think...
Recent Articles
All Articles
ASP.NET Articles
ASPFAQs.com
Message Board
Related Web Technologies
User Tips!
Coding Tips

Sections:
Sample Chapters
Commonly Asked Message Board Questions
JavaScript Tutorials
MSDN Communities Hub
Official Docs
Security
Stump the SQL Guru!
XML Info
Information:
Feedback
Author an Article
ASP ASP.NET ASP FAQs Message Board Feedback
Print this page.
Published: Saturday, January 12, 2002

Using the Group By Clause

By Scott Mitchell


Introduction
While most developers are familiar with the basic SQL syntax for constructing SELECT statements, many folks have rarely, if ever, used one of the more powerful clauses of the SELECT statement: Group By. I think the reason of this lack of use of this clause is two-fold:

    - continued -

  1. You don't really need to ever use the Group By clause of the SELECT statement. You could simply do some Recordset processing in the ASP page to simulate these results.
  2. The Group By clause doesn't get much press in Web sites, books, etc.

This article examines what these two clauses do and presents a number of common queries that become rather simple with the use of Group By.

What Does Group By Do?
SQL Server, if you knew it or not, supports what are called aggregate functions. These functions - SUM, COUNT, AVG, MAX, MIN, etc. - can appear in the SELECT clause. For example, if we had an Employee database with the following tables:

Employee
EmployeeIDint, PK
Namevarchar(100)
DepartmentIDint, FK
Salarymoney
Department
DepartmentIDint, PK
Namevarchar(100)
Descriptiontext

Note that in this data model there would be a one to many relationship between Department.DepartmentID and Employee.DepartmentID; this foreign key identifies what department each employee belongs to (departments might have values such as Executive, IT, Sales, Marketing, Engineering, etc.).

Now, imagine that we wanted to view the average salary for our employees. We could accomplish this using the AVE aggregate function like so:

SELECT AVG(Salary) AS AverageSalary
FROM Employees

This would return a Recordset with one row and one column, this single cell containing the average salaries of our employees. In an ASP page you could access this value from the Recordset using: objRS("AverageSalary").

Now imagine that we wanted to get the average salary for each department. Intuitively we may decide to write our SQL query as follows:

SELECT DepartmentID, AVG(Salary) AS AverageSalary
FROM Employees

However, such a query will throw an error. This error is thrown because the SQL query processor can't retrive the value for a single row (DepartmentID) at the same time it is computing an aggregate value (AVG(Salary)) unless you use the Group By clause. Essentially, when you leave off the Group By clause the aggregate function is applies to all of the rows in the Employees table that match the WHERE clause conditions. If you wish to apply the aggregate function to subsets of the rows of the Employees table, though, you must provide information on how to partition the table. Specifically, we instruct SQL Server to partition up the results of the Employees table into sets determined by the value of DepartmentID.

Now that you're thoroughly confused, let's look at an example, to help clear things up. Imagine our Employees table had the following data:

Data from Employees Table
EmployeeIDNameDepartmentIDSalary
1Scott Mitchell175,000
2Tina Lee165,000
3Todd Maher5175,000
4Kevin Joy150,000
5James Miller345,000
6Justin Casey360,000

Now, if we apply the followign SQL statement:

SELECT AVG(Salary) AS AverageSalary
FROM Employees

We are asking SQL to find the average of all the salaries in the Employees table. (Of course we could limit this via a WHERE clause, such as finding the average salary for employees that made more than $50,000.) Essentially SQL iterates through our set of rows (the rows of Employees and sums up the salaries, dividing the by the total number of salaries summed up. This would result in a value of $78,333.33, the average salary.

Now imagine that we wanted to find the average salary of each department. That is, we want to find the average salary of those in department 1 (Scott, Tina, and Kevin), the average salary of those in department 3 (James and Justin) and the average salary of those in department 5 (Todd). To accomplish this, we need to have the set of rows returned by the Employees table partitioned by DepartmentID. By using the following SQL statement:

SELECT DepartmentID, AVG(Salary) AS AverageSalary
FROM Employees GROUP BY DepartmentID

We essentially break the Employees table into three smaller tables: one containing those who are in department 1; another containing those in department 3; and a third one containing those in department 5. So the returned results will each department listed along with the average salary in the department. The return result will be:

DepartmentIDAverageSalary
1$63,333.33
3$52,500.00
5$175,000.00

Summary
To summarize, the Group By clause partitions up a table by the columns specified; these partitions then have the aggregate function applied to each of them separately. The Group By clause can be very useful especially in data models where there are a number of related tables with one-to-many or many-to-many relationships between them. The Group By clause allows you to apply aggregate functions over partitions of these related tables. For more information on the Group By clause be sure to check out the SQL Books Online.

Happy Programming!

  • By Scott Mitchell


  • ASP.NET [1.x] [2.0] | ASPMessageboard.com | ASPFAQs.com | Advertise | Feedback | Author an Article