Retrieving the First N Records from a SQL Query
By Scott Mitchell
| Ranking Results in SQL Server 2005 |
|---|
Microsoft's SQL Server 2005 introduces a number of new T-SQL ranking functions, ROW_NUMBER, RANK, DENSE_RANK, and NTILE.
These ranking functions are discussed in detail in Returning Ranked Results with Microsoft SQL Server 2005.
|
Introduction
There are often times when retrieving results from a database that you only are interested in displaying the first
N records from the resulting query. For example, a news website might have a database table that contains
a record for each story. On the news website's homepage, they may want to display the 10 most recent stories from
this table. Doing a SELECT * FROM NewsStoriesTable SQL query will return all of the records from
that table, no just the 10 most recent ones. So how do they display just the 10 most recent?
There are a couple ways to display the first N records from a query. The most naive way is to simply return
all of the records to the web page and then use programmatic logic to only show the first N. This is a poor approach
because it requires that the database query all records from the table and transfer all of the data from the database to
the web page when only a small subset of the total results will even be used. This approach's performance will worsen as
the NewsStoriesTable grows over time.
A better approach is to use SQL Server's ROWCOUNT and/or TOP keywords. These two keywords actually
limit the results SQL server queries and returns, thus reducing the burden both on retrieving the results from the database
and returning them to the web page. In this article we'll look at both ROWCOUNT and TOP and see
when to use each one. Read on to learn more!
Limiting Records with TOP
The TOP keyword was introduced with SQL Server 7.0 and provides a mean to limiting the results of a SQL SELECT query.
With TOP you can specify to retrieve the first N records or the first X percent of records.
The syntax for TOP is as follows:
|
Here N and X must be positive integer values; if you are using the second pattern, X must be between 0 and 100, inclusive.
Using TOP will access just the first N or X percent of records. Commonly such queries are accompanied
by ORDER BY clauses. For example, in the example discussed earlier with a news website, to return the 10 most
recent news articles you'd need to use an ORDER BY clause to return the results sorted by the
PublicationDate field in descending order. From this, you'd want just the top 10 results:
|
Limiting Records with ROWCOUNT
Prior to SQL Server 7.0, the only way to limit results from a SQL query was to use the ROWCOUNT variable. With
ROWCOUNT you'd proceed the actual SQL statement with a line like:
SET ROWCOUNT N
|
After which you could have your SQL statement. By setting ROWCOUNT to some number greater than zero, all
subsequent SQL statements in the scope would process only the first N records.
ROWCOUNT still works with modern versions of SQL Server, and does have its time and place. Keep in mind, though,
that ROWCOUNT applies not only to SELECT statements, but also to INSERT, UPDATE,
and DELETE statements as well. Additionally, ROWCOUNT's effects apply to triggers that may be
firing as well.
Here's a simple example of using ROWCOUNT to retrieve the 10 most recent articles from the NewsStoriesTable
table:
|
When to Use TOP vs. When to Use ROWCOUNT
So what approach should you use, TOP or ROWCOUNT? If you are using SQL Server 6.5 or earlier, you'll
have to use ROWCOUNT, as TOP wasn't introduced until SQL Server version 7.0. Of course, unless you
are supporting a legacy system, chances are you are using at least SQL Server 7.0, if not SQL Server 2000 or SQL Server 2005.
In SQL Server 2005, TOP can do everything ROWCOUNT can. The TOP value can be a parameter,
it can be applied to INSERT, UPDATE, and DELETE statements. In his book
SQL Server 2005 T-SQL Recipes, Joseph Sack
advises:
"In previous versions of SQL Server, developers usedSET ROWCOUNTto limit how many rows the query would return or impact. In SQL Server 2005, you should use theTOPkeyword instead ofSET ROWCOUNTas theTOPwill usually perform faster."
For some examples of using TOP in SQL Server 2005, see "TOP"
Clause in SQL Server 2005.
In SQL Server 7.0 and 2000, TOP does not have all the functionality SET ROWCOUNT offers
(i.e., applying to non-SELECT statements, effecting triggers, etc.). But TOP works in a more
straightforward manner, in my opinion. For example, with simple SELECTs
you won't have any issues using SET ROWCOUNT, but if you are calling a stored procedure that does a SELECT
but also processes other statements, with SET ROWCOUNT
you'll need to be more careful to ensure that your SET ROWCOUNT N statement applies only to those queries
intended.
With SQL Server 7.0 and 2000 there is one case where ROWCOUNT may be preferred in a SELECT query: when
you want to let the web page
developer specify how many results to return. For example, imagine you have a stored procedure named getRecentArticles
that takes in as an integer input parameter named @ResultCount. Now, the stored procedure should return
only the @ResultCount most recent articles (rather than always returning the 10 most recent). This allows
the page developer to indicate if she wants to get back the 10 most recent articles or, say, the 25 most recent articles.
Say we have the following:
|
This can be accomplished in SQL Server 7.0 and SQL Server 2000 using TOP only if you use dynamic SQL in your
stored procedure. That is, you cannot do the following:
|
Instead you will need to dynamically craft a string in the stored procedure and then execute it using
EXEC or sp_executesql.
This approach, while possible, tends to lead to messy and less maintainable SQL stored procedures, in my opinion. (If you are interested in learning
more on using dynamic SQL in your stored procedures be sure to read my earlier article, Using Dynamic
SQL Statements in Stored Procedures.)
With ROWCOUNT, however, you can just slap in the @ResultCount input parameter like so:
|
Finally, if you are wanting to limit the results performed by an INSERT, UPDATE, or DELETE,
you'll have to use ROWCOUNT as TOP only applies to SELECT statements.
Conclusion
In this article we examined how to return the first N records from a SQL query. Specifically we looked at two
SQL Server keywords, TOP and ROWCOUNT, examining sample SQL syntax for each. Finally, we compared
and contrasted the two, looking at when to TOP versus when to use ROWCOUNT.
Happy Programming!




