Efficiently Displaying Parent-Child Data
By Scott Mitchell
Introduction
Recently a friend approached me with a particular problem he was having. He had a database table
called Categories
which had a list of categories to various sections of his company's
intranet. The Categories
table had the following makeup:
Categories | ||
---|---|---|
CategoryID | int | The Primary Key |
Name | varchar(50) | The name of the category |
Link | varchar(100) | A URL to the category on the intranet |
ParentID | int | A Foreign Key on the Categories table |
The ParentID
serves as a self-referential foreign key. That is, each category may have
subcategories, which may have subcategories themselves, and so on. Those categories with no
parents would have a value of 0 for the ParentID
field.
For example, imagine that the intranet had three categories, "Payroll," "HR Info," and "Press Clippings." The "Payroll" might have the subcategory "Submit your Hours"; the "HR Info" category may have subcategories "Benefits," and "Latest News," with the "Benefits" category having subcategories "Insurance" and "401(k) Info"; finally, the "Press Clippings" category might have the subcategory "Stock Quotes".
Now, what my friend wanted to accomplish was to display the list of categories as hyperlinks in a hierarchical format. That is, on the left hand side of every page on the intranet, he wanted to ahve something like:
Payroll Submit Your Hours HR Info Benefits Insurance 401(k) Info Latest News Press Clippings Stock Quotes
Where each of the categories above linked to the appropriate intranet URL.
My Friend's Intial Approach
My friend's initial solution to this problem utilized the following steps:
- Issue a SQL query that gets all categories whose
ParentID
is 0. This will get all top level categories. Store these database results in a Recordset namedTopLevelRS
- Begin iterating through the
TopLevelRS
Recordset. For each record in the recordset, extract theCategoryID
value and issue a SQL statement that gets all records whoseParentID
equals the current record'sCategoryID
. Call this recordsetSecondLevelRS
- Begin iterating through the
SecondLevelRS
Recordset, applying the same logic as in Step (2).
This approach, as my friend found, was horrendously slow. The reason it was slow is because for the above category structure, involves eight separate SQL queries, which involves eight accesses to the database! On every page! In addition to its slow execution time, the script is limited in that it can only support subcategories up to two links deep. If one wants to add subcategories to the "Insurance" category, for instance, the code will need to be revisited.
Given this code and scenario, my friend approached me and asked if I had any advice on how to speed up the query. Having worked on customizing ASPMessageboard.com and starting the project that eventually became the ASP.NET Forums, I've had a bit of messageboard programming experience over the past few years. In threaded messageboards, there is often this exact scenario: displaying parent-child related data efficiently.
A Better Algorithm
An algorithm that requires only one database access and can display nested subcategories up to
any depth is possible. The reason the algorithm incurs only one database hit is because we read the
entire Categories
table into
a Recordset and then, using GetRows(),
convert the Recordset into a two-dimensional array. Once we have this array implementation of our data,
we can pass the array to the DisplayCategories
function (code shown below).
|
The way the DisplayCategories
function works is by a means called recursion. Essentially,
recusion occurs when a function calls itself. Generally speaking, the DisplayCategories
function
will print out all of the categories that have a ParentID
value equal to the input
parameter iCurID
, along with each such categories' subcategories.
Specifically, the DisplayCategories
function
works by iterating through the rows of the passed-in array (aCategories
). For each row,
it checks to see if the ParentID
column of the current row equals the value of the
iCurID
parameter. If it is, then we want to display this particular category and then
recursively display the categories whose ParentID
is the CategoryID
of the row we just emitted. This recursive step will allow for subcategories of up to any length to
be displayed in their proper order. (In computer scientist terms, this is known as a
"depth-first search.")
(For more information on recursion be sure to read: Recusrion: Why
It's Cool!)
Converting this Code to use in an ASP.NET Web Page |
---|
To convert this code to use in an ASP.NET Web page, you could populate a DataSet with the contents of
your SQL query and then programmatically iterate through its rows and columns. For example, to programmatically
iterate through a DataSet, you could do:
Dim myDataSet as DataSet '... Assume the DataSet is populated here ... Dim i as Integer For i = 0 to myDataSet.Tables(0).Rows.Count - 1 ' ... to access column "foo" from the current row, use: ' myDataSet.Tables(0).Rows(i)("foo") Next i |
Conclusion
The performance difference between the old and new algorithm is quite profound. The old algorithm
required several database accesses, and multiple Recordset object instantiations. Furthermore, it was
limited on the depth of subcategories. The new algorithm, however, requires only one database
access and can support subcategories up to any arbitrary level. To see the new algorithm is action
be sure to view the live demo!
Happy Programming!