Using Microsoft's Chart Controls In An ASP.NET Application: Adding Statistical FormulasBy Scott Mitchell
The Microsoft Chart controls make it easy to take data from a database or some other data store and present it as a chart. As discussed in Plotting Chart Data, the Chart controls offer a myriad of ways to get data into a chart. You can add the data programmatically, point-by-point, or you can bind an ADO.NET
DataTabledirectly to the Chart. You can even use declarative data source controls, like the SqlDataSource or ObjectDataSource controls.
In addition to converting your specified data points into a chart image, the Chart controls also include a wealth of statistical formulae that you can use to analyze the plotted data. For example, with a single line of code you determine the mean (average) value for data in a particular series. Likewise, with one line of code you can get the median, variance, or standard deviation. These values can be displayed as text on the page or as a stripe line on the chart itself. What's more, the Chart controls include functions to forecast future values, to compute moving averages, to identify trends, and to determine rates of change, among others.
This article looks at how to use two statistical formulae. Specifically, we'll look at how to compute and display the mean of a series, as well as how to display an exponential trend line on the chart to forecast future values. Read on to learn more!
Computing the Mean Value of a Series of Data Points
Given a list of numbers, the arithmetic mean is the sum of those numbers divided by the total quantity of numbers in the list. The mean is more colloquially referred to as the "average," and is a common metric used to analyze data.
A chart consists of one or more series, which are lists of data points. Typically, each data point is a pair of numbers that provide both the X value and Y value to be plotted. For some charts it may be helpful to know the average value of the data points plotted in a series. Consider a chart that shows gross sales by month for a particular product. In this case the X axis would list off the months (January, February, ...) while the Y axis would show the gross sales per month. In such a chart it might be helpful to know the mean (average) gross sales per month.
You could certainly compute the mean gross sales per month with a few lines of code that loop through the data points, sum up the values, and then divide by the count of data points. There's no need to write that code, however, because the Microsoft Chart controls provide a built-in function for computing the mean of a series:
The demo available for download at the end of this article includes a page named
MeanSales.aspx, which displays a line chart showing monthly sales for the
products in a specific category for a specific year along with the mean monthly sales. The mean monthly sales are displayed in two ways: as text and as a stripe line
on the chart itself. The screen shot below shows the monthly sales for those products in the Produce category for 1997. Note that the average sales per month - $4,809.88 - is
shown both as text on the page and as an orange line on the chart.
To create this chart I started by adding two DropDownList controls to the page named
is bound to a SqlDataSource control that populates it with the categories in the
Categories database table. The
ddlForYear contains three
- 1996, 1997, and 1998 - one for each year for which sales data exist. I also added a Label control (
lblAverageSales) to display the mean sales per month as text.
Next, I added the Chart control to the page named
chtCategorySales and bound it to a SqlDataSource control named
dsCategorySales. The SqlDataSource control
executes the following query, which returns each month with sales for products in a particular category for a particular year.
The Chart control contains a single line chart series named
SalesByMonth, which has its
YValueMembers properties set to
the names of the columns returned from the database query (
Total, respectively). The Chart control's declarative markup also defines a chart
MainChartArea) with some Y axis settings. Namely, I configured the Y axis to format its labels as currency values (with no decimal places) and to add an
orange stripe line. A stripe line is a line (or band) that can be added to the X or Y axis and used to show a particular value or range of values. In this case, I use the
stripe line to show the average sales per month. The position of the stripe line is based on its
IntervalOffset property, which is programmatically set to the mean value
(we'll explore this code momentarily).
The SqlDataSource plus the Chart's declarative markup is sufficient to display a chart of the gross sales per month for the selected category and year. In order to show the
mean sales per month we need to write a few lines of code. Keep in mind that we cannot calculate the mean until the series is populated with data. When using a Chart that
is bound to a data source control you know that the data has been plotted once the
DataBound event is raised. Therefore, I've created an event handler for this
event and placed the code to compute and display the mean there. (If you are adding the points programmatically rather than from a declarative data source control then you
would add this code immediately after the code that plots your data. The forecasting demo we'll look at shortly provides such an example.)
Here we calculate the mean, display it in the
lblAverageSales Label control, and assign it to the stripe line's
IntervalOffset property. The net effect
is that when the chart is plotted the mean sales per month is calculated and displayed both as text on the page and as a stripe line on the chart.
Using Forecasting Statistical Formulae
There are a variety of statistical formulae that can be used for forecasting. In brief, these formulae typically work by arriving at an equation whose line closely models the existing, known data points. This equation can then be extended to forecast future values. To apply a forecasting formula you must supply the following inputs:
- The forecasting parameters, which must be formatted into a comma-delimited string. There are four optional parameters:
- Regression Type - determines the type of equation to fit against the known points. You can specify a a numeric value to indicate the polynomial regression,
or you may use one of the following regression types:
- Period - the number of time units in the future to forecast. For example, to forecast three months in the future for monthly sales you would use a value of 3 here.
- Approximate Error - a Boolean value that indicates whether to output the approximation error.
- Forecast Error - a Boolean value that indicates whether to output the forecasting error.
- Regression Type - determines the type of equation to fit against the known points. You can specify a a numeric value to indicate the polynomial regression, or you may use one of the following regression types:
- The input series, namely the data for which the forecasts are being made.
- The output series. If you are outputting the approximate or forecasting errors then you will need to have a series for both the forecast and the errors.
The demo available for download at the end of this article includes a page named
Trendline.aspx, which uses an exponential regression to predict sales figures
three months into the future. The following screen shot shows the sales figures for Dairy Products sales in 1997 and the forecast for the first quarter of 1998.
Things are looking up!
Like the mean sales demo, to create the
Trendline.aspx I started by adding and configuring two DropDownList controls to the page to capture the category and year
used to power the report. I then added the Chart control to the page. Rather than use a SqlDataSource control to declaratively bind the data to the chart, I instead decided
to bind the data to the chart programmatically. (Using a SqlDataSource here certainly would have worked just fine.) Before we look at the code, let's take a peak at the
Chart control's declarative markup. Recall that in the previous example we added a stripe line to the Y axis to show the mean sales per month. We don't need a stripe line for
this chart, but we do need to define two series - one for the sales per month (
SalesByMonth) and another for the forecasting trend line (
If you want to include the approximate error or forecasting error you'd need to include series for these (most likely Range series).
The sales per month data is actually added to the chart from code in the
Page_Load event handler. Specifically, a connection is made to the database, an ad-hoc
query is executed that gets the sales data, and the results are enumerated, with each record returned from the database added to the
After the data has been plotted, an Exponential regression is used to add the forecast data to the
An abbreviated version of the
Page_Load event handler code follows:
The code that actually queries the database and plots the points one at a time has been removed for brevity. Download the demo to see the full code; refer to Plotting Chart Data for detailed examples on adding chart data point-by-point.
The red code in the snippet above adds the forecasting trend line to the chart's
TrendLine series, forecasting sales per month
three months into the future.
In addition to merely displaying plotted data, the Microsoft Chart controls include functionality for analyzing that data using a variety of statistical formulae. This article explored how to use this functionality to compute the mean (average) value in a series and how to add a trend line to forecast future results. These two examples just scratch the surface of the statistical capabilities of the Chart control - there are many more formulae available, including functions to compute moving and weighted moving averages, ranges, statistical distributions, and more.