Published: Wednesday, October 21, 2009

Using SqlBulkCopy To Perform Efficient Bulk SQL Operations

By Scott Mitchell


Introduction


Over the years I've worked on a number of projects that have shared a common requirement - the ability for users to quickly import large amounts of data into a back-end SQL Server database. One such project was a web application used by teachers and other staff members. The software was initially purchased for just two schools in the district, but was soon expanded to encompass other schools. Every few months one or two new schools were brought into the fold; every time a new school was added, an administrative user had to create accounts for the new teachers and staff members so that they could sign in to the site. Initially, the application offered a web page for the administrator to create new user accounts one at a time, but this interface quickly became tedious and impractical once larger schools with upwards of 100 users were brought online.

To allow school administrators to quickly import new users, we created a web page from which a user could upload an Excel spreadsheet that contained one row for every new user; the columns in the spreadsheet mapped to table columns in the database. After the upload, the application would walk through each row and insert a record into the table. Through this mechanism an administrator could create 100+ user accounts by first building an Excel spreadsheet (something many of the schools already had on file) and then uploading that spreadsheet. This technique may sound familiar - in Importing an Excel Spreadsheet Using Typed DataSets and TableAdapters, author Nannette Thacker walked through building such an interface.

While the described approach works well when importing hundreds of records, it starts to seriously slow down when importing thousands or tens of thousands of records. The slowdown stems from the fact that each imported record sends its own INSERT statement to the database, which results in a lot of "chatter" between the web server and the database server. If you are importing data into Microsoft SQL Server, the good news is that this process can be dramatically sped up using ADO.NET's SqlBulkCopy class. In my testing, importing 10,000 records using one INSERT statement per record took more than three seconds to complete; using SqlBulkCopy took a fraction of a second.

This article looks at how to use the SqlBulkCopy class to efficiently execute bulk operations against Microsoft SQL Server. Read on to learn more!


An Overview of SqlBulkCopy and the Demo Application


SqlBulkCopy is a class that was added to the .NET Framework version 2.0 (in the System.Data.SqlClient namespace). It provides an efficient means to import data into a Microsoft SQL Server database. Before using the SqlBulkCopy class, you first need to get the data to import into Microsoft SQL Server. The import data can come from virtually any data source, as SqlBulkCopy can import data from a DataReader, an array of DataRows, or a DataTable. Once the import data is ready, you create an instance of the SqlBulkCopy class and specify the destination database and the table to import the data into. You can optionally specify column mappings - indicating which columns in the source data map to which columns in the destination table - along with other options. Finally, to perform the import, call the SqlBulkCopy object's WriteToServer method, passing in the data to import.

The download available at the end of this article includes a demo that imports data from an uploaded Excel spreadsheet into a table in a Microsoft SQL Server database. The demo aims to mimic an inventory tracking web application that offers users the ability to import data into the system by uploading an Excel spreadsheet. The SQL Server database, which is located in the App_Data folder, contains a single table named InventoryItems with the following schema:

Column            Type            Notes
InventoryItemID   int             Primary key; IDENTITY column
ItemID            varchar(50)
ItemName          nvarchar(100)
UnitCost          money
UnitPrice         money
DateAdded         datetime        Has a default value: getdate()

The Excel spreadsheet includes four columns that specify the values that will be imported into the ItemID, ItemName, UnitCost, and UnitPrice columns; the InventoryItemID and DateAdded values will be automatically assigned by the database.
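For reference, a table with this schema could be created with T-SQL along these lines. (This is a sketch based on the schema above; the actual DDL in the demo's database, such as constraint names and nullability, may differ.)

CREATE TABLE InventoryItems
(
   InventoryItemID int IDENTITY(1,1) PRIMARY KEY,  -- auto-assigned by the database
   ItemID varchar(50),
   ItemName nvarchar(100),
   UnitCost money,
   UnitPrice money,
   DateAdded datetime DEFAULT getdate()            -- auto-assigned by the database
)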


Preparing the Excel Spreadsheet Data


The demo available for download includes a web page (Default.aspx) from which a user can upload an Excel spreadsheet. On upload, the Excel spreadsheet is saved to the web server's file system. Next, the spreadsheet's contents are read into a DataTable named excelData, and this DataTable is what will eventually be passed into the SqlBulkCopy object's WriteToServer method to perform the import.

The following code shows how to connect to the Excel spreadsheet and read its contents into a DataTable. The variable uploadFileName contains the physical path to the location where the just-uploaded Excel spreadsheet was saved on the web server's file system. For more information on querying Excel spreadsheets from .NET code, see Importing an Excel Spreadsheet Using Typed DataSets and TableAdapters.

'Requires Imports System.Data and Imports System.Data.OleDb
Dim connectionString As String = "Provider=Microsoft.Jet.OLEDB.4.0;" & _
   "Data Source=" & uploadFileName & ";" & _
   "Extended Properties=Excel 8.0;"

Dim excelData As New DataTable

Using myConnection As New OleDbConnection(connectionString)
   'Get all data from the InventoryData worksheet
   Dim myCommand As New OleDbCommand
   myCommand.CommandText = "SELECT * FROM [InventoryData$]"
   myCommand.Connection = myConnection

   'Read data into DataTable (Fill opens and closes the connection as needed)
   Dim myAdapter As New OleDbDataAdapter
   myAdapter.SelectCommand = myCommand
   myAdapter.Fill(excelData)
End Using

Note: The connection string used in this example works for Excel spreadsheets saved in the Excel 97-2003 format (.xls). See Connection Strings for Excel 2007 to learn how to connect to an Excel 2007 document.
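For example, assuming the Microsoft ACE OLE DB provider (Microsoft.ACE.OLEDB.12.0) is installed on the web server, a connection string for an Excel 2007 (.xlsx) file would look something like the following. (The provider name and Extended Properties value are the commonly documented ones; verify them against your environment.)

'Assumes the Microsoft ACE OLE DB 12.0 provider is installed on the server
Dim connectionString As String = "Provider=Microsoft.ACE.OLEDB.12.0;" & _
   "Data Source=" & uploadFileName & ";" & _
   "Extended Properties=""Excel 12.0 Xml"";"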

Creating, Configuring, and Importing Data With The SqlBulkCopy Object


To use the SqlBulkCopy class you need to supply it information about the database connection to use. This can be in the form of a SqlConnection object or as a connection string. You also need to specify the name of the destination database table. Once these required bits of information have been set, you can import the data by calling the WriteToServer method and passing in the source data.

The following snippet shows how to import data using the SqlBulkCopy object when using an explicit SqlConnection object:

Using destinationConnection As New SqlConnection(connectionString)
   destinationConnection.Open()

   Using bulkCopy As New SqlBulkCopy(destinationConnection)
      bulkCopy.DestinationTableName = "InventoryItems"

      bulkCopy.WriteToServer(excelData)
   End Using

   destinationConnection.Close()
End Using

The first line of code in the above snippet creates a SqlConnection object named destinationConnection. Next, a SqlBulkCopy object is created and the destination SqlConnection object is passed in through the constructor. Following that, the destination database table name is specified via the SqlBulkCopy object's DestinationTableName property. Finally, the data is written to the database via the WriteToServer method. (Recall that excelData is a DataTable that contains the data from the uploaded Excel spreadsheet.)

Alternatively, you can omit instantiating the SqlConnection object and instead create just a SqlBulkCopy object, passing the connection string of the destination database into its constructor. In that case the SqlBulkCopy object opens and closes the connection itself when WriteToServer is called:

Using bulkCopy As New SqlBulkCopy(connectionString)
   bulkCopy.DestinationTableName = "InventoryItems"

   bulkCopy.WriteToServer(excelData)
End Using

Specifying Column Mappings


While the above code compiles without error, you'll get a runtime error when uploading the Excel spreadsheet. The reason is that, by default, SqlBulkCopy performs the import by mapping the first column of the source data to the first column of the destination table, the second column of the source data to the second column of the destination table, and so on. It does not perform the mapping by column name.

Our Excel spreadsheet has four columns, the first being ItemID, the second ItemName, and so forth. However, the destination database table (InventoryItems) has five columns, with its first column being an IDENTITY column (InventoryItemID). Consequently, the above code will attempt to insert the Excel spreadsheet's ItemID values into the InventoryItemID column in the database, the Excel spreadsheet's ItemName values into the table's ItemID column, and so on.

In cases where the column positions in the source and destination do not match up precisely, you can specify the column mappings programmatically via the SqlBulkCopy object's ColumnMappings collection. Such mismatches can occur if the destination table includes an IDENTITY column or columns with default values that are not specified in the source data. Even if the source and destination include the exact same set of columns, you will need to specify column mappings if their order differs. For example, if the Excel spreadsheet's columns were ordered ItemID, ItemName, UnitPrice, UnitCost, but the InventoryItems table's columns were ordered ItemName, ItemID, UnitCost, UnitPrice, you would need to define column mappings to correctly align the import. Failing to do so would put the source's ItemID values in the destination's ItemName column, the ItemName values in the ItemID column, the UnitPrice values in the UnitCost column, and the UnitCost values in the UnitPrice column. In short, it would be a mess.

Let's enhance the above snippet of code to map each column in the source data to the column with the same name in the destination table (regardless of the columns' ordinal positions):

Using bulkCopy As New SqlBulkCopy(connectionString)
   bulkCopy.DestinationTableName = "InventoryItems"

   'Map each source column to the destination column with the same name
   For Each col As DataColumn In excelData.Columns
      bulkCopy.ColumnMappings.Add(col.ColumnName, col.ColumnName)
   Next

   bulkCopy.WriteToServer(excelData)
End Using

With the three lines of code added above, each column in the Excel spreadsheet is mapped to the column in the destination table with the same name. Because the Excel spreadsheet does not contain an InventoryItemID column, the database system will auto-generate the value like it would with a normal INSERT statement.
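Note that the two arguments to ColumnMappings.Add are the source column name and the destination column name; they do not have to match. If the spreadsheet used different column headings than the database table, each column could be mapped explicitly. (The source column names below - ProductCode, ProductName, Cost, and Price - are hypothetical, used purely for illustration.)

Using bulkCopy As New SqlBulkCopy(connectionString)
   bulkCopy.DestinationTableName = "InventoryItems"

   'Explicitly map differently-named source columns (hypothetical names)
   'to the destination table's columns
   bulkCopy.ColumnMappings.Add("ProductCode", "ItemID")
   bulkCopy.ColumnMappings.Add("ProductName", "ItemName")
   bulkCopy.ColumnMappings.Add("Cost", "UnitCost")
   bulkCopy.ColumnMappings.Add("Price", "UnitPrice")

   bulkCopy.WriteToServer(excelData)
End Using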

Comparing SqlBulkCopy to One INSERT Statement Per Import Record


Programmatically importing data from a source data store to a destination data store typically involves looping through the source data one record at a time and, for each import record, executing an INSERT statement against the destination data store. As we've seen in this article, when importing data into a Microsoft SQL Server database an alternative is to use the SqlBulkCopy object, which has two advantages:
  • Using SqlBulkCopy requires far less code. Our final snippet is seven lines of code long (excluding comments and white space). The code for inserting one record at a time will likely be at least twice as many lines long (see the sketch following this list).
  • SqlBulkCopy is much faster because it is less "chatty" with the database server.
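For comparison, here is a minimal sketch of the row-at-a-time approach, reusing the excelData DataTable and connectionString from the earlier snippets. (The actual test page in the download may be structured differently.)

Using myConnection As New SqlConnection(connectionString)
   myConnection.Open()

   Dim myCommand As New SqlCommand( _
      "INSERT INTO InventoryItems (ItemID, ItemName, UnitCost, UnitPrice) " & _
      "VALUES (@ItemID, @ItemName, @UnitCost, @UnitPrice)", myConnection)

   'Execute one parameterized INSERT statement per source row
   For Each row As DataRow In excelData.Rows
      myCommand.Parameters.Clear()
      myCommand.Parameters.AddWithValue("@ItemID", row("ItemID"))
      myCommand.Parameters.AddWithValue("@ItemName", row("ItemName"))
      myCommand.Parameters.AddWithValue("@UnitCost", row("UnitCost"))
      myCommand.Parameters.AddWithValue("@UnitPrice", row("UnitPrice"))
      myCommand.ExecuteNonQuery()
   Next
End Using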
To get a feel for the performance difference between SqlBulkCopy and inserting one record at a time, I created a simple, highly unscientific test case. Namely, I created a web page that allows the visitor to choose how the data should be inserted - via SqlBulkCopy or through a loop where an INSERT statement is executed once for each source record. I then uploaded Excel spreadsheets with 10, 1,000, and 10,000 records and measured the running time of each approach. The times posted below are in milliseconds (1,000 ms = 1 second). (This test page and the test data are included in the download at the end of this article.)

                               10 Records   1,000 Records   10,000 Records
SqlBulkCopy                    4 ms         16 ms           180 ms
One INSERT per import record   5 ms         333 ms          2,978 ms

When importing 10,000 records SqlBulkCopy was an order of magnitude faster than running one INSERT statement per source record. I would expect that this performance gap would continue to grow along with the number of rows being imported.

Looking Forward...


One thing we didn't touch on in this article is SqlBulkCopy and transactions. When importing data, what do you want to do if there is an error after, say, half of the records have been added? Do you want to keep those just-inserted records or roll back the entire operation? This topic, as well as more advanced SqlBulkCopy configuration options, is examined in another article: Using Transactions with SqlBulkCopy.

Happy Programming!

Attachments:

  • Download the code for this article

Further Reading:

  • SqlBulkCopy Class (technical docs)
  • Bulk Copy Operations in SQL Server
  • Using Transactions with SqlBulkCopy

