Using SqlBulkCopy To Perform Efficient Bulk SQL Operations
By Scott Mitchell
Introduction
Over the years I've worked on a number of projects that have shared a common requirement - the ability for users to quickly import large amounts of data into a back end SQL Server database. One such project was a web application used by teachers and other staff members. The software was initially purchased for just two schools in the district, but was soon expanded to encompass other schools. Every few months one or two new schools were brought into the fold; every time a new school was added an administrative user would have to create accounts for the new teachers and staff members so that they could sign into the site. Initially, the application offered a web page for the administrator to create new user accounts one at a time, but this interface quickly became tedious and impractical once larger schools with upwards of 100 users were brought online.
To allow school administrators to quickly import new users we created a web page from which a user could upload an Excel spreadsheet that contained the one row for every new user; the columns in the spreadsheet mapped to table columns in the database. After uploading this spreadsheet, the application would walk through each row and insert a record into the table. Through this mechanism an administrator could create the 100+ user accounts by first building an Excel spreadsheet (something many of the schools already had on file) and then upload that spreadsheet. This technique may sound familiar - in Importing an Excel Spreadsheet Using Typed DataSets and TableAdapters, author Nannette Thacker walked through building such an interface.
While the described approach works well when importing hundreds of records, it starts to seriously slow down when importing thousands or tens of thousands of records.
The slowdown is due to the fact that each imported record sends its own INSERT statement to the database. This results in a lot of "chatter" between the web
server and the database server. If you are importing data to Microsoft SQL Server the good news is that this process can be dramatically sped up using ADO.NET's
SqlBulkCopy class. In my testing, importing 10,000 records using
the one INSERT statement per import record took more than three seconds to complete; using SqlBulkCopy took a fraction of a second.
This article looks at how to use the SqlBulkCopy class to efficiently execute bulk operations against Microsoft SQL Server. Read on to learn more!
An Overview of SqlBulkCopy and the Demo Application
SqlBulkCopy is a class that was added to the .NET Framework version 2.0 (in the System.Data.SqlClient namespace). It provides an efficient means
to import data into a Microsoft SQL Server database. Before using the SqlBulkCopy class, you first need to get the data to import into Microsoft SQL Server.
The import data can come from virtually any data source, as SqlBulkCopy can import data from a DataReader, DataRow, or DataTable.
Once the import data is ready, you create an instance of the SqlBulkCopy class and specify the destination database and the table to import the data into.
You can optionally specify column mappings - indicating what columns in the source data are mapped to what columns in the destination table - along with other options.
Finally, to perform the import, call the SqlBulkCopy object's WriteToServer method, passing in the data to import.
The download available at the end of this article includes a demo that imports data from an uploaded Excel spreadsheet into a table in a Microsoft SQL Server database.
The demo aims to mimic an inventory tracking web application that offers users the ability to import data into the system by uploading an Excel spreadsheet.
The SQL Server database, which is located in the App_Data folder, contains a single table named InventoryItems with the following schema:
| Column | Type | Notes |
|---|---|---|
InventoryItemID | int | Primary key; IDENTITY column |
ItemID | varchar(50) | |
ItemName | nvarchar(100) | |
UnitCost | money | |
UnitPrice | money | |
DateAdded | datetime | Has a default value: getdate() |
The Excel spreadsheet includes four columns that specify the values that will be imported into the ItemID, ItemName, UnitCost, and
UnitPrice columns; the InventoryItemID and DateAdded values will be automatically assigned by the database.
Preparing the Excel Spreadsheet Data
The demo available for download includes a web page (
Default.aspx) from which a user can upload an Excel spreadsheet. On upload, the Excel spreadsheet is saved
to the web server's file system. Next, the spreadsheet's contents are read into a DataTable named excelData, and this DataTable is what will
eventually be passed into the SqlBulkCopy object's WriteToServer method to perform the import.
The following code shows how to connect to the Excel spreadsheet and read its contents into a DataTable. The variable uploadFileName contains the
physical path to the location where the just-uploaded Excel spreadsheet was saved on the web server's file system. For more information on querying Excel spreadsheets
from .NET code, see Importing an Excel Spreadsheet Using Typed DataSets and TableAdapters.
Dim connectionString As String = "Provider=Microsoft.Jet.OLEDB.4.0;" & _
|
Note: The connection string used in this example works for Excel spreadsheets saved in the Excel 97-2003 format (.xls). See
Connection Strings for Excel 2007 to learn how to connect to an Excel 2007 document.
Creating, Configuring, and Importing Data With The SqlBulkCopy Object
To use the
SqlBulkCopy class you need to supply it information about the database connection to use. This can be in the form of a SqlConnection object
or as a connection string. You also need to specify the name of the destination database table. Once these required bits of information have been set, you can import the
data by calling the WriteToServer method and passing in the source data.
The following snippet shows how to import data using the SqlBulkCopy object when using an explicit SqlConnection object:
Using destinationConnection As New SqlConnection(connectionString)
|
The first line of code in the above snippet creates a SqlConnection object named destinationConnection. Next, a SqlBulkCopy object is created
and the destination SqlConnection object is passed in through the constructor. Following that, the destination database table name is specified via the
SqlBulkCopy object's DestinationTableName method. Finally, the data is written to the database via the WriteToServer method.
(Recall that excelData is a DataTable that contains the data from the uploaded Excel spreadsheet.)
Alternatively, you can omit instantiating the SqlConnection object and instead create just a SqlBulkCopy object, passing in the connection string to
the destination database like so:
The following snippet shows SqlBulkCopy using existing SqlConnection object:
Using bulkCopy As New SqlBulkCopy(connectionString)
|
Specifying Column Mappings
While the above code compiles without error, you'll get a runtime error when uploading the Excel spreadsheet. The reason is because when
SqlBulkCopy performs
the import it does so by mapping the first column of the source data to the first column of the destination table, the second column of the source data to the second column
of the destination table, and so on. It does not perform the mapping by column name.
Our Excel spreadsheet has four columns, the first being for ItemID, the second for ItemName, and so forth. However, the destination database table
(InventoryItems) has five columns, with its first column being an IDENTITY column (InventoryItemID). Consequently, the above
code will attempt to put in the Excel spreadsheet's ItemID values into the InventoryItemID column in the database, and the Excel spreadsheet's
ItemName values into the table's ItemID column, and so on.
In cases where the column positions in the source and destination do not match up precisely, you can specify the column mappings programmatically
via the SqlBulkCopy object's ColumnMappings collection. Such mismatches can occur if the destination table includes an IDENTITY column
or columns with default values that are not specified in the source data. Even if the source and destination include the exact same set of columns, you will need to
specify column mappings if the order differs. For example, if the Excel spreadsheet's columns started with ItemID and was followed by ItemName,
UnitPrice, UnitCost, but the InventoryItems table's columns were ordered ItemName, ItemID, UnitCost,
UnitPrice, you would need to define a column mappings to correctly align the import. Failing to do so would result in the source's ItemID values
getting entered into the destination's ItemName column, the ItemName values into the ItemID column, the UnitPrice values into
the UnitCost column, and the UnitCost values into the UnitPrice column. In short, it would be a mess.
Let's enhance the above snippet of code to map each column in the source data to the column with the same name in the destination table (regardless of the columns' ordinal positions):
Using bulkCopy As New SqlBulkCopy(connectionString)
|
With the three lines of code added above, each column in the Excel spreadsheet is mapped to the column in the destination table with the same name. Because the Excel spreadsheet
does not contain an InventoryItemID column, the database system will auto-generate the value like it would with a normal INSERT statement.
Comparing SqlBulkCopy to One INSERT Statement Per Import Record
Programmatically importing data from a source data store to a destination data store typically involves looping through the source data one record at a time and, for each import record, executing an
INSERT statement on the destination data store. As we've seen in this article, when importing data into a Microsoft SQL Server database
another alternative is to use the SqlBulkCopy object, which has two advantages:
- Using
SqlBulkCopyrequires far less code. Our final snippet is seven lines of code long (excluding comments and white space). The code for inserting one record at a time will likely be at least twice as many lines long. SqlBulkCopyis much faster because it is less "chatty" with the database server.
SqlBulkCopy and inserting one record at a time, I created a simple, highly unscientific test case. Namely,
I created a web page that allows the visitor to choose how the data should be inserted - via SqlBulkCopy or through a loop where an INSERT statement
is executed once for each source record. I then uploaded Excel spreadsheets with 10, 1,000, and 10,000 records and measured the running time between SqlBulkCopy and
the one INSERT per source record approaches. The times posted below are in milliseconds (1,000 ms = 1 second). (This test page and the test data are included
in the download at the end of this article.)
| 10 Records | 1,000 Records | 10,000 Records | |
|---|---|---|---|
SqlBulkCopy | 4 ms | 16 ms | 180 ms |
One INSERT statementPer Import Record | 5 ms | 333 ms | 2,978 ms |
When importing 10,000 records SqlBulkCopy was an order of magnitude faster than running one INSERT statement per source record. I would expect
that this performance gap would continue to grow along with the number of rows being imported.
Looking Forward...
One thing we didn't touch on in this article is
SqlBulkCopy and transactions. When importing data, what do you want to do if there is an error after, say,
half of the records have been added? Do you want to remove those just-inserted records or rollback the entire operation? This topic, as well as more advanced
SqlBulkCopy configuration options, is examined in another article: Using Transactions
with SqlBulkCopy.
Happy Programming!
Attachments:
Further Reading
SqlBulkCopy Class (technical docs)



