Using Transactions with SqlBulkCopy
By Scott Mitchell
Introduction
The
SqlBulkCopy class provides a mechanism for efficiently importing
large amounts of data into a Microsoft SQL Server database. Compared to importing data by executing one INSERT statement for each record to import,
SqlBulkCopy is noticeably faster when importing thousands (or more) records. In a nutshell, importing data using SqlBulkCopy entails creating a
SqlBulkCopy object, specifying the destination database and table, and providing the data to import in the form of a DataTable, DataRow,
or DataReader. In Using SqlBulkCopy To Perform Efficient Bulk SQL Operations we looked at how to use
SqlBulkCopy to programmatically import data from an uploaded Excel spreadsheet into a SQL Server database.
While Using SqlBulkCopy To Perform Efficient Bulk SQL Operations showed how to use the SqlBulkCopy class, it did not explore how SqlBulkCopy
imports fare in the face of an error. What happens if, when importing a total of 10,000 records, an error occurs when importing record number 501? Should those first 500 records
be committed or should the entire batch be rolled back? This article examines the default behavior of the SqlBulkCopy class in the face of an error. It also
looks at how to perform bulk copy operations under the umbrella of a transaction. Read on to learn more!
Note: If you have not yet read Using SqlBulkCopy To Perform Efficient Bulk SQL Operations, please do so before reading this article...
Importing Data In Batches
One of the reasons why
SqlBulkCopy is so much faster than executing an INSERT statement for each imported record is because SqlBulkCopy
sends large batches of records to the database rather than sending one record at a time. The number of records sent by SqlBulkCopy in each batch depends on the
value of the BatchSize property. By default, this value
is set to zero, which causes SqlBulkCopy to send all records to be imported to the database in one large batch.
Sending all of the records to the database in one batch works fine when you're dealing with a few thousand records, but if you're importing hundreds of thousands or millions of records into the database then you'll need use smaller batches. (If you try to send, say, 1,000,000+ records to the database in one batch you're going to have timeout issues.) The optimal batch size depends on factors like the network bandwidth and latency and the amount of bytes per record being imported. See What is the Recommended Batch Size for SqlBulkCopy? for a discussion on this topic.
Another Reason Why SqlBulkCopy Is So Fast... |
|---|
Another reason why SqlBulkCopy is so much faster than running individual INSERT statements is because by default SqlBulkCopy ignores
any CHECK constraints. For instance, imagine that you have a table with a column of type money that has a CHECK constraint to
ensure that its values are all greater than or equal to zero. When importing data into this table using SqlBulkCopy the CHECK constraint is ignored
(by default), meaning that you can import data that violates the constraint (such as a value into the money column that is less than zero). (Similarly, by
default the rows imported by SqlBulkCopy into the destination table do not cause any INSERT TRIGGERs to fire.)
You can force |
Non-Transactioned Imports With SqlBulkCopy
When its
WriteToServer method is invoked, the
SqlBulkCopy object connects to the database and imports the data in batches (whose size is determined by the BatchSize property). If there are no
errors with the data being imported in the current batch then those records are committed to the database and then the next batch is processed. However, if there is an error
then the entire batch is rolled back.
In Using SqlBulkCopy To Perform Efficient Bulk SQL Operations, we looked at how to allow a user to upload
an Excel spreadsheet and have its rows imported into a database table named InventoryItems in a SQL Server database. The InventoryItems table included
columns named ItemID, ItemName, UnitCost, and UnitPrice (among others); and the Excel spreadsheet uploaded by the user supplied
the values for these four columns for each imported row. Note that if the Excel document omits a value for the UnitCost or UnitPrice columns, or
uses a non-money value (such as "Expensive!" rather than "4.25") then an error will occur when importing the data.
To illustrate the default commit/rollback behavior of the SqlBulkCopy class, imagine that a user was uploading an Excel spreadsheet with 100 records and that record
number 25 did not specify a value for the UnitPrice column. What would happen when the WriteToServer method is invoked?
The answer depends on the value of the BatchSize property. If the BatchSize property has not been set then all
of the records will be sent in one batch. Because there is an error in that batch (with record number 25, specifically) then none of the records in the batch will
be committed. Consequently, because there was an error in the import the entire operation was rolled back.
But what if the BatchSize property had been set to some value? What would happen then? It depends on the value of the BatchSize property. If
BatchSize was set to 10 then SqlBulkCopy would start by sending the first 10 records to the database, which would be imported and committed. It would
then send the next 10 and those would be committed as well. The third batch would include the erroneous record #25, which would generate an error. This error would cause
the entire contents of the third batch to be rolled back. Moreover, the error would cause SqlBulkCopy to stop processing. While the error in record #25
causes the third batch to be rolled back, the first two batches were committed, meaning that the database now contains the first 20 records that were imported.
| Try This At Home! |
|---|
To observe this behavior on your own you can download the demo available at the end of this article. First things first, you'll need to create an erroneous record in the
import data. I suggest opening one of the Excel spreadsheets and deleting the value from one of the row's UnitCost column. Next, open the Default.aspx page's
code-behind class and find the InsertViaSqlBulkCopyWithoutTransaction method, which starts on line 84. Set the BatchSize property to a value
less than the row number whose UnitCost column value you erased. (For example, if you deleted the value for the 5th row's UnitCost column then set
BatchSize to some value less than 5.)
Next, view the contents of the |
Bulk Operations And Atomicity
The default
SqlBulkCopy operation is not atomic. An atomic operation is one where the entire process fails or the entire process succeeds. With SqlBulkCopy
(with a non-zero BatchSize) it is possible that some of the import is committed (early batches without a problem) whereas some of the import fails (as soon
as an error is encountered the current batch is rolled back and no further batches are processed). This is, perhaps, an undesirable outcome.
Also, it is certainly possible that the import process might entail more than a call to the SqlBulkCopy object's WriteToServer method. For example,
the import process might involve three steps:
- Delete all records from the destination table,
- Import the records into the destination table, and
- Record the date and time of the most recent import in some other table.
SqlBulkCopy object used in step (2) failed mid-way through the import process we'd want to
not only rollback all of the records imported thus far by SqlBulkCopy, but we'd also want to rollback the deleted records from step (1).
Performing Atomic SqlBulkCopy Imports
To perform atomic
SqlBulkCopy imports that span across all batches (and, optionally, across other database statements) we need to use transactions.
The following steps outline the process of using a transaction with SqlBulkCopy:
- Create a SqlConnection to the destination database server.
- Open the connection.
- Create a
SqlTransactionobject. - Create the
SqlBulkCopyobject passing in theSqlTransactionobject into the constructor. - Perform the import - the call to
WriteToServer- within aTry...Catchblock. - If the operation completes, commit the transaction; if it fails, roll it back.
' STEP 1 - Create a SqlConnection to the destination database server
|
If you had other statements that needed to be performed under the umbrella of this transaction you would execute them within the Try...Catch block.
Happy Programming!
Attachments:
Further Reading
SqlBulkCopy Class (technical docs)



