
Batch insert to SQL Server table from DataTable using ODBC Connection

I have been asked to find the most efficient way to take a DataTable input and write it to a SQL Server table using C#. The snag is that the solution must use ODBC connections throughout, which rules out SqlBulkCopy. The solution must also work on all SQL Server versions back to SQL Server 2008 R2.

I am thinking that the best approach would be to use batch inserts of 1000 rows at a time using the following SQL syntax:

INSERT INTO dbo.Table1(Field1, Field2) SELECT Value1, Value2 UNION SELECT Value1, Value2

I have already written the code to check whether a table corresponding to the DataTable input already exists on the SQL Server, and to create one if it doesn't.

I have also written the code to create the INSERT statement itself. What I am struggling with is how to dynamically build the SELECT statements from the rows in the data table. How can I access the values in the rows to build my SELECT statement? I think I will also need to check the data type of each column in order to determine whether the values need to be enclosed in single quotes (') or not.

Here is my current code:

    public bool CopyDataTable(DataTable sourceTable, OdbcConnection targetConn, string targetTable)
    {
        OdbcTransaction tran = null;
        string[] selectStatement = new string[sourceTable.Rows.Count];

        // Check if targetTable exists, create it if it doesn't
        if (!TableExists(targetConn, targetTable))
        {
            bool created = CreateTableFromDataTable(targetConn, sourceTable);

            if (!created)
                return false;
        }

        try
        {
            // Prepare insert statement based on sourceTable
            string insertStatement = string.Format("INSERT INTO [dbo].[{0}] (", targetTable);

            foreach (DataColumn dataColumn in sourceTable.Columns)
            {
                insertStatement += dataColumn + ",";
            }

            // Assign (not append) so the statement is not duplicated
            insertStatement = insertStatement.TrimEnd(',') + ") ";

            // Open connection to target db
            using (targetConn)
            {
                if (targetConn.State != ConnectionState.Open)
                    targetConn.Open();

                tran = targetConn.BeginTransaction();

                for (int i = 0; i < sourceTable.Rows.Count; i++)
                {
                    DataRow row = sourceTable.Rows[i];

                    // Need to iterate through columns in row, getting values and data types and building a SELECT statement

                    selectStatement[i] = "SELECT ";
                }

                insertStatement += string.Join(" UNION ", selectStatement);

                using (OdbcCommand cmd = new OdbcCommand(insertStatement, targetConn, tran))
                {
                    cmd.ExecuteNonQuery();
                }

                tran.Commit();
                return true;
            }
        }       
        catch 
        {
            tran.Rollback();
            return false;
        }
    }

Any advice would be much appreciated. Also, if there is a simpler approach than the one I am suggesting, any details of that would be great.

OK, since we cannot use stored procedures or bulk copy: when I modelled the various approaches a couple of years ago, the key determinant of performance was the number of calls to the server. Batching a set of MERGE or INSERT statements into a single call, separated by semicolons, was the fastest method I found, so I ended up batching my SQL statements. I think the maximum size of a SQL statement was 32k, so I chopped my batches into units of that size.

(Note: use StringBuilder instead of concatenating strings manually; it has a beneficial effect on performance.)

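Pseudo-code
// Values are substituted as-is here; quoting and escaping per data type is still needed.
string sqlStatement = "INSERT INTO Tab1 VALUES ({0},{1},{2});";
StringBuilder sqlBatch = new StringBuilder();
foreach (DataRow row in myDataTable.Rows)
{
    // One INSERT per row, each terminated by a semicolon
    sqlBatch.AppendLine(string.Format(sqlStatement, row["Field1"], row["Field2"], row["Field3"]));
}
// Send the whole batch to the server in a single call
using (OdbcCommand cmd = new OdbcCommand(sqlBatch.ToString(), myOdbcConnection))
{
    cmd.ExecuteNonQuery();
}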

You need to deal with batch size complications, and with formatting each field correctly for its data type in the string-replace step, but otherwise this will give the best performance.
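The two complications mentioned above (formatting values by data type and limiting batch size) could be handled roughly as follows. This is only a sketch: `SqlBatchSketch`, `ToSqlLiteral`, `BuildBatches` and the `maxChars` limit are hypothetical names introduced for illustration, the target table name is assumed to be trusted and already quoted, and the DataTable columns are assumed to be in the same order as the table's columns.

```csharp
using System;
using System.Collections.Generic;
using System.Data;
using System.Globalization;
using System.Text;

public static class SqlBatchSketch
{
    // Hypothetical helper: renders one .NET value as a T-SQL literal.
    public static string ToSqlLiteral(object value)
    {
        if (value == null || value == DBNull.Value)
            return "NULL";

        switch (value)
        {
            case bool b:
                return b ? "1" : "0";
            case DateTime d:
                // ISO 8601 form, independent of the server's language settings
                return "'" + d.ToString("yyyy-MM-dd'T'HH:mm:ss.fff", CultureInfo.InvariantCulture) + "'";
            case byte[] bytes:
                return "0x" + BitConverter.ToString(bytes).Replace("-", "");
            case byte _:
            case short _:
            case int _:
            case long _:
            case float _:
            case double _:
            case decimal _:
                // Numbers are unquoted and always use '.' as the decimal separator
                return ((IFormattable)value).ToString(null, CultureInfo.InvariantCulture);
            default:
                // Strings and anything else: quote, doubling embedded single quotes
                return "N'" + value.ToString().Replace("'", "''") + "'";
        }
    }

    // Builds one INSERT per row and groups them into batches of at most
    // maxChars characters. A single statement longer than maxChars still
    // gets a batch of its own.
    public static List<string> BuildBatches(DataTable table, string targetTable, int maxChars)
    {
        var batches = new List<string>();
        var current = new StringBuilder();

        foreach (DataRow row in table.Rows)
        {
            var values = new string[table.Columns.Count];
            for (int c = 0; c < values.Length; c++)
                values[c] = ToSqlLiteral(row[c]);

            string statement = "INSERT INTO " + targetTable +
                               " VALUES (" + string.Join(", ", values) + ");";

            // Start a new batch if this statement would push the current one over the limit
            if (current.Length > 0 && current.Length + statement.Length + 1 > maxChars)
            {
                batches.Add(current.ToString());
                current.Clear();
            }
            current.Append(statement).Append('\n');
        }

        if (current.Length > 0)
            batches.Add(current.ToString());

        return batches;
    }
}
```

Each returned string can then be executed with one `OdbcCommand.ExecuteNonQuery()` call, ideally inside a single transaction. Escaping by hand like this is easy to get wrong, so it is only reasonable when the input data is trusted.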

The accepted answer by PhillipH is prone to several mistakes and is open to SQL injection.

Normally you should build a DbCommand with parameters and execute that, instead of executing a hand-built SQL statement.

The CommandText must be "INSERT INTO Tab1 VALUES (?, ?, ?)" for ODBC and OLE DB, which use positional ? markers; SqlClient needs named parameters ("@<Name>").

Parameters should be added with the data type and size of the underlying column.
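A minimal sketch of that parameterized approach with `System.Data.Odbc` might look like the following. The class and method names (`ParameterizedInsertSketch`, `BuildInsertSql`, `ToOdbcType`, `InsertRows`) are hypothetical, the type mapping covers only a few common .NET types and falls back to `NVarChar` as an assumption, and the connection is assumed to be open already. On .NET Core and later, `System.Data.Odbc` comes from a separate NuGet package.

```csharp
using System;
using System.Data;
using System.Data.Odbc;
using System.Linq;

public static class ParameterizedInsertSketch
{
    // Builds "INSERT INTO <table> (<columns>) VALUES (?, ?, ...)" with one
    // positional marker per DataTable column.
    public static string BuildInsertSql(DataTable table, string targetTable)
    {
        var columns = table.Columns.Cast<DataColumn>()
            .Select(c => "[" + c.ColumnName.Replace("]", "]]") + "]");
        var markers = Enumerable.Repeat("?", table.Columns.Count);

        return "INSERT INTO " + targetTable +
               " (" + string.Join(", ", columns) + ")" +
               " VALUES (" + string.Join(", ", markers) + ")";
    }

    // Illustrative mapping only; extend it for the types your tables use.
    public static OdbcType ToOdbcType(Type t)
    {
        if (t == typeof(int)) return OdbcType.Int;
        if (t == typeof(long)) return OdbcType.BigInt;
        if (t == typeof(decimal)) return OdbcType.Decimal;
        if (t == typeof(double)) return OdbcType.Double;
        if (t == typeof(bool)) return OdbcType.Bit;
        if (t == typeof(DateTime)) return OdbcType.DateTime;
        return OdbcType.NVarChar; // assumption: everything else is sent as text
    }

    public static void InsertRows(DataTable table, OdbcConnection conn, string targetTable)
    {
        using (OdbcTransaction tran = conn.BeginTransaction())
        using (OdbcCommand cmd = new OdbcCommand(BuildInsertSql(table, targetTable), conn, tran))
        {
            // ODBC binds parameters by position, so they must be added in column order
            foreach (DataColumn col in table.Columns)
            {
                OdbcParameter p = cmd.Parameters.Add("@" + col.ColumnName, ToOdbcType(col.DataType));
                if (col.DataType == typeof(string) && col.MaxLength > 0)
                    p.Size = col.MaxLength;
            }

            // Reuse the same command for every row; only the values change
            foreach (DataRow row in table.Rows)
            {
                for (int i = 0; i < table.Columns.Count; i++)
                    cmd.Parameters[i].Value = row[i]; // DBNull.Value is passed through as NULL
                cmd.ExecuteNonQuery();
            }

            // If an exception is thrown earlier, Commit is never reached
            tran.Commit();
        }
    }
}
```

This removes the quoting and injection problems, at the cost of one round trip per row. Whether that is acceptable depends on the row counts involved, so it is worth measuring against the batched-statement approach.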
