
Inserting huge data inside SQL Server using C#

I am using SQL Server 2012 and have a huge file, approximately 20 GB in size. I want to insert every record in the file into the database. I am using the SqlBulkCopy class for this purpose, but since the data is so large I have to insert it part by part. Here is the code:

String line;
SqlConnection conn = new SqlConnection(ConfigurationManager.ConnectionStrings["conStrtingName"].ConnectionString);
conn.Open();
StreamReader readFile = new StreamReader(filePath);
SqlTransaction transaction = conn.BeginTransaction();
try
{
    SqlBulkCopy copy = new SqlBulkCopy(conn, SqlBulkCopyOptions.KeepIdentity, transaction);
    copy.BulkCopyTimeout = 600;
    copy.DestinationTableName = "Txn";
    int counter = 0;
    while ((line = readFile.ReadLine()) != null)
    {
        string[] fields = line.Split('\t');
        if (fields.Length == 3)
        {
            DateTime date = Convert.ToDateTime(fields[0]);
            decimal txnCount = Convert.ToDecimal(fields[1]);
            string merchantName = fields[2];
            if (!string.IsNullOrEmpty(merchantName))
            {
                long MerchantId = Array.IndexOf(Program.merchantArray, merchantName) + 1;
                tables[workerId].Rows.Add(MerchantId, date, txnCount);
                counter++;
                if (counter % 100000 == 0)
                    Console.WriteLine("Worker: " + workerId + " - Transaction Records Read: " + counter);
                if (counter % 1000000 == 0)
                {
                    copy.WriteToServer(tables[workerId]);
                    transaction.Commit();
                    tables[workerId].Rows.Clear();
                    //transaction = conn.BeginTransaction();
                    Console.WriteLine("Worker: " + workerId + " - Transaction Records Inserted: " + counter);
                }
            }
        }
    }
    Console.WriteLine("Total Transaction Records Read: " + counter);
    if (tables[workerId].Rows.Count > 0)
    {
        copy.WriteToServer(tables[workerId]);
        transaction.Commit();
        tables[workerId].Rows.Clear();
        Console.WriteLine("Worker: " + workerId + " - Transaction Records Inserted: " + counter);
    }
}
catch (Exception ex)
{
    Console.WriteLine(ex.Message);
    transaction.Rollback();
}
finally
{
    conn.Close();
}

It works for the first 100000 records. However, for the next set of records I get an exception: "The transaction is either not associated with the current connection or has been completed."

This happens when control reaches transaction.Commit(); for the next set of records.

Is there a workaround?

The problem is the commented-out line after the transaction is committed. You need to uncomment it, and also reinitialize your SqlBulkCopy copy variable. Better yet, refactor the code: the only place where you need the transaction and copy objects is when you flush the data table you are filling, like this (you can further factor the repetitive part out into a separate method):

String line;
SqlConnection conn = new SqlConnection(ConfigurationManager.ConnectionStrings["conStrtingName"].ConnectionString);
conn.Open();
StreamReader readFile = new StreamReader(filePath);
SqlTransaction transaction = null;
try
{
    int counter = 0;
    while ((line = readFile.ReadLine()) != null)
    {
        string[] fields = line.Split('\t');
        if (fields.Length == 3)
        {
            DateTime date = Convert.ToDateTime(fields[0]);
            decimal txnCount = Convert.ToDecimal(fields[1]);
            string merchantName = fields[2];
            if (!string.IsNullOrEmpty(merchantName))
            {
                long MerchantId = Array.IndexOf(Program.merchantArray, merchantName) + 1;
                tables[workerId].Rows.Add(MerchantId, date, txnCount);
                counter++;
                if (counter % 100000 == 0)
                    Console.WriteLine("Worker: " + workerId + " - Transaction Records Read: " + counter);
                if (counter % 1000000 == 0)
                {
                    transaction = conn.BeginTransaction();
                    SqlBulkCopy copy = new SqlBulkCopy(conn, SqlBulkCopyOptions.KeepIdentity, transaction);
                    copy.BulkCopyTimeout = 600;
                    copy.DestinationTableName = "Txn";
                    copy.WriteToServer(tables[workerId]);
                    transaction.Commit();
                    transaction = null;
                    tables[workerId].Rows.Clear();
                    Console.WriteLine("Worker: " + workerId + " - Transaction Records Inserted: " + counter);
                }
            }
        }
    }
    Console.WriteLine("Total Transaction Records Read: " + counter);
    if (tables[workerId].Rows.Count > 0)
    {
        transaction = conn.BeginTransaction();
        SqlBulkCopy copy = new SqlBulkCopy(conn, SqlBulkCopyOptions.KeepIdentity, transaction);
        copy.BulkCopyTimeout = 600;
        copy.DestinationTableName = "Txn";
        copy.WriteToServer(tables[workerId]);
        transaction.Commit();
        transaction = null;
        tables[workerId].Rows.Clear();
        Console.WriteLine("Worker: " + workerId + " - Transaction Records Inserted: " + counter);
    }
}
catch (Exception ex)
{
    Console.WriteLine(ex.Message);
    if (transaction != null) transaction.Rollback();
}
finally
{
    conn.Close();
}

The problem, though, is that now you cannot roll back ALL the changes in case something goes wrong. Probably the better solution would be not to split your bulk inserts manually, but to use some sort of IDataReader implementation to avoid populating a huge DataTable in memory (for instance Marc Gravell's ObjectReader from the FastMember library).
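A minimal sketch of that streaming approach, assuming the FastMember NuGet package and the same tab-separated file and three-column Txn table as above. The ReadTxnLines helper and the connection string are illustrative assumptions, not part of the original code:

```csharp
using System;
using System.Collections.Generic;
using System.Data.SqlClient;
using System.IO;
using FastMember;   // provides ObjectReader (NuGet: FastMember)

class TxnRecord
{
    public long MerchantId { get; set; }
    public DateTime Date { get; set; }
    public decimal TxnCount { get; set; }
}

static class StreamingBulkInsert
{
    // Lazily yields parsed records; nothing is buffered beyond one line.
    static IEnumerable<TxnRecord> ReadTxnLines(string filePath, string[] merchantArray)
    {
        foreach (string line in File.ReadLines(filePath))
        {
            string[] fields = line.Split('\t');
            if (fields.Length != 3 || string.IsNullOrEmpty(fields[2]))
                continue;
            yield return new TxnRecord
            {
                MerchantId = Array.IndexOf(merchantArray, fields[2]) + 1,
                Date = Convert.ToDateTime(fields[0]),
                TxnCount = Convert.ToDecimal(fields[1])
            };
        }
    }

    public static void Run(string connectionString, string filePath, string[] merchantArray)
    {
        using (var conn = new SqlConnection(connectionString))
        using (var copy = new SqlBulkCopy(conn, SqlBulkCopyOptions.KeepIdentity, null))
        // ObjectReader adapts the IEnumerable to IDataReader; the column
        // names must match the destination table's column order/names.
        using (var reader = ObjectReader.Create(
                   ReadTxnLines(filePath, merchantArray),
                   "MerchantId", "Date", "TxnCount"))
        {
            conn.Open();
            copy.DestinationTableName = "Txn";
            copy.BulkCopyTimeout = 0;   // no timeout for a very long copy
            // SqlBulkCopy pulls rows from the reader as it sends them,
            // so the 20 GB file is streamed, never held in a DataTable.
            copy.WriteToServer(reader);
        }
    }
}
```

Because the whole copy is a single WriteToServer call, SQL Server can also batch it internally via copy.BatchSize if you want periodic server-side commits.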

Your transaction is committed after each batch of records. It is then "gone", and you have to start another one with transaction = conn.BeginTransaction().

It may be good to rework the code to better reflect the lifespan of the transaction. You also need to make sure that "copy" is recreated with the new transaction.

You can increase the timeout for your transaction like this (use values appropriate for the expected length of your transaction). The code below uses 15 minutes:

using (TransactionScope scope =
           new TransactionScope(TransactionScopeOption.Required,
                                new System.TimeSpan(0, 15, 0)))
{
    // working code here
}
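Note that with TransactionScope the connection must be opened inside the scope so it enlists in the ambient transaction. A sketch of the full pattern, with the Txn table and a prepared DataTable assumed from the answer above:

```csharp
using System;
using System.Data;
using System.Data.SqlClient;
using System.Transactions;

static void BulkCopyInScope(string connectionString, DataTable table)
{
    // 15-minute ambient transaction; if scope.Complete() is not reached
    // (e.g. an exception is thrown), Dispose rolls everything back.
    using (var scope = new TransactionScope(TransactionScopeOption.Required,
                                            TimeSpan.FromMinutes(15)))
    using (var conn = new SqlConnection(connectionString))
    {
        conn.Open();   // must happen inside the scope to enlist
        using (var copy = new SqlBulkCopy(conn))
        {
            copy.DestinationTableName = "Txn";
            copy.WriteToServer(table);
        }
        scope.Complete();   // marks the transaction for commit
    }
}
```

This replaces the explicit SqlTransaction plumbing entirely: no BeginTransaction, Commit, or Rollback calls are needed.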
