SqlBulkCopy DataTables as they are added to a DataSet

Question

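I want to parse the values in a CSV file into DataTable chunks, add them to a DataSet, and then use SqlBulkCopy to insert the DataTables into a single SQL table. The original CSV can range from 4 GB to 8 GB, and I need to avoid reading the whole thing into memory, hence the chunking. I loosely based my chunking on this post. I use LumenWorks to parse the CSV values.

Once a DataTable has been added to the DataSet, I want to use SqlBulkCopy to insert it into my SQL table while the next DataTable is being created. After SqlBulkCopy completes, I want to remove the DataTable to release memory.

My first thought was to run the chunking method asynchronously without await, then run a while loop that checks whether the next DataTable exists in the DataSet. If the DataTable exists, bulk copy it. If the DataTable's row count is less than the row limit, it is the last chunk, so the while loop stops.

Am I going about this the wrong way? If not, how can I do something like this?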

        string filePath = @"C:\Users\user\Downloads\Testing\file - Copy.csv";
        DataSet ds = new DataSet();

        bool continueInsert = true;
        int rowLimit = 100000;
        int tableNumber = 0;

        //Start this, but do not wait for it to complete before starting while loop
        ChunkCSV(filePath, ds, rowLimit);

        //Run SqlBulkCopy if datatable exists 
        while (continueInsert)
        {
            if (ds.Tables.Contains("tbl_" + tableNumber))
            {
                DataTable dataTable = ds.Tables["tbl_" + tableNumber];

                //SqlBulkCopy dataTable code HERE

                if (ds.Tables["tbl_" + tableNumber].Rows.Count < rowLimit)
                {
                    continueInsert = false;
                }

                //Remove datatable from dataset to release memory
                ds.Tables.Remove("tbl_" + tableNumber);

                tableNumber++;
            }
            else
            {
                Thread.Sleep(1000);
            }
        }

Here is my chunking code:

    private static void ChunkCSV(string filePath, DataSet dataSet, int rowLimit)
    {
        char delimiter = ',';

        DataTable dtChunk = null;
        int tableNumber = 0;
        int chunkRowCount = 0;
        bool firstLineOfChunk = true;

        using (var sr = new StreamReader(filePath))
        using (CsvReader csv = new CsvReader(sr, false, delimiter, '\"', '\0', '\0', ValueTrimmingOptions.All, 65536))
        {
            int fieldCount = csv.FieldCount;
            string[] row = new string[fieldCount];

            //Add fields when necessary
            csv.MissingFieldAction = MissingFieldAction.ReplaceByEmpty;

            while (csv.ReadNextRecord())
            {
                if (firstLineOfChunk)
                {
                    firstLineOfChunk = false;
                    dtChunk = CreateDataTable(fieldCount, tableNumber);
                }

                DataRow dataRow = dtChunk.NewRow();

                csv.CopyCurrentRecordTo(row);
                for (int f = 0; f < fieldCount; f++)
                {
                    dataRow[f] = row[f];
                }

                dtChunk.Rows.Add(dataRow);
                chunkRowCount++;

                if (chunkRowCount == rowLimit)
                {
                    firstLineOfChunk = true;
                    chunkRowCount = 0;
                    tableNumber++;
                    dataSet.Tables.Add(dtChunk);
                    dtChunk = null;
                }
            }
        }

        if (dtChunk != null)
        {
            dataSet.Tables.Add(dtChunk);
        }

    }
    private static DataTable CreateDataTable(int fieldCount, int tableNumber)
    {
        DataTable dt = new DataTable("tbl_" + tableNumber);

        for(int i = 0; i < fieldCount; i++)
        {
            dt.Columns.Add("Column_" + i);
        }

        return dt;
    }
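
The hand-off described above (one thread builds chunks while another bulk-copies them) can also be expressed with a bounded queue instead of polling a shared DataSet, which matters because DataSet is not safe for concurrent writes. The following is only a minimal sketch under stated assumptions: `ChunkPipeline`, `Produce`, `Run`, and the `writeChunk` callback are hypothetical names, the rows come from any `IEnumerable<string[]>` (for example a CSV reader), and every row is assumed to have at most `fieldCount` fields. It swaps in `BlockingCollection<DataTable>` for the DataSet and omits cancellation and error recovery.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Data;
using System.Linq;
using System.Threading.Tasks;

public static class ChunkPipeline
{
    // Producer: groups incoming rows into DataTables of at most rowLimit rows.
    public static void Produce(IEnumerable<string[]> rows, int fieldCount, int rowLimit,
                               BlockingCollection<DataTable> queue)
    {
        try
        {
            DataTable chunk = null;
            foreach (string[] row in rows)
            {
                if (chunk == null)
                {
                    chunk = new DataTable();
                    for (int i = 0; i < fieldCount; i++)
                    {
                        chunk.Columns.Add("Column_" + i);
                    }
                }

                chunk.Rows.Add((object[])row);   // one value per column

                if (chunk.Rows.Count == rowLimit)
                {
                    queue.Add(chunk);            // blocks while the queue is full
                    chunk = null;
                }
            }

            if (chunk != null)
            {
                queue.Add(chunk);                // final, partially filled chunk
            }
        }
        finally
        {
            queue.CompleteAdding();              // tells the consumer no more chunks are coming
        }
    }

    // Runs producer and consumer concurrently; returns the number of rows consumed.
    public static long Run(IEnumerable<string[]> rows, int fieldCount, int rowLimit,
                           Action<DataTable> writeChunk)
    {
        long total = 0;
        // Capacity 2 is an arbitrary choice: at most two finished chunks wait in memory.
        using (var queue = new BlockingCollection<DataTable>(boundedCapacity: 2))
        {
            Task producer = Task.Run(() => Produce(rows, fieldCount, rowLimit, queue));

            foreach (DataTable chunk in queue.GetConsumingEnumerable())
            {
                writeChunk(chunk);               // e.g. bulkCopy.WriteToServer(chunk)
                total += chunk.Rows.Count;
                chunk.Dispose();
            }

            producer.Wait();                     // surfaces any exception from the producer
        }
        return total;
    }
}
```

A call might look like `ChunkPipeline.Run(csvRows, fieldCount, 100000, t => bulkCopy.WriteToServer(t));`. Because the loop ends when the producer calls `CompleteAdding`, it does not depend on the last chunk being smaller than the row limit, so a file whose row count is an exact multiple of the limit still terminates.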

Answer 1

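There is no reason to use a DataTable in the first place.

With the SqlBulkCopy.WriteToServer(IDataReader) overload, you can stream the whole file directly to SQL Server. If you don't want all the rows loaded in a single transaction, use SqlBulkCopy.BatchSize.

For example: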

using (var sr = new StreamReader(filePath))
using (CsvReader csv = new CsvReader(sr, false, delimiter, '\"', '\0', '\0', ValueTrimmingOptions.All, 65536))
{
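    // CsvReader implements IDataReader, so rows are streamed rather than buffered.
    // bulkCopy is an already-configured SqlBulkCopy; delimiter is the CSV separator (',' in the question).
    bulkCopy.WriteToServer(csv);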
}
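
The snippet above leaves out how `bulkCopy` is set up. A fuller sketch might look like the following, with several assumptions: `BulkLoadCsv`, `connectionString`, and `destinationTable` are hypothetical names, the batch size of 100000 simply reuses the question's row limit, and the CSV columns are assumed to be in the same order as the destination table's columns, since no column mappings are added. It has not been run against a real database.

```csharp
using System.Data.SqlClient;            // Microsoft.Data.SqlClient exposes the same SqlBulkCopy API
using System.IO;
using LumenWorks.Framework.IO.Csv;

public static class CsvLoader
{
    // Streams a CSV file straight into one SQL Server table without building DataTables.
    public static void BulkLoadCsv(string filePath, string connectionString, string destinationTable)
    {
        using (var sr = new StreamReader(filePath))
        using (var csv = new CsvReader(sr, false, ',', '\"', '\0', '\0', ValueTrimmingOptions.All, 65536))
        using (var bulkCopy = new SqlBulkCopy(connectionString))
        {
            // Same setting as in the question: pad short records with empty fields.
            csv.MissingFieldAction = MissingFieldAction.ReplaceByEmpty;

            bulkCopy.DestinationTableName = destinationTable;
            bulkCopy.BatchSize = 100000;      // rows sent to the server per batch
            bulkCopy.BulkCopyTimeout = 0;     // 0 = no time limit; the default is 30 seconds
            bulkCopy.EnableStreaming = true;  // stream from the IDataReader instead of buffering

            // CsvReader implements IDataReader, so the file is read as it is written.
            bulkCopy.WriteToServer(csv);
        }
    }
}
```

With this shape, memory use should depend on the reader's buffer and the batch in flight rather than on the size of the file, which is the point of dropping the DataTable chunks.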

Answer 1
3 · accepted · 2020-01-03 17:56:28
