
Bulk data insertion in SQL Server table from delimited text file using c#

I have a tab-delimited text file. The file is around 100 MB. I want to store the data from this file in a SQL Server table. The file contains about 1 million records when loaded into SQL Server. What is the best way to achieve this?

I can create an in-memory DataTable in c# and then upload it to SQL Server, but in that case it will load the entire 100 MB file into memory. What if the file gets bigger?

No problem; CsvReader will handle most delimited text formats, and implements IDataReader, so it can be used to feed a SqlBulkCopy. For example:

// CsvReader here is a 3rd-party reader such as the "Fast CSV Reader"
// (LumenWorks.Framework.IO.Csv); SqlBulkCopy lives in System.Data.SqlClient.
using (var file = new StreamReader(path))
using (var csv = new CsvReader(file, true)) // true = first row is headers
using (var bcp = new SqlBulkCopy(connectionString))
{
    bcp.DestinationTableName = "Foo";
    bcp.WriteToServer(csv);
}

Note that CsvReader has lots of options for more subtle file handling (specifying the delimiter rules, etc). SqlBulkCopy is the high-performance bulk-load API - very efficient. This is a streaming reader/writer API; it does not load all the data into memory at once.
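As a hedged sketch (the table and column names below are placeholders, not from the original post), SqlBulkCopy also exposes a few settings worth tuning for a million-row load, such as BatchSize, BulkCopyTimeout, and explicit column mappings when the file's columns don't line up with the table's:

// Sketch only: "Foo", "Name" and "Price" are hypothetical table/column names.
using (var file = new StreamReader(path))
using (var csv = new CsvReader(file, true))
using (var bcp = new SqlBulkCopy(connectionString))
{
    bcp.DestinationTableName = "Foo";
    bcp.BatchSize = 10000;        // commit every 10k rows instead of one huge batch
    bcp.BulkCopyTimeout = 0;      // no timeout for a long-running load
    bcp.ColumnMappings.Add("Name", "Name");   // source column -> destination column
    bcp.ColumnMappings.Add("Price", "Price");
    bcp.WriteToServer(csv);
}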

You should read the file line-by-line, so you don't have to load the whole file into memory:

using (var file = System.IO.File.OpenText(filename))
{
    while (!file.EndOfStream)
    {
        string line = file.ReadLine();

        // TODO: Do your INSERT here
    }
}

* Update *

" This will make 1 million separate insert commands to sql server. Is there any way to make it in bulk " 这将为sql server提供100万个独立的插入命令。有没有办法让它成批量

You could use parameterised queries, which would still issue 1M inserts, but would still be quite fast.
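A minimal sketch of that approach (table and column names here are hypothetical; it assumes System.Data and System.Data.SqlClient): prepare one SqlCommand with parameters and reuse it for every line, inside a transaction so each insert isn't individually committed:

using (var conn = new SqlConnection(connectionString))
{
    conn.Open();
    using (var tran = conn.BeginTransaction())
    using (var cmd = new SqlCommand(
        "INSERT INTO Foo (A, B) VALUES (@a, @b)", conn, tran))
    {
        cmd.Parameters.Add("@a", SqlDbType.NVarChar, 100);
        cmd.Parameters.Add("@b", SqlDbType.NVarChar, 100);

        using (var file = System.IO.File.OpenText(filename))
        {
            while (!file.EndOfStream)
            {
                var parts = file.ReadLine().Split('\t');
                cmd.Parameters["@a"].Value = parts[0];
                cmd.Parameters["@b"].Value = parts[1];
                cmd.ExecuteNonQuery();   // one round-trip per row, but the command is reused
            }
        }
        tran.Commit();
    }
}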

Alternatively, you can use SqlBulkCopy , but that's going to be rather difficult if you don't want to use 3rd party libraries. If you are more amenable to the MS license, you could use the LINQ Entity Data Reader (distributed under the Ms-PL license), which provides the AsDataReader extension method:

void MyInsertMethod()
{
    using (var bulk = new SqlBulkCopy("MyConnectionString"))
    {
        bulk.DestinationTableName = "MyTableName";
        bulk.WriteToServer(GetRows().AsDataReader());
    }
}

class MyType
{
    public string A { get; set; }
    public string B { get; set; }
}

IEnumerable<MyType> GetRows()
{
    using (var file = System.IO.File.OpenText("MyTextFile"))
    {
        while (!file.EndOfStream)
        {
            var splitLine = file.ReadLine().Split(',');

            yield return new MyType() { A = splitLine[0], B = splitLine[1] };
        }
    }
}

If you didn't want to use the MS-licensed code either, you could implement IDataReader yourself, but that is going to be a PITA. Note that the CSV handling above ( Split(',') ) is not at all robust, and also that column names in the table must be the same as the property names on MyType . TBH, I'd recommend you go with Marc's answer on this one.
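If you want parsing more robust than Split(',') without adding a 3rd-party dependency, one option (not from the original answers, just a commonly used built-in) is TextFieldParser from the Microsoft.VisualBasic assembly, which copes with quoted fields and embedded delimiters:

using Microsoft.VisualBasic.FileIO;  // requires a reference to Microsoft.VisualBasic

IEnumerable<MyType> GetRows()
{
    using (var parser = new TextFieldParser("MyTextFile"))
    {
        parser.TextFieldType = FieldType.Delimited;
        parser.SetDelimiters(",");               // or "\t" for tab-delimited input
        parser.HasFieldsEnclosedInQuotes = true; // handles "a, quoted, field"

        while (!parser.EndOfData)
        {
            var fields = parser.ReadFields();
            yield return new MyType { A = fields[0], B = fields[1] };
        }
    }
}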
