
Fast and simple way to import csv to SQL Server

We are importing a CSV file with CSVReader, then using SqlBulkCopy to insert that data into SQL Server. This code works for us and is very simple, but we are wondering if there is a faster method (some of our files have 100,000 rows) that would also not get too complex?

        SqlConnection conn = new SqlConnection(connectionString);
        conn.Open();
        SqlTransaction transaction = conn.BeginTransaction();
        try
        {
            // CsvReader and SqlBulkCopy are both IDisposable, so dispose them as well
            using (TextReader reader = File.OpenText(sourceFileLocation))
            using (CsvReader csv = new CsvReader(reader, true))
            using (SqlBulkCopy copy = new SqlBulkCopy(conn, SqlBulkCopyOptions.KeepIdentity, transaction))
            {
                copy.DestinationTableName = reportType.ToString();
                copy.WriteToServer(csv);   // CsvReader implements IDataReader, so it streams directly
                transaction.Commit();
            }
        }
        catch (Exception ex)
        {
            transaction.Rollback();
            success = false;
            SendFileImportErrorEmail(Path.GetFileName(sourceFileLocation), ex.Message);
        }
        finally
        {
            conn.Close();
        }
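
A few `SqlBulkCopy` settings can make a noticeable difference at this row count. A minimal sketch; the specific values below are illustrative, not tuned for your data:

```csharp
// Illustrative settings only; the right values depend on row size and server load.
using (SqlBulkCopy copy = new SqlBulkCopy(conn,
    SqlBulkCopyOptions.KeepIdentity | SqlBulkCopyOptions.TableLock, transaction))
{
    copy.DestinationTableName = reportType.ToString();
    copy.EnableStreaming = true;  // stream from the IDataReader instead of buffering it (.NET 4.5+)
    copy.BatchSize = 10000;       // send rows to the server in chunks rather than one huge batch
    copy.BulkCopyTimeout = 0;     // no timeout; large files can exceed the 30-second default
    copy.WriteToServer(csv);
}
```

`TableLock` takes a bulk-update lock for the duration of the copy, which is usually the fastest option when nothing else needs to write to the table during the import.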

Instead of building your own tool to do this, have a look at SQL Server Import and Export / SSIS. You can target flat files and SQL Server databases directly. The output dtsx package can also be run from the command line or as a job through the SQL Server Agent.

The reason I am suggesting it is because the wizard is optimized for parallelism and works really well on large flat files.

You should consider using a Table-Valued Parameter (TVP), which is based on a User-Defined Table Type (UDTT). This ability was introduced in SQL Server 2008 and allows you to define a strongly-typed structure that can be used to stream data into SQL Server (if done properly). An advantage of this approach over using SqlBulkCopy is that you can do more than a simple INSERT into a table; you can do any logic that you want (validate / upsert / etc) since the data arrives in the form of a Table Variable. You can deal with all of the import logic in a single Stored Procedure that can easily use local temporary tables if any of the data needs to be staged first. This makes it rather easy to isolate the process such that you can run multiple instances at the same time as long as you have a way to logically separate the rows being imported.
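
A minimal sketch of the streaming `IEnumerable<SqlDataRecord>` pattern this describes; the type name (`dbo.ReportRowList`), procedure name (`dbo.ImportReportRows`), and columns here are all hypothetical, and the metadata must match whatever UDTT you actually create:

```csharp
// Hypothetical names throughout; assumes CREATE TYPE dbo.ReportRowList AS TABLE (...)
// and a proc dbo.ImportReportRows with a READONLY parameter of that type already exist.
private static IEnumerable<SqlDataRecord> StreamRows(CsvReader csv)
{
    // Metadata must match the UDTT's column definitions exactly.
    SqlMetaData[] schema =
    {
        new SqlMetaData("ReportDate",  SqlDbType.Date),
        new SqlMetaData("Amount",      SqlDbType.Decimal, 18, 2),
        new SqlMetaData("Description", SqlDbType.NVarChar, 200)
    };
    while (csv.ReadNextRecord())
    {
        SqlDataRecord record = new SqlDataRecord(schema);
        record.SetDateTime(0, DateTime.Parse(csv[0]));
        record.SetDecimal(1, decimal.Parse(csv[1]));
        record.SetString(2, csv[2]);
        yield return record;   // rows stream to SQL Server one at a time, never fully in memory
    }
}

// Caller:
using (SqlCommand cmd = new SqlCommand("dbo.ImportReportRows", conn))
{
    cmd.CommandType = CommandType.StoredProcedure;
    SqlParameter p = cmd.Parameters.Add("@Rows", SqlDbType.Structured);
    p.TypeName = "dbo.ReportRowList";   // the UDTT name on the server
    p.Value = StreamRows(csv);
    cmd.ExecuteNonQuery();
}
```

Because the parameter value is a lazily-evaluated iterator, the CSV is read and shipped row by row, so memory usage stays flat regardless of file size.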

I posted a detailed answer on this topic here on SO a while ago, including example code and links to other info:

How can I insert 10 million records in the shortest time possible?

There is even a link to a related answer of mine that shows another variation on that theme. I have a third answer somewhere that shows a batched approach if you have millions of rows, which you don't, but as soon as I find it I will add the link here.
