
Fastest way to import text file into database

I have a text file containing records from a dump of a database table in a custom format. Each record has a character at a specific position that identifies the operation to perform on that record:

  • M = Insert or Update
  • D = Delete Record

So if I find a D record in the text file, I need to delete that record from the database. If I find an M record, I need to insert it if it does not exist in the database, or update it if it already exists.

What is the best and fastest way to import such a text file into a database table using the .NET Framework and C#? The file contains about 300,000 records on average.

Thanks

The easiest way is probably to use ADO.NET: create a typed DataTable, load the data into it, set the row state of each row accordingly, and then flush the data via a DataAdapter.

The fastest way is probably to generate a bulk SQL script and execute it. LINQ can save you a lot of time when selecting the data (you can probably transform it on the fly).

There are also platform-specific solutions that should be considered. See here for a bulk insert for SQL Server:

http://dotnetslackers.com/Community/blogs/xun/archive/2008/04/15/sql-bulk-insert-and-ado-net-sqlbulkcopy.aspx

Why not parse the text, generate the insert, update, and delete statements, and then just run the generated script?

There is no easy way to do this: you'll have to parse the text no matter what, to determine which SQL statement you need to run. You'll have to decide for yourself whether each record is an update or an insert. Hopefully you can batch the statements, because hitting the database every time you encounter an "M" isn't going to be a good idea.
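As a rough illustration of the parse-then-batch idea, here is a minimal Python sketch (the same shape carries over to C#). The record layout is purely an assumption, since the question does not give it: the operation character is taken to sit at position 0, followed by a hypothetical `key|value` payload. `OP_POS`, `parse_line`, and `batches` are illustrative names, not part of any library.

```python
from itertools import islice

OP_POS = 0  # assumption: position of the operation character in each line


def parse_line(line):
    """Split one line into (operation, key, value) under the assumed layout."""
    line = line.rstrip("\n")
    op = line[OP_POS]
    if op not in ("M", "D"):
        raise ValueError(f"unknown operation {op!r}")
    # Assumption: everything after the operation character is "key|value".
    key, _, value = line[OP_POS + 1:].partition("|")
    return op, key, value


def batches(records, size):
    """Yield lists of at most `size` records, so one round trip carries many rows."""
    it = iter(records)
    while True:
        chunk = list(islice(it, size))
        if not chunk:
            return
        yield chunk


lines = ["M1001|alice", "M1002|bob", "D1003|", "M1004|carol", "D1001|"]
parsed = [parse_line(line) for line in lines]
chunks = list(batches(parsed, 2))
```

Each chunk can then be sent to the database in a single call rather than one statement per record. Keeping the records in file order matters if the same key can appear more than once.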

If you are using SQL Server, you could take advantage of the Bulk Insert functionality. That should be the fastest way to insert data from a file into the database. The first thing I would do is insert the data in your file into a "landing table" (i.e. a table whose structure matches the structure of your file). Also note: .NET 2.0 introduced SqlBulkCopy, which would be similarly useful if you already have the data in memory or are reading it with some kind of DataReader.
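The landing-table step might look roughly like the sketch below. It uses Python's built-in `sqlite3` module only so that it runs anywhere; on SQL Server the load itself would be done with BULK INSERT or SqlBulkCopy instead of `executemany`. The column names and the `M1001|alice` line layout are assumptions made for illustration.

```python
import sqlite3


def load_landing_table(conn, lines):
    """Copy raw records into a landing table whose columns mirror the file."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS LandingTable ("
        "RecordType TEXT, KeyField1 TEXT, Field1 TEXT)"
    )
    rows = []
    for line in lines:
        line = line.rstrip("\n")
        # Assumed layout: operation character first, then "key|value".
        key, _, field1 = line[1:].partition("|")
        rows.append((line[0], key, field1))
    # One transaction and one executemany call for the whole file.
    with conn:
        conn.executemany("INSERT INTO LandingTable VALUES (?, ?, ?)", rows)
    return len(rows)


conn = sqlite3.connect(":memory:")
loaded = load_landing_table(conn, ["M1001|alice", "D1002|", "M1003|carol"])
counts = dict(
    conn.execute(
        "SELECT RecordType, COUNT(*) FROM LandingTable GROUP BY RecordType"
    )
)
```

No decision about insert versus update is made at this stage. The landing table just holds the file contents, including the operation character, so that the merge can be done with set-based SQL afterwards.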

Once the contents of your file are inserted into your landing table, you can then execute a series of SQL statements to merge your landing table into your target table(s). Below is an example implementation of those SQL statements (Disclaimer: I did not check these for correctness):

DELETE FROM MyTable
WHERE EXISTS (
    SELECT 1
    FROM LandingTable
    WHERE
        LandingTable.RecordType = 'D'
        AND LandingTable.KeyField1 = MyTable.KeyField1
        AND LandingTable.KeyField2 = MyTable.KeyField2
)


UPDATE MyTable SET
    MyTable.Field1 = LandingTable.Field1,
    MyTable.Field2 = LandingTable.Field2,
    -- ...
FROM MyTable
INNER JOIN LandingTable ON
    LandingTable.KeyField1 = MyTable.KeyField1
    AND LandingTable.KeyField2 = MyTable.KeyField2
where
    LandingTable.RecordType = 'U'

INSERT INTO MyTable (
    Field1,
    Field2,
    -- ...
)
SELECT
    LandingTable.Field1,
    LandingTable.Field2,
    -- ...
FROM LandingTable
WHERE
    LandingTable.RecordType = 'I'

-- Consider how to handle "Insert" records where there is already a record in MyTable with the same key
-- Possible solution below: treat as an "Update"
UPDATE MyTable SET
    MyTable.Field1 = LandingTable.Field1,
    MyTable.Field2 = LandingTable.Field2,
    -- ...
FROM MyTable
INNER JOIN LandingTable ON
    LandingTable.KeyField1 = MyTable.KeyField1
    AND LandingTable.KeyField2 = MyTable.KeyField2
where
    LandingTable.RecordType = 'I'

-- Now only insert records from LandingTable where there is no corresponding record in MyTable with the same key (determined with a left outer join)
INSERT INTO MyTable (
    Field1,
    Field2,
    -- ...
)
SELECT
    LandingTable.Field1,
    LandingTable.Field2,
    -- ...
FROM LandingTable
LEFT OUTER JOIN MyTable ON
    MyTable.KeyField1 = LandingTable.KeyField1
    AND MyTable.KeyField2 = LandingTable.KeyField2
WHERE
    LandingTable.RecordType = 'I'
    and MyTable.KeyField1 is null
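
For the file format in the question, where a single 'M' type means insert-or-update, the merge could be sketched end to end as follows. This is a runnable approximation using Python's `sqlite3`, not SQL Server syntax: the UPDATE uses correlated subqueries instead of `UPDATE ... FROM`, and the schema (one key column, one data column) is a simplifying assumption. It also assumes each key appears at most once in the landing table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE MyTable (KeyField1 TEXT PRIMARY KEY, Field1 TEXT);
CREATE TABLE LandingTable (RecordType TEXT, KeyField1 TEXT, Field1 TEXT);
INSERT INTO MyTable VALUES ('1001', 'old'), ('1002', 'gone');
INSERT INTO LandingTable VALUES
    ('M', '1001', 'new'),
    ('D', '1002', NULL),
    ('M', '1003', 'added');
""")

with conn:  # run the three merge steps in one transaction
    # 1. Remove rows flagged 'D'.
    conn.execute("""
        DELETE FROM MyTable
        WHERE EXISTS (
            SELECT 1 FROM LandingTable L
            WHERE L.RecordType = 'D' AND L.KeyField1 = MyTable.KeyField1)
    """)
    # 2. Update rows flagged 'M' that already exist in the target.
    conn.execute("""
        UPDATE MyTable
        SET Field1 = (
            SELECT L.Field1 FROM LandingTable L
            WHERE L.RecordType = 'M' AND L.KeyField1 = MyTable.KeyField1)
        WHERE EXISTS (
            SELECT 1 FROM LandingTable L
            WHERE L.RecordType = 'M' AND L.KeyField1 = MyTable.KeyField1)
    """)
    # 3. Insert rows flagged 'M' that have no match in the target.
    conn.execute("""
        INSERT INTO MyTable (KeyField1, Field1)
        SELECT L.KeyField1, L.Field1 FROM LandingTable L
        WHERE L.RecordType = 'M'
          AND NOT EXISTS (
              SELECT 1 FROM MyTable T WHERE T.KeyField1 = L.KeyField1)
    """)

result = conn.execute(
    "SELECT KeyField1, Field1 FROM MyTable ORDER BY KeyField1"
).fetchall()
```

Running the update before the insert means each 'M' row takes exactly one of the two paths, which matches the "treat as an update, then insert the rest" approach above.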

Links that I found after doing a quick search:

Insert into a temp table, then join against it to UPDATE or DELETE.


 