Slow Batch/Bulk Insert in SQLite

I'm trying to import data from a CSV file into a SQLite table. My test data is only about 8 MB (50,000 rows) and takes about 15 seconds. However, the production data is almost 400 MB and takes forever (at least 30 minutes; I gave up waiting).

After much research, I discovered the need to do the inserts in a single transaction (that got me to the 15-second import, great advice! :) ), so that's not the problem (AFAIK).

I'm also using "ExecuteNonQuery() on a parameterized INSERT statement" as per this Robert Simpson post, along with numerous variations.

I was just using TextReader.ReadLine() and String.Split('\t'). Then I read somewhere about ReadLine() being slow due to the number of disk reads, so I looked into reading from a BufferedStream and came across this CSV reader. But there was still no noticeable change in performance.

So I commented out the guts of my insert loop, and the read happens almost instantly, so I am sure the problem is in my inserting. I've tried numerous variations of creating the parameterised queries plus a single transaction, but all with near-identical results.

Here's the regular version of my code. Thanks in advance, this is driving me nuts! I'm about to try importing into a DataSet and inserting that...

using (TextReader tr = File.OpenText(cFile))
{                       
    using (SQLiteConnection cnn = new SQLiteConnection(connectionString))
    {
        string line;
        string insertCommand = "INSERT INTO ImportTable VALUES (@P0,@P1,@P2,@P3,@P4)";

        cnn.Open();
        SQLiteCommand cmd = new SQLiteCommand("begin", cnn);
        cmd.ExecuteNonQuery();

        cmd.CommandText = insertCommand;

        while ((line = tr.ReadLine()) != null)
        {
            string[] items = line.Split('\t');

            cmd.Parameters.AddWithValue("@P0", items[0]);
            cmd.Parameters.AddWithValue("@P1", items[1]);
            cmd.Parameters.AddWithValue("@P2", items[2]);
            cmd.Parameters.AddWithValue("@P3", items[3]);
            cmd.Parameters.AddWithValue("@P4", items[4]);
            cmd.ExecuteNonQuery();
        }
        cmd.CommandText = "end";
        cmd.ExecuteNonQuery(); 
    }              
}
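For comparison, here is a sketch of one of the "variations" mentioned above. It assumes the same System.Data.SQLite types and the same five-column ImportTable as the code above; the class and method names (BulkImport, ImportTabSeparated) are made up for illustration. The only differences are that the five parameters are created once, outside the loop, instead of calling AddWithValue for every row (which appears to append five more parameters to the command on each pass), and that the transaction is an explicit SQLiteTransaction. It is not presented as the fix for the slowdown described here.

```csharp
using System.Data;
using System.Data.SQLite;
using System.IO;

public static class BulkImport
{
    // Inserts tab-separated lines from `reader` into ImportTable inside one
    // transaction and returns the number of rows inserted.
    // Assumes `cnn` is already open and every line has at least five fields
    // (a shorter line throws IndexOutOfRangeException, as in the original loop).
    public static int ImportTabSeparated(SQLiteConnection cnn, TextReader reader)
    {
        int rows = 0;

        using (SQLiteTransaction tx = cnn.BeginTransaction())
        using (SQLiteCommand cmd = new SQLiteCommand(
            "INSERT INTO ImportTable VALUES (@P0,@P1,@P2,@P3,@P4)", cnn, tx))
        {
            // Create the five parameters once; only their values change per row.
            SQLiteParameter[] p = new SQLiteParameter[5];
            for (int i = 0; i < p.Length; i++)
            {
                p[i] = cmd.Parameters.Add("@P" + i, DbType.String);
            }

            string line;
            while ((line = reader.ReadLine()) != null)
            {
                string[] items = line.Split('\t');
                for (int i = 0; i < p.Length; i++)
                {
                    p[i].Value = items[i];
                }
                cmd.ExecuteNonQuery();
                rows++;
            }

            // Nothing is written permanently until the commit succeeds.
            tx.Commit();
        }

        return rows;
    }
}
```

It would be called with an open connection and the same reader as before, e.g. BulkImport.ImportTabSeparated(cnn, File.OpenText(cFile)).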

Update: I just tried using the insert with the parameters (just hard-coded some values), and it took less than 5 seconds... still not as fast as the articles I've seen...

Also, I'm running a Core2 Duo (3 GHz) with 2 GB of RAM on XP.

So I think I've worked out the problem, or at least found a solution.

Since I'd exhausted all my code options (and it didn't look like anybody had an answer to, or a problem with, my code), I decided the problem might lie within the database itself...

I had created my database and tables entirely within the SQLite Manager Firefox plugin.

So I recreated everything from the command shell, and BOOM! My import dropped down to just a few seconds!

I knew there was a problem with it being unable to handle 64-bit integers (but I just used TEXT datatypes). Perhaps there is a problem with SQLite Manager using a different SQLite engine from the .NET version? I don't know.
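One way to look for such a difference would be to open the slow file and the fast file in turn and compare their settings and schema. The sketch below is only a diagnostic idea under that assumption, not something confirmed to show the cause; DbInspector and its methods are hypothetical names, while the PRAGMAs and sqlite_master are standard SQLite.

```csharp
using System;
using System.Collections.Generic;
using System.Data.SQLite;

public static class DbInspector
{
    // Runs a statement that yields a single value and returns it as text.
    private static string Scalar(SQLiteConnection cnn, string sql)
    {
        using (SQLiteCommand cmd = new SQLiteCommand(sql, cnn))
        {
            return Convert.ToString(cmd.ExecuteScalar());
        }
    }

    // Collects a few per-file settings from an open connection.
    // Note: sqlite_version() reports the engine loaded by the .NET provider,
    // not the engine of the tool that originally created the file.
    public static Dictionary<string, string> Describe(SQLiteConnection cnn)
    {
        var info = new Dictionary<string, string>();
        info["sqlite_version"] = Scalar(cnn, "SELECT sqlite_version()");
        foreach (string name in new[] { "page_size", "encoding", "journal_mode", "auto_vacuum" })
        {
            info[name] = Scalar(cnn, "PRAGMA " + name);
        }
        return info;
    }

    // Returns the CREATE statements stored in the file (tables, indexes, triggers),
    // so that extra indexes or different column declarations become visible.
    public static List<string> Schema(SQLiteConnection cnn)
    {
        var statements = new List<string>();
        using (var cmd = new SQLiteCommand(
            "SELECT sql FROM sqlite_master WHERE sql IS NOT NULL", cnn))
        using (SQLiteDataReader reader = cmd.ExecuteReader())
        {
            while (reader.Read())
            {
                statements.Add(reader.GetString(0));
            }
        }
        return statements;
    }
}
```

Running Describe and Schema against both database files and comparing the output line by line would show whether the two tools produced the same kind of file.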

My next step might be to actually create the database and tables from within my application, instead of having them prepared in advance... But I'm fairly satisfied with the performance now, so that's not a priority.
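A minimal sketch of that next step, assuming System.Data.SQLite and five TEXT columns. The column names Col0 to Col4 and the SchemaSetup class are placeholders, since the real table definition isn't shown here.

```csharp
using System.Data.SQLite;

public static class SchemaSetup
{
    // Creates ImportTable on an open connection if it does not exist yet.
    // Column names are placeholders; TEXT matches the datatype mentioned above.
    public static void EnsureImportTable(SQLiteConnection cnn)
    {
        const string ddl =
            "CREATE TABLE IF NOT EXISTS ImportTable (" +
            "Col0 TEXT, Col1 TEXT, Col2 TEXT, Col3 TEXT, Col4 TEXT)";

        using (SQLiteCommand cmd = new SQLiteCommand(ddl, cnn))
        {
            cmd.ExecuteNonQuery();
        }
    }
}
```

Calling SchemaSetup.EnsureImportTable(cnn) right after cnn.Open() and before the import would mean the file and table are always produced by the same engine that does the inserts. As far as I know, opening a connection to a file that does not exist creates it by default, so no separate file-creation step should be needed.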
