
How to speed up LINQ inserts with SQL CE?

History

I have a list of "records" (3,500) which I save to XML and compress on exit of the program. Since:

  • the number of records keeps increasing
  • only around 50 records need to be updated on exit
  • saving takes about 3 seconds

I needed another solution -- an embedded database. I chose SQL CE because it works with VS without any problems and the license is OK for me (I compared it to Firebird, SQLite, EffiProz, db4o and BerkeleyDB).

The data

The record structure: 11 fields, 2 of which make up the primary key (nvarchar + byte). The other fields are bytes, datetimes, doubles and ints.

I don't use any relations, joins, indices (except for the primary key), triggers, views, and so on. It is actually a flat Dictionary -- pairs of Key+Value. I modify some of them, and then I have to update them in the database. From time to time I add some new "records" and I need to store (insert) them. That's all.

LINQ approach

I have a blank database (file), so I make 3500 inserts in a loop (one by one). I don't even check whether the record already exists, because the db is blank.

Execution time? 4 minutes, 52 seconds. I fainted (mind you: XML + compress = 3 seconds).

SQL CE raw approach

I googled a bit, and despite claims such as the one in "LINQ to SQL (CE) speed versus SqlCe", which states that SQL CE itself is at fault, I gave it a try.

The same loop, but this time the inserts are made with SqlCeResultSet (TableDirect mode, see: "Bulk Insert In SQL Server CE") and SqlCeUpdatableRecord.

The outcome? Are you sitting comfortably? Well... 0.3 seconds (yes, a fraction of a second).

The problem

LINQ is very readable, and raw operations are quite the contrary. I could write a mapper which translates all column indexes to meaningful names, but it seems like reinventing the wheel -- after all, it is already done in... LINQ.

So maybe there is a way to tell LINQ to speed things up? QUESTION -- how to do it?

The code

LINQ

foreach (var entry in dict.Entries.Where(it => it.AlteredByLearning))
{
    // Build a new row for each altered entry.
    var record = new PrimLibrary.Database.Progress();
    record.Text = entry.Text;
    record.Direction = (byte)entry.dir;
    db.Progress.InsertOnSubmit(record);

    record.Status = (byte)entry.LastLearningInfo.status.Value;
    // ... and so on

    // Note: this is called once per record, inside the loop.
    db.SubmitChanges();
}

Raw operations

SqlCeCommand cmd = conn.CreateCommand();

cmd.CommandText = "Progress";
cmd.CommandType = System.Data.CommandType.TableDirect;
SqlCeResultSet rs = cmd.ExecuteResultSet(ResultSetOptions.Updatable);

foreach (var entry in dict.Entries.Where(it => it.AlteredByLearning))
{
    SqlCeUpdatableRecord record = rs.CreateRecord();

    // Columns are addressed by position, in table order.
    int col = 0;
    record.SetString(col++, entry.Text);
    record.SetByte(col++, (byte)entry.dir);
    record.SetByte(col++, (byte)entry.LastLearningInfo.status.Value);
    // ... and so on

    rs.Insert(record);
}
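As a side note on the readability concern, the raw loop does not have to rely on positional indexes: the ordinals can be looked up by column name once, before the loop. The following is only a sketch. ProgressRow and RawInsertSketch are hypothetical names, the column names "Text", "Direction" and "Status" are assumed to match the LINQ entity's properties, and the connection is assumed to be open already.

```csharp
using System.Collections.Generic;
using System.Data;
using System.Data.SqlServerCe;

// Hypothetical plain row type; only three of the eleven fields are shown.
public class ProgressRow
{
    public string Text;
    public byte Direction;
    public byte Status;
}

public static class RawInsertSketch
{
    // Inserts all rows through an updatable TableDirect result set.
    // Assumes conn is already open.
    public static void InsertAll(SqlCeConnection conn, IEnumerable<ProgressRow> rows)
    {
        using (SqlCeCommand cmd = conn.CreateCommand())
        {
            cmd.CommandText = "Progress";
            cmd.CommandType = CommandType.TableDirect;

            using (SqlCeResultSet rs = cmd.ExecuteResultSet(ResultSetOptions.Updatable))
            {
                // Resolve column names to ordinals once, before the loop.
                // The column names are an assumption, not taken from the schema.
                int textCol = rs.GetOrdinal("Text");
                int directionCol = rs.GetOrdinal("Direction");
                int statusCol = rs.GetOrdinal("Status");

                foreach (ProgressRow row in rows)
                {
                    SqlCeUpdatableRecord record = rs.CreateRecord();
                    record.SetString(textCol, row.Text);
                    record.SetByte(directionCol, row.Direction);
                    record.SetByte(statusCol, row.Status);
                    rs.Insert(record);
                }
            }
        }
    }
}
```

This keeps the speed of the raw approach while making the code independent of column order. It is still hand-written mapping, which is exactly the duplication of LINQ's work that the question complains about.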

Do more work per transaction.

Commits are generally very expensive operations for a typical relational database, as the database must wait for disk flushes to ensure data is not lost (ACID guarantees and all that). Conventional HDD IO without specialty controllers is very slow at this sort of operation: the data must be flushed to the physical disk, so perhaps only 30-60 commits per second are possible when each one requires an IO sync!

See the SQLite FAQ: "INSERT is really slow - I can only do few dozen INSERTs per second". Ignoring the different database engine, this is the exact same issue.

Normally, LINQ2SQL creates a new implicit transaction inside SubmitChanges. To avoid this implicit transaction/commit (commits are expensive operations), either:

  1. Call SubmitChanges less often (say, once outside the loop), or

  2. Set up an explicit transaction scope (see TransactionScope).
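Option 1 might look like the sketch below. It assumes the .NET Framework System.Data.Linq assembly. The Progress entity here is a hypothetical, cut-down stand-in for the question's PrimLibrary.Database.Progress class, and ProgressWriter and InsertAll are illustrative names, not part of any library.

```csharp
using System.Collections.Generic;
using System.Data.Linq;
using System.Data.Linq.Mapping;

// Hypothetical entity mirroring three of the question's eleven fields.
[Table(Name = "Progress")]
public class Progress
{
    [Column(IsPrimaryKey = true)] public string Text { get; set; }
    [Column(IsPrimaryKey = true)] public byte Direction { get; set; }
    [Column] public byte Status { get; set; }
}

public static class ProgressWriter
{
    // Queues every record, then submits once: one implicit transaction
    // and one commit for the whole set, instead of one per record.
    public static void InsertAll(DataContext db, IEnumerable<Progress> records)
    {
        Table<Progress> table = db.GetTable<Progress>();
        foreach (Progress record in records)
        {
            table.InsertOnSubmit(record);   // only tracked in memory at this point
        }
        db.SubmitChanges();                 // single submit after the loop
    }
}
```

In the question's code, the equivalent change is simply moving db.SubmitChanges() to after the closing brace of the foreach loop.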

One example of using a larger transaction context is:

using (var ts = new TransactionScope()) {
  // LINQ2SQL will automatically enlist in the transaction scope.
  // SubmitChanges now will NOT create a new transaction/commit each time.
  DoImportStuffThatRunsWithinASingleTransaction();
  // Important: Make sure to COMMIT the transaction.
  // (The transaction used for SubmitChanges is committed to the DB.)
  // This is when the disk sync actually has to happen,
  // but it only happens once, not 3500 times!
  ts.Complete();
}

However, the semantics of an approach using a single transaction or a single call to SubmitChanges are different from those of the question's code, which calls SubmitChanges 3500 times and creates 3500 different implicit transactions. In particular, the size of the atomic operations (with respect to the database) is different and may not be suitable for all tasks.
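If one all-or-nothing transaction is too coarse, a possible middle ground is to submit in batches, so that a failure loses at most one batch. The helper below is a sketch only. BatchingSketch and InBatches are hypothetical names, and the right batch size depends on the workload.

```csharp
using System;
using System.Collections.Generic;

public static class BatchingSketch
{
    // Splits a sequence into consecutive lists of at most `size` items.
    // Because this is an iterator, the argument check runs on first enumeration.
    public static IEnumerable<List<T>> InBatches<T>(IEnumerable<T> source, int size)
    {
        if (size <= 0) throw new ArgumentOutOfRangeException(nameof(size));

        var batch = new List<T>(size);
        foreach (T item in source)
        {
            batch.Add(item);
            if (batch.Count == size)
            {
                yield return batch;
                batch = new List<T>(size);
            }
        }

        // Emit the final, possibly shorter, batch.
        if (batch.Count > 0) yield return batch;
    }
}
```

With LINQ2SQL, each batch could then be queued with InsertAllOnSubmit on the table and followed by one SubmitChanges call, which gives one commit per batch rather than one per record.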

For LINQ2SQL updates, changing the optimistic concurrency model (disabling it or just using a timestamp field, for instance) may result in small performance improvements. The biggest improvement, however, will come from reducing the number of commits that must be performed.

Happy coding.

I'm not positive on this, but it seems like the db.SubmitChanges() call should be made outside of the loop. Maybe that would speed things up?
