
Efficient update of SQLite table with many records

I am trying to use SQLite (sqlite3) for a project that stores hundreds of thousands of records. I would like to use SQLite so that users of the program don't have to run a [My]SQL server.

Sometimes I have to update hundreds of thousands of records to enter left and right values (the records are hierarchical), but I have found the standard

update table set left_value = 4, right_value = 5 where id = 12340;

to be very slow. I have tried surrounding every thousand or so updates with

begin;
....
update...
update table set left_value = 4, right_value = 5 where id = 12340;
update...
....
commit;

but again, it is very slow. This is odd, because when I populate the table with a few hundred thousand records (using inserts), it finishes in seconds.

I am currently testing the speed in Python (the slowness shows up both at the command line and in Python) before I move this to the C++ implementation, but right now it is way too slow, and I need to find a new solution unless I am doing something wrong. Thoughts? (I would also accept an open-source alternative to SQLite that is portable.)

Create an index on table.id:

create index table_id_index on table(id)
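To check whether the index is actually used by the update, one option is to compare the output of EXPLAIN QUERY PLAN before and after creating it. The following is a minimal sketch using Python's built-in sqlite3 module; the table name `nodes`, the index name `nodes_id_index`, and the helper `query_plan` are assumptions made for illustration, and `id` is assumed to be a plain INTEGER column rather than an INTEGER PRIMARY KEY (which would already be looked up efficiently without an extra index).

```python
import sqlite3


def query_plan(conn, sql, params=()):
    """Return the 'detail' text of EXPLAIN QUERY PLAN for a statement.

    Hypothetical helper for this example; the detail text is the last
    column of each row returned by EXPLAIN QUERY PLAN.
    """
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return " | ".join(str(row[-1]) for row in rows)


conn = sqlite3.connect(":memory:")
# Assumed schema: id is NOT declared as a primary key here.
conn.execute(
    "CREATE TABLE nodes (id INTEGER, left_value INTEGER, right_value INTEGER)"
)

update_sql = "UPDATE nodes SET left_value = ?, right_value = ? WHERE id = ?"

# Without an index, SQLite has to scan the whole table for each update.
plan_before = query_plan(conn, update_sql, (4, 5, 12340))

conn.execute("CREATE INDEX nodes_id_index ON nodes(id)")

# With the index, the row is located through nodes_id_index.
plan_after = query_plan(conn, update_sql, (4, 5, 12340))

print("before:", plan_before)
print("after: ", plan_after)
```

The exact wording of the plan text varies between SQLite versions, so it is best read by eye rather than parsed.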

Other than making sure you have an index in place, you can check out the SQLite Optimization FAQ.

As you mentioned, using transactions can give you a very big speed increase, and you can also try turning off journaling.
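As a rough sketch of the transaction approach from Python, all updates can be sent through one parameterized statement inside a single transaction. The table name `nodes`, the function `update_left_right`, and the sample left/right values are assumptions for illustration; `id` is declared INTEGER PRIMARY KEY here, so lookups by `id` need no separate index.

```python
import sqlite3


def update_left_right(conn, rows):
    """Apply many (left_value, right_value, id) updates in one transaction.

    Hypothetical helper for this example. Using the connection as a
    context manager commits on success and rolls back on an exception.
    """
    with conn:
        conn.executemany(
            "UPDATE nodes SET left_value = ?, right_value = ? WHERE id = ?",
            rows,
        )


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE nodes ("
    "id INTEGER PRIMARY KEY, left_value INTEGER, right_value INTEGER)"
)

# Populate some rows to update (made-up data).
with conn:
    conn.executemany(
        "INSERT INTO nodes (id) VALUES (?)", ((i,) for i in range(1, 1001))
    )

# Made-up left/right values: row i gets (2*i - 1, 2*i).
update_left_right(conn, ((2 * i - 1, 2 * i, i) for i in range(1, 1001)))

updated = conn.execute(
    "SELECT left_value, right_value FROM nodes WHERE id = 500"
).fetchone()
print(updated)  # (999, 1000)
```

Reusing one prepared statement with bound parameters also avoids re-parsing the SQL text for every row.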

Example 1:

2.2 PRAGMA synchronous

The Boolean synchronous value controls whether or not the library will wait for disk writes to be fully written to disk before continuing. This setting can be different from the default_synchronous value loaded from the database. In typical use the library may spend a lot of time just waiting on the file system. Setting "PRAGMA synchronous=OFF" can make a major speed difference.
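A minimal sketch of applying this setting from Python is shown below. The temporary database file name `demo.db` is an assumption for illustration. The pragma is set per connection, so it has to be issued each time the database is opened, and with synchronous off a power loss or operating system crash during a write can leave the database corrupted.

```python
import os
import sqlite3
import tempfile

# Hypothetical on-disk database in a temporary directory.
path = os.path.join(tempfile.mkdtemp(), "demo.db")
conn = sqlite3.connect(path)

# Read the current setting; it is reported as an integer.
default_mode = conn.execute("PRAGMA synchronous").fetchone()[0]

# Stop waiting for writes to reach the disk (0 means OFF).
# This must run outside a transaction, i.e. before any pending writes.
conn.execute("PRAGMA synchronous = OFF")
current_mode = conn.execute("PRAGMA synchronous").fetchone()[0]

print("before:", default_mode, "after:", current_mode)
conn.close()
```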

Example 2:

2.3 PRAGMA count_changes

When the count_changes setting is ON, the callback function is invoked once for each DELETE, INSERT, or UPDATE operation. The argument is the number of rows that were changed. If you don't use this feature, there is a small speed increase from turning this off.
