Sqlite executemany and DELETE

executemany seems to be very slow for deletions (insertion is fine), and I was wondering if anyone knows why it takes so long.

Consider the code below:

import sqlite3

db = sqlite3.connect("mydb")
c = db.cursor()
c.execute("DROP TABLE IF EXISTS testing ")
c.execute("CREATE TABLE testing (val INTEGER);")
my_vals2 = [[x] for x in range(1,10000)]

def insertmany(vals):
    c.executemany("INSERT INTO testing (val) VALUES (?)",vals)
    db.commit()

def deletemany1(vals):
    c.executemany("DELETE FROM testing WHERE val=?",vals)
    db.commit()

def deletemany2(vals): # this is the fastest, even though I loop once to convert to strings and again to join
    vals = ["'%s'"%v[0] for v in vals]
    c.execute("DELETE FROM testing WHERE val IN (%s)"%",".join(vals))
    # builds: DELETE FROM testing WHERE val IN ('1','2','3',...)

And here are the timing results from IPython (timeit was giving funny data, so I used %time):

%time insertmany(my_vals2) 
#CPU times: user 0.60 s, sys: 0.00 s, total: 0.60 s Wall time: 0.60 s

%time deletemany1(my_vals2)
#CPU times: user 3.58 s, sys: 0.00 s, total: 3.58 s Wall time: 3.58 s

%time deletemany2(my_vals2)
#CPU times: user 0.02 s, sys: 0.00 s, total: 0.02 s Wall time: 0.02 s

And just for the sake of completeness, here are the timeit results (but I think timeit is broken on the second test, or the ms is a different unit than in the first test):

%timeit insertmany(my_vals2) 
#1 loops, best of 3: 358 ms per loop

%timeit deletemany1(my_vals2)
#1 loops, best of 3: 8.34 ms per loop  <- this is not faster than the above!!!! (timeit lies?)

%timeit deletemany2(my_vals2)
#100 loops, best of 3: 2.3 ms per loop  

So why is executemany so slow with deletes?

I'm just taking a punt: because it has to search exhaustively for the rows to delete. Try it with an index and report back.

CREATE INDEX foo ON testing (val)

http://sqlite.org/lang_createindex.html
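One way to check this guess without relying on timings is to ask SQLite for its query plan before and after creating the index. The following is a minimal sketch, not the original poster's code: the in-memory database and the `plan` helper are assumptions made for illustration, and the exact wording of the plan text varies between SQLite versions.

```python
import sqlite3

# Assumption: an in-memory database stands in for "mydb".
db = sqlite3.connect(":memory:")
c = db.cursor()
c.execute("CREATE TABLE testing (val INTEGER)")
c.executemany("INSERT INTO testing (val) VALUES (?)",
              [(x,) for x in range(1, 10000)])

def plan(sql):
    # Hypothetical helper: the last column of each EXPLAIN QUERY PLAN row
    # is a human-readable description of one step of the plan.
    rows = c.execute("EXPLAIN QUERY PLAN " + sql, (1,)).fetchall()
    return " ".join(row[-1] for row in rows)

query = "DELETE FROM testing WHERE val=?"
plan_before = plan(query)            # expected to describe a full table scan
c.execute("CREATE INDEX foo ON testing (val)")
plan_after = plan(query)             # expected to mention the index foo

print("without index:", plan_before)
print("with index:   ", plan_after)
```

If the plan changes from a scan to a search using foo, each of the 9999 statements issued by executemany becomes an index lookup instead of a pass over the whole table.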

SQLite stores table records in a B+ tree, sorted by rowid.

When you are inserting with an automatically generated rowid, all records are just appended at the end of the table. However, when deleting, SQLite has to search for the record first. This is slow if the id column (val in your table) is not indexed; either create an explicit index (as proposed by John), or declare the column as INTEGER PRIMARY KEY to make it the rowid.

Inserting into an indexed table is faster if the index does not exist during the inserts, i.e., if you create the index only after the bulk inserts.

Your last delete command deletes all records at once. If you know that you're deleting all records in the table, you could speed it up even further by using just DELETE FROM testing, which doesn't need to look at any records at all.
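The INTEGER PRIMARY KEY variant could look like the sketch below. It assumes the values in val are unique (true for the range used in the question, but a requirement this declaration adds), and it uses an in-memory database for illustration.

```python
import sqlite3

# Assumption: an in-memory database stands in for "mydb".
db = sqlite3.connect(":memory:")
c = db.cursor()

# Declaring val as INTEGER PRIMARY KEY makes it an alias for the rowid,
# so a lookup by val goes straight through the table's own B-tree.
# This requires the values to be unique.
c.execute("CREATE TABLE testing (val INTEGER PRIMARY KEY)")

vals = [(x,) for x in range(1, 10000)]
c.executemany("INSERT INTO testing (val) VALUES (?)", vals)
db.commit()

# Same per-row delete as deletemany1 in the question.
c.executemany("DELETE FROM testing WHERE val=?", vals)
db.commit()

remaining = c.execute("SELECT COUNT(*) FROM testing").fetchone()[0]
print("rows remaining:", remaining)
```

No separate index is needed in this form, because the table itself is ordered by the column being searched.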
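A rough way to compare the two orderings is sketched below. The `load` function and its parameters are hypothetical names introduced here, and the database is in-memory. With sequential values like these the difference may be small, so treat the printed timings as indicative only.

```python
import sqlite3
import time

def load(index_first, n=10000):
    # Hypothetical helper: bulk-insert n-1 rows, creating the index either
    # before or after the inserts, and return (elapsed seconds, row count).
    db = sqlite3.connect(":memory:")
    c = db.cursor()
    c.execute("CREATE TABLE testing (val INTEGER)")
    if index_first:
        c.execute("CREATE INDEX foo ON testing (val)")
    rows = [(x,) for x in range(1, n)]
    start = time.perf_counter()
    c.executemany("INSERT INTO testing (val) VALUES (?)", rows)
    if not index_first:
        # The index is built once, over all rows, after the bulk insert.
        c.execute("CREATE INDEX foo ON testing (val)")
    db.commit()
    elapsed = time.perf_counter() - start
    count = c.execute("SELECT COUNT(*) FROM testing").fetchone()[0]
    db.close()
    return elapsed, count

for index_first in (True, False):
    elapsed, count = load(index_first)
    print(f"index_first={index_first}: {count} rows in {elapsed:.4f} s")
```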
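For the case where every row is meant to go, the unconditional form could be sketched as follows. As before, the in-memory database is an assumption for illustration.

```python
import sqlite3

# Assumption: an in-memory database stands in for "mydb".
db = sqlite3.connect(":memory:")
c = db.cursor()
c.execute("CREATE TABLE testing (val INTEGER)")
c.executemany("INSERT INTO testing (val) VALUES (?)",
              [(x,) for x in range(1, 10000)])
db.commit()

before = c.execute("SELECT COUNT(*) FROM testing").fetchone()[0]

# No WHERE clause: SQLite does not have to match individual rows.
c.execute("DELETE FROM testing")
db.commit()

after = c.execute("SELECT COUNT(*) FROM testing").fetchone()[0]
print(before, "rows before,", after, "rows after")
```

This only applies when the whole table should be emptied; deleting a subset still needs a WHERE clause and benefits from the index or rowid approaches above.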
