How do I speed up my sqlite3 queries in Python?
I have an sqlite table with a few hundred million rows:
sqlite> create table t1(id INTEGER PRIMARY KEY,stuff TEXT );
I need to query this table by its integer primary key hundreds of millions of times. My code:
import sqlite3

conn = sqlite3.connect('stuff.db')
with conn:
    cur = conn.cursor()
    for id in ids:
        try:
            cur.execute("select stuff from t1 where rowid=?", [id])
            stuff_tuple = cur.fetchone()
            # do something with the fetched row
        except:
            pass  # for when id is not in t1's key set
Here, ids is a list that may have tens of thousands of elements. Building t1 did not take very long (i.e. ~75K inserts per second). Querying t1 the way I've done it is unacceptably slow (i.e. ~1K queries in 10 seconds).
I am completely new to SQL. What am I doing wrong?
Since you're retrieving values by their keys, a key/value store seems more appropriate in this case. Relational databases (SQLite included) are definitely feature-rich, but you can't beat the performance of a simple key/value store.
There are several to choose from:
And there are many, many more.
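The answer's own list of stores is not preserved here, so as a rough illustration only, the sketch below uses Python's standard-library dbm module as a stand-in key/value store. The helper names build_store and lookup_many, the file name, and the demo data are all hypothetical, and whether this outperforms SQLite for a given workload would need to be measured.

```python
import dbm
import os
import tempfile


def build_store(path, rows):
    """Write (id, stuff) pairs into a dbm file; keys and values are stored as bytes."""
    with dbm.open(path, "c") as db:
        for key, stuff in rows:
            db[str(key).encode("utf-8")] = stuff.encode("utf-8")


def lookup_many(path, ids):
    """Return {id: stuff} for the ids present in the store; missing ids are skipped."""
    found = {}
    with dbm.open(path, "r") as db:
        for key in ids:
            value = db.get(str(key).encode("utf-8"))
            if value is not None:
                found[key] = value.decode("utf-8")
    return found


# Hypothetical demo data standing in for the real table.
demo_dir = tempfile.mkdtemp()
demo_path = os.path.join(demo_dir, "stuff_kv")
build_store(demo_path, [(1, "alpha"), (2, "beta"), (3, "gamma")])
result = lookup_many(demo_path, [1, 3, 42])
```

A missing key simply yields None from get(), so no exception handling is needed for ids that are absent.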
You should make a single SQL call instead; it should be much faster:
import sqlite3

conn = sqlite3.connect('stuff.db')
with conn:
    cur = conn.cursor()
    # one "?" placeholder per id, all bound in a single query
    query = "SELECT stuff FROM t1 WHERE rowid IN (%s)" % ','.join('?' * len(ids))
    for row in cur.execute(query, ids):
        # do something with the fetched row
        pass
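One caveat with a single large IN (...) list: SQLite caps the number of bound parameters per statement (the default is 999 before version 3.32.0 and 32766 from then on, and a build may set it differently), so a list of tens of thousands of ids may need to be split up. A minimal sketch, assuming the t1 schema from the question; the function name fetch_stuff_in_batches, the batch size of 500, and the in-memory demo data are hypothetical:

```python
import sqlite3


def fetch_stuff_in_batches(conn, ids, batch_size=500):
    """Yield (id, stuff) rows for the given ids, one IN (...) query per batch."""
    cur = conn.cursor()
    for start in range(0, len(ids), batch_size):
        batch = ids[start:start + batch_size]
        placeholders = ",".join("?" * len(batch))
        query = "SELECT id, stuff FROM t1 WHERE id IN (%s)" % placeholders
        # fetchall() drains the cursor before it is reused for the next batch
        yield from cur.execute(query, batch).fetchall()


# Demo against an in-memory table with the same shape as t1.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t1(id INTEGER PRIMARY KEY, stuff TEXT)")
conn.executemany(
    "INSERT INTO t1 VALUES (?, ?)",
    [(i, "row %d" % i) for i in range(1, 2001)],
)
wanted = list(range(1500, 2500))  # only 1500..2000 exist in the table
rows = list(fetch_stuff_in_batches(conn, wanted))
```

Each batch is still a single round trip through the query engine, so the per-query overhead is paid once per few hundred ids rather than once per id.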
You do not need a try/except, since ids that are not in the database simply will not show up in the results. If you want to know which ids were not found, you can do:
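ids_res = set()
# row['id'] requires conn.row_factory = sqlite3.Row and a query that also selects id
for row in cur.execute(...):
    ids_res.add(row['id'])
# ids that were requested but not returned by the query
ids_not_found = set(ids) - ids_res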
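For reference, a self-contained version of the missing-id check might look like the following. It is a sketch under assumptions: the in-memory table and the sample ids are made up for the demo, and the query selects id alongside stuff so that each row can be looked up by column name.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.row_factory = sqlite3.Row  # lets rows be indexed by column name, e.g. row["id"]
conn.execute("CREATE TABLE t1(id INTEGER PRIMARY KEY, stuff TEXT)")
conn.executemany("INSERT INTO t1 VALUES (?, ?)", [(1, "a"), (2, "b"), (3, "c")])

ids = [2, 3, 7, 9]  # hypothetical lookup list; 7 and 9 are not in the table
placeholders = ",".join("?" * len(ids))
cur = conn.execute(
    "SELECT id, stuff FROM t1 WHERE id IN (%s)" % placeholders, ids
)

ids_res = {row["id"] for row in cur}  # ids the query actually returned
ids_not_found = set(ids) - ids_res    # requested ids with no matching row
```

Because the returned ids are always a subset of the requested ones, a plain set difference gives the same answer as symmetric_difference and states the intent more directly.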