Storing Data Locally in Python
I'm running a Scrapy project and I'm looking for the best method for storing already scraped data locally. Currently I'm using AnyDBM, but I keep getting the following error after a while of running:
bsddb.db.DBRunRecoveryError: (-30973, 'DB_RUNRECOVERY: Fatal error, run database recovery -- PANIC: fatal region error detected; run recovery')
It may be something I'm doing wrong, as I'm pretty new to Python, but I was wondering if there is a better solution than AnyDBM anyway.
I'm storing the numeric IDs of the pages I have crawled, and will be storing around 500,000 records, with a possible 3-4 million planned for future projects.
Is AnyDBM what I should be sticking with, or should I change to something more suitable for the job?
sqlite, which is already part of the Python standard library, seems perfect for this.
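A minimal sketch of the sqlite approach for the use case in the question (storing crawled page IDs); the filename, table, and column names here are illustrative, not anything the answer prescribes:

```python
import sqlite3

# Open (or create) a local database file; the name is illustrative.
conn = sqlite3.connect("crawled.db")
cur = conn.cursor()

# One table holding the numeric IDs of already-crawled pages.
cur.execute("CREATE TABLE IF NOT EXISTS crawled (page_id INTEGER PRIMARY KEY)")

def mark_crawled(page_id):
    # INSERT OR IGNORE makes marking the same ID twice a harmless no-op.
    cur.execute("INSERT OR IGNORE INTO crawled (page_id) VALUES (?)", (page_id,))
    conn.commit()

def already_crawled(page_id):
    cur.execute("SELECT 1 FROM crawled WHERE page_id = ?", (page_id,))
    return cur.fetchone() is not None

mark_crawled(12345)
print(already_crawled(12345))  # True
print(already_crawled(99999))  # False
```

The `INTEGER PRIMARY KEY` column gives you an index on the ID for free, so lookups stay fast even at the 3-4 million records mentioned above.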
By default, Python comes with sqlite3, which is a very good database system. Here is a pretty good tutorial on it. To put the table in memory, use:
conn = sqlite3.connect(":memory:")
conn.isolation_level = None
cur = conn.cursor()
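A short sketch building on the snippet above (the table and column names are illustrative). Note that a `:memory:` database disappears when the connection closes, so for IDs that must survive restarts you would pass a filename instead:

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # or e.g. "scraped.db" for a persistent file
conn.isolation_level = None  # autocommit: each statement commits immediately
cur = conn.cursor()

cur.execute("CREATE TABLE pages (id INTEGER PRIMARY KEY)")
cur.executemany("INSERT INTO pages (id) VALUES (?)", [(1,), (2,), (3,)])

cur.execute("SELECT COUNT(*) FROM pages")
print(cur.fetchone()[0])  # 3
```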