peewee multiprocess from separate db connection 'peewee.OperationalError: disk I/O error'
I am trying to use multiprocessing in order to run a CPU-intensive job in the background. I'd like this process to be able to use the peewee ORM to write its results to the SQLite database.

In order to do so, I am trying to override the Meta.database of my model class after process creation, so that I can have a separate db connection for my new process.
import multiprocessing

from peewee import Model
from playhouse.sqlite_ext import SqliteExtDatabase

def get_db():
    db = SqliteExtDatabase(path)  # path to the SQLite file
    return db

class BaseModel(Model):
    class Meta:
        database = get_db()
# Many other models

class Batch(BaseModel):
    def multi(self):
        def background_proc():
            # trying to override Meta's db connection.
            BaseModel._meta.database = get_db()
            job = Job.get_by_id(1)
            print("working in the background")
        process = multiprocessing.Process(target=background_proc)
        process.start()
Error when executing my_batch.multi():
Process Process-1:
Traceback (most recent call last):
  File "/Users/layne/.pyenv/versions/3.7.6/envs/jupyterlab/lib/python3.7/site-packages/peewee.py", line 3099, in execute_sql
    cursor.execute(sql, params or ())
sqlite3.OperationalError: disk I/O error

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/Users/layne/.pyenv/versions/3.7.6/lib/python3.7/multiprocessing/process.py", line 297, in _bootstrap
    self.run()
  File "/Users/layne/.pyenv/versions/3.7.6/lib/python3.7/multiprocessing/process.py", line 99, in run
    self._target(*self._args, **self._kwargs)
  File "/Users/layne/Desktop/pydatasci/pydatasci/aidb/__init__.py", line 1249, in background_proc
    job = Job.get_by_id(1)
  File "/Users/layne/.pyenv/versions/3.7.6/envs/jupyterlab/lib/python3.7/site-packages/peewee.py", line 6395, in get_by_id
    return cls.get(cls._meta.primary_key == pk)
  File "/Users/layne/.pyenv/versions/3.7.6/envs/jupyterlab/lib/python3.7/site-packages/peewee.py", line 6384, in get
    return sq.get()
  File "/Users/layne/.pyenv/versions/3.7.6/envs/jupyterlab/lib/python3.7/site-packages/peewee.py", line 6807, in get
    return clone.execute(database)[0]
  File "/Users/layne/.pyenv/versions/3.7.6/envs/jupyterlab/lib/python3.7/site-packages/peewee.py", line 1886, in inner
    return method(self, database, *args, **kwargs)
  File "/Users/layne/.pyenv/versions/3.7.6/envs/jupyterlab/lib/python3.7/site-packages/peewee.py", line 1957, in execute
    return self._execute(database)
  File "/Users/layne/.pyenv/versions/3.7.6/envs/jupyterlab/lib/python3.7/site-packages/peewee.py", line 2129, in _execute
    cursor = database.execute(self)
  File "/Users/layne/.pyenv/versions/3.7.6/envs/jupyterlab/lib/python3.7/site-packages/peewee.py", line 3112, in execute
    return self.execute_sql(sql, params, commit=commit)
  File "/Users/layne/.pyenv/versions/3.7.6/envs/jupyterlab/lib/python3.7/site-packages/peewee.py", line 3106, in execute_sql
    self.commit()
  File "/Users/layne/.pyenv/versions/3.7.6/envs/jupyterlab/lib/python3.7/site-packages/peewee.py", line 2873, in __exit__
    reraise(new_type, new_type(exc_value, *exc_args), traceback)
  File "/Users/layne/.pyenv/versions/3.7.6/envs/jupyterlab/lib/python3.7/site-packages/peewee.py", line 183, in reraise
    raise value.with_traceback(tb)
  File "/Users/layne/.pyenv/versions/3.7.6/envs/jupyterlab/lib/python3.7/site-packages/peewee.py", line 3099, in execute_sql
    cursor.execute(sql, params or ())
peewee.OperationalError: disk I/O error
I got this working using threads instead, but it is hard to actually terminate a thread (as opposed to just breaking out of a loop), and CPU-intensive (not IO-bound) jobs should be run in separate processes.
UPDATE: looking into peewee's database proxy: http://docs.peewee-orm.com/en/latest/peewee/database.html#dynamically-defining-a-database
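The linked docs defer binding the models to a database until runtime via a proxy object. As a rough illustration of the pattern (this is a minimal stand-in sketch, not peewee's actual DatabaseProxy implementation), the proxy simply forwards attribute access to a real object supplied later:

```python
import sqlite3

class DatabaseProxy:
    # Minimal stand-in for the deferred-initialization pattern in the
    # linked peewee docs: attribute access is forwarded to a concrete
    # object that is supplied later via initialize().
    def __init__(self):
        self._obj = None

    def initialize(self, obj):
        self._obj = obj

    def __getattr__(self, name):
        if self._obj is None:
            raise RuntimeError("proxy accessed before initialize()")
        return getattr(self._obj, name)

proxy = DatabaseProxy()                        # models could bind to this up front
proxy.initialize(sqlite3.connect(":memory:"))  # e.g. once per process, at startup
proxy.execute("CREATE TABLE t (x)")            # forwarded to the real connection
```

Because each process can call initialize() with its own connection, nothing opened in the parent ever has to be reused in a child.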
I believe the problem was that, within the separate process, I was not closing the existing connection before attempting to replace it with a new one.
def background_proc():
    db = BaseModel._meta.database
    db.close()  # <----- this
    BaseModel._meta.database = get_db()
This works, and I can continue to use the original connection in my main process (or whichever non-multiprocess caller there is).
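The underlying rule (a SQLite connection opened in one process must not be reused in another) can be demonstrated with the standard library alone. A minimal sketch, using sqlite3 directly rather than peewee, with a hypothetical job table; the child process opens its own connection instead of reusing the parent's:

```python
import multiprocessing
import os
import sqlite3
import tempfile

def background_proc(path):
    # Open a fresh connection inside the child process; a connection
    # inherited from the parent must not be reused after the fork.
    db = sqlite3.connect(path)
    db.execute("INSERT INTO job (status) VALUES (?)", ("done",))
    db.commit()
    db.close()

if __name__ == "__main__":
    path = os.path.join(tempfile.mkdtemp(), "demo.db")
    db = sqlite3.connect(path)
    db.execute("CREATE TABLE job (status TEXT)")
    db.commit()
    db.close()  # close the parent's connection before starting the child

    process = multiprocessing.Process(target=background_proc, args=(path,))
    process.start()
    process.join()
```

Passing only the file path (not the connection) to the child mirrors the fix above: each process constructs its own handle to the database.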
Maybe initializing the DB object in each process will help you.
def get_db():
    db = SqliteExtDatabase(path)  # path to the SQLite file
    return db

class BaseModel(Model):
    def __init__(self, database, **kwargs):
        super().__init__(**kwargs)
        self.database = database
# Many other models

class Batch(BaseModel):
    def multi(self):
        def background_proc():
            # build a fresh db connection inside the new process.
            db = get_db()
            basemodel = BaseModel(db)
            # do something like basemodel.insert(name="Alex")
            job = Job(db)
            result = job.get_by_id(1)
            print(result)
            print("working in the background")
        process = multiprocessing.Process(target=background_proc)
        process.start()