
peewee multiprocess from separate db connection 'peewee.OperationalError: disk I/O error'

I am trying to use multiprocessing to run a CPU-intensive job in the background. I'd like this process to be able to use the peewee ORM to write its results to the SQLite database.

To do so, I am trying to override the Meta.database of my model class after the new process is created, so that the process gets its own separate db connection.

import multiprocessing
from peewee import Model
from playhouse.sqlite_ext import SqliteExtDatabase

def get_db():
    # path: location of the SQLite database file (defined elsewhere)
    db = SqliteExtDatabase(path)
    return db

class BaseModel(Model):
    class Meta:
        database = get_db()

# Many other models

class Batch(BaseModel):

    def multi(self):
        def background_proc():
            # trying to override Meta's db connection.
            BaseModel._meta.database = get_db()
            job = Job.get_by_id(1)
            print("working in the background")

        process = multiprocessing.Process(target=background_proc)
        process.start()

Error when executing my_batch.multi():

Process Process-1:
Traceback (most recent call last):
  File "/Users/layne/.pyenv/versions/3.7.6/envs/jupyterlab/lib/python3.7/site-packages/peewee.py", line 3099, in execute_sql
    cursor.execute(sql, params or ())
sqlite3.OperationalError: disk I/O error

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/Users/layne/.pyenv/versions/3.7.6/lib/python3.7/multiprocessing/process.py", line 297, in _bootstrap
    self.run()
  File "/Users/layne/.pyenv/versions/3.7.6/lib/python3.7/multiprocessing/process.py", line 99, in run
    self._target(*self._args, **self._kwargs)
  File "/Users/layne/Desktop/pydatasci/pydatasci/aidb/__init__.py", line 1249, in background_proc
    job = Job.get_by_id(1)
  File "/Users/layne/.pyenv/versions/3.7.6/envs/jupyterlab/lib/python3.7/site-packages/peewee.py", line 6395, in get_by_id
    return cls.get(cls._meta.primary_key == pk)
  File "/Users/layne/.pyenv/versions/3.7.6/envs/jupyterlab/lib/python3.7/site-packages/peewee.py", line 6384, in get
    return sq.get()
  File "/Users/layne/.pyenv/versions/3.7.6/envs/jupyterlab/lib/python3.7/site-packages/peewee.py", line 6807, in get
    return clone.execute(database)[0]
  File "/Users/layne/.pyenv/versions/3.7.6/envs/jupyterlab/lib/python3.7/site-packages/peewee.py", line 1886, in inner
    return method(self, database, *args, **kwargs)
  File "/Users/layne/.pyenv/versions/3.7.6/envs/jupyterlab/lib/python3.7/site-packages/peewee.py", line 1957, in execute
    return self._execute(database)
  File "/Users/layne/.pyenv/versions/3.7.6/envs/jupyterlab/lib/python3.7/site-packages/peewee.py", line 2129, in _execute
    cursor = database.execute(self)
  File "/Users/layne/.pyenv/versions/3.7.6/envs/jupyterlab/lib/python3.7/site-packages/peewee.py", line 3112, in execute
    return self.execute_sql(sql, params, commit=commit)
  File "/Users/layne/.pyenv/versions/3.7.6/envs/jupyterlab/lib/python3.7/site-packages/peewee.py", line 3106, in execute_sql
    self.commit()
  File "/Users/layne/.pyenv/versions/3.7.6/envs/jupyterlab/lib/python3.7/site-packages/peewee.py", line 2873, in __exit__
    reraise(new_type, new_type(exc_value, *exc_args), traceback)
  File "/Users/layne/.pyenv/versions/3.7.6/envs/jupyterlab/lib/python3.7/site-packages/peewee.py", line 183, in reraise
    raise value.with_traceback(tb)
  File "/Users/layne/.pyenv/versions/3.7.6/envs/jupyterlab/lib/python3.7/site-packages/peewee.py", line 3099, in execute_sql
    cursor.execute(sql, params or ())
peewee.OperationalError: disk I/O error

I got this working using threads instead, but it's hard to actually terminate a thread (not just break from a loop), and CPU-intensive (not I/O-bound) jobs should be multiprocessed.

UPDATE: looking into the peewee database proxy: http://docs.peewee-orm.com/en/latest/peewee/database.html#dynamically-defining-a-database

I believe the problem was that, within the separate process, I was not closing the existing connection before attempting to replace it with a new one.

def background_proc():
    db = BaseModel._meta.database
    db.close()  # <----- this: close the existing connection first
    BaseModel._meta.database = get_db()

This works, and I can continue to use the original connection in my main process (or whatever the non-multiprocess one is called).

Initializing the DB object in each process may help.

import multiprocessing
from peewee import Model
from playhouse.sqlite_ext import SqliteExtDatabase

def get_db():
    # path: location of the SQLite database file (defined elsewhere)
    db = SqliteExtDatabase(path)
    return db

class BaseModel(Model):
    # No database is set here; each process binds its own (see below).
    pass

# Many other models

class Batch(BaseModel):

    def multi(self):
        def background_proc():
            # Create a fresh database object inside the new process and
            # bind the models to it before running any query.
            db = get_db()
            db.bind([Job])
            # do something like Job.insert(name="Alex").execute()
            result = Job.get_by_id(1)
            print(result)
            print("working in the background")

        process = multiprocessing.Process(target=background_proc)
        process.start()


Notice: the technical posts on this site are licensed under CC BY-SA 4.0. If you repost this page, please cite this site's URL or the original address.
