python peewee multiprocessing pool error

Stack: Python 3.4, PostgreSQL 9.4.7, peewee 2.8.0, psycopg2 2.6.1 (dt dec pq3 ext lo64)

I need to be able to talk to the PostgreSQL database (selects, inserts, updates) from each worker. I am using Python's multiprocessing pool to create 10 workers; each one makes a curl call and then talks to the database based on what it finds.

After reading a few threads on the internet, I thought a connection pool was the way to go, so I placed the code below at the top of my models.py file. I have my doubts about connection pools, though, because my understanding is that reusing database connections across threads is a no-no.

from playhouse.pool import PooledPostgresqlExtDatabase

# cfg is the application's own config mapping, loaded elsewhere.
db = PooledPostgresqlExtDatabase(
    'uc',
    max_connections=32,
    stale_timeout=300,  # 5 minutes.
    register_hstore=False,
    host=cfg['psql']['host'],
    port=cfg['psql']['port'],
    user=cfg['psql']['user'],
    password=cfg['psql']['pass'])

Now to the question. I am getting random SQL errors when talking to the database from some workers. Before I introduced peewee into the mix, I was using the psycopg2 library directly, without a wrapper, and creating a new database connection per worker. There were no errors.

A sample error that I get is:

multiprocessing.pool.RemoteTraceback: 
"""
Traceback (most recent call last):
  File "/usr/local/lib/python3.4/dist-packages/playhouse/postgres_ext.py", line 377, in execute_sql
    self.commit()
  File "/usr/local/lib/python3.4/dist-packages/peewee.py", line 3468, in commit
    self.get_conn().commit()
psycopg2.DatabaseError: error with no message from the libpq

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/lib/python3.4/multiprocessing/pool.py", line 119, in worker
    result = (True, func(*args, **kwds))
  File "/usr/lib/python3.4/multiprocessing/pool.py", line 44, in mapstar
    return list(map(*args))
  File "/home/dan/dev/link-checker/crawler/manager.py", line 17, in startWorker
    wrk.perform()
  File "/home/dan/dev/link-checker/crawler/worker.py", line 49, in perform
    self.pullUrls()
  File "/home/dan/dev/link-checker/crawler/worker.py", line 63, in pullUrls
    newUrlDict = UrlManager.createUrlWithInProgress(self._url['crawl'], source_url, self._url['base'])
  File "/home/dan/dev/link-checker/crawler/models.py", line 152, in createUrlWithInProgress
    newUrl = Url.create(**newUrlDict)
  File "/usr/local/lib/python3.4/dist-packages/peewee.py", line 4494, in create
    inst.save(force_insert=True)
  File "/usr/local/lib/python3.4/dist-packages/peewee.py", line 4680, in save
    pk_from_cursor = self.insert(**field_dict).execute()
  File "/usr/local/lib/python3.4/dist-packages/peewee.py", line 3213, in execute
    cursor = self._execute()
  File "/usr/local/lib/python3.4/dist-packages/peewee.py", line 2628, in _execute
    return self.database.execute_sql(sql, params, self.require_commit)
  File "/usr/local/lib/python3.4/dist-packages/playhouse/postgres_ext.py", line 377, in execute_sql
    self.commit()
  File "/usr/local/lib/python3.4/dist-packages/peewee.py", line 3285, in __exit__
    reraise(new_type, new_type(*exc_args), traceback)
  File "/usr/local/lib/python3.4/dist-packages/peewee.py", line 127, in reraise
    raise value.with_traceback(tb)
  File "/usr/local/lib/python3.4/dist-packages/playhouse/postgres_ext.py", line 377, in execute_sql
    self.commit()
  File "/usr/local/lib/python3.4/dist-packages/peewee.py", line 3468, in commit
    self.get_conn().commit()
peewee.DatabaseError: error with no message from the libpq

I also tailed the PostgreSQL log file, and this is what I saw:

2016-04-19 20:34:23 EDT [26824-3] uc_user@uc WARNING:  there is already a transaction in progress
2016-04-19 20:34:23 EDT [26824-4] uc_user@uc WARNING:  there is already a transaction in progress
2016-04-19 20:34:23 EDT [26824-5] uc_user@uc WARNING:  there is no transaction in progress
2016-04-19 20:34:23 EDT [26824-6] uc_user@uc WARNING:  there is already a transaction in progress
2016-04-19 20:34:23 EDT [26824-7] uc_user@uc WARNING:  there is no transaction in progress
2016-04-19 20:34:23 EDT [26824-8] uc_user@uc WARNING:  there is already a transaction in progress
2016-04-19 20:34:23 EDT [26824-9] uc_user@uc WARNING:  there is already a transaction in progress
2016-04-19 20:35:14 EDT [26976-1] uc_user@uc WARNING:  there is already a transaction in progress
2016-04-19 20:35:14 EDT [26976-2] uc_user@uc WARNING:  there is no transaction in progress
2016-04-19 20:35:14 EDT [26976-3] uc_user@uc WARNING:  there is already a transaction in progress
2016-04-19 20:35:14 EDT [26976-4] uc_user@uc WARNING:  there is already a transaction in progress
2016-04-19 20:35:14 EDT [26976-5] uc_user@uc WARNING:  there is no transaction in progress
2016-04-19 20:35:14 EDT [26976-6] uc_user@uc WARNING:  there is already a transaction in progress
2016-04-19 20:35:14 EDT [26976-7] uc_user@uc WARNING:  there is no transaction in progress
2016-04-19 20:35:14 EDT [26976-8] uc_user@uc WARNING:  there is already a transaction in progress
2016-04-19 20:35:14 EDT [26976-9] uc_user@uc WARNING:  there is no transaction in progress

My hunch is that the connection pool and multiprocessing don't go well together. Has anyone done this successfully without errors? If so, can you point me to an example or give me a piece of advice that works?

Do I need to explicitly create a new connection with peewee inside my worker, or is there an easier way to use peewee with the multiprocessing pool library?

Thanks for your answers and for reading.

I got it working. I wrapped all the code in the models.py file that was going to be used by the workers in "with db.execution_context() as ctx:", as described on this page:

http://docs.peewee-orm.com/en/latest/peewee/database.html#advanced-connection-management
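
For readers who want to see the general idea without a PostgreSQL server, here is a minimal sketch of the pattern the fix relies on: each unit of work opens its own connection, commits or rolls back, and closes it, so a worker never touches a connection that was opened in another process. It uses the standard-library sqlite3 module as a stand-in for PostgreSQL, and the names fresh_connection, create_url, count_urls and DB_PATH are hypothetical. It is an illustration of the idea, not peewee's implementation of execution_context.

```python
import contextlib
import os
import sqlite3
import tempfile

# Stand-in database file (assumption); the original setup uses PostgreSQL via peewee.
DB_PATH = os.path.join(tempfile.mkdtemp(), "urls.db")


@contextlib.contextmanager
def fresh_connection(path=DB_PATH):
    """Open a connection for one unit of work, then commit or roll back, and close."""
    conn = sqlite3.connect(path, timeout=30)
    try:
        yield conn
        conn.commit()
    except Exception:
        conn.rollback()
        raise
    finally:
        conn.close()


def init_schema():
    """Create the hypothetical url table if it does not exist yet."""
    with fresh_connection() as conn:
        conn.execute(
            "CREATE TABLE IF NOT EXISTS url (id INTEGER PRIMARY KEY, address TEXT)"
        )


def create_url(address):
    """Hypothetical worker task: the insert runs entirely on its own connection."""
    with fresh_connection() as conn:
        cur = conn.execute("INSERT INTO url (address) VALUES (?)", (address,))
        return cur.lastrowid


def count_urls():
    """Return the number of rows in the url table."""
    with fresh_connection() as conn:
        return conn.execute("SELECT COUNT(*) FROM url").fetchone()[0]
```

With a multiprocessing pool, a call such as pool.map(create_url, addresses) would then open each connection inside the worker process that uses it, which is also what the earlier psycopg2 setup with one connection per worker was doing.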
