Upsert multiple rows in PostgreSQL with psycopg2 and error logging
I'm writing an application that connects to a database and upserts multiple rows. It creates a SAVEPOINT for every row so that, if a row fails, I can roll back without breaking the transaction, and it commits every 500 rows.

The problem is that it works extremely slowly over a remote database connection (a PostgreSQL database on a DigitalOcean droplet): it took about 35 minutes to process 1000 rows, versus only 7 seconds with a local database (which is also not very fast, but acceptable).

I found a post about upserting with a single cursor.execute(), like here, but how should I catch errors if I use that trick? Or what else can I do to make it faster?

Here is my code:
self.connection = psycopg2.connect(self.connection_settings)
self.cursor = self.connection.cursor()
for record in dbf_file:
    self.cursor.execute("SAVEPOINT savepoint;")
    try:
        self.send_record(record, where_to_save=database)
        self.count += 1
        self.batch_count += 1
        if self.batch_count >= BATCH_COUNT_MAX:
            self.connection.commit()
            self.cursor.close()
            self.cursor = self.connection.cursor()
            self.batch_count = 0
    except Exception:
        self.cursor.execute("ROLLBACK TO SAVEPOINT savepoint;")
        self.save_error(traceback.format_exc())
        self.error_count += 1
        self.batch_count += 1
        if self.batch_count == BATCH_COUNT_MAX:
            self.connection.commit()
            self.cursor.close()
            self.cursor = self.connection.cursor()
            self.batch_count = 0
    else:
        self.cursor.execute("RELEASE SAVEPOINT savepoint;")
# flush whatever is left in the last (partial) batch
if self.batch_count != 0:
    self.connection.commit()
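For reference, the single-statement upsert trick I'm referring to would look roughly like this sketch (my_table, its columns, and the conflict target are placeholders for my real schema, and execute_values comes from psycopg2.extras):

```python
# Sketch: upsert a whole batch in one round trip with execute_values()
# instead of one INSERT per row. Schema names below are placeholders.
UPSERT_SQL = """
    INSERT INTO my_table (id, name, value)
    VALUES %s
    ON CONFLICT (id) DO UPDATE
        SET name  = EXCLUDED.name,
            value = EXCLUDED.value;
"""

def upsert_batch(connection, rows, page_size=500):
    """Send rows in pages of `page_size`: one round trip per page."""
    from psycopg2.extras import execute_values  # local import: needs psycopg2
    with connection.cursor() as cursor:
        execute_values(cursor, UPSERT_SQL, rows, page_size=page_size)
    connection.commit()
```

The downside, and the reason for my question, is that if one row in the batch is bad, the whole statement fails, so I no longer know which row to log.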
Given that you are already working with files, I'd suggest using copy_from(). This way you will likely eliminate the unneeded per-row network overhead.
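A minimal sketch of that approach, assuming your rows can be rendered as tab-separated text (the staging-table name and columns are placeholders). COPY itself cannot resolve conflicts, so a common pattern is to COPY into a temp staging table and then run one INSERT ... ON CONFLICT from it:

```python
import io

def rows_to_buffer(rows):
    """Render rows as the tab-separated text that copy_from() expects
    (psycopg2 defaults: tab separator, \\N for NULL)."""
    buf = io.StringIO()
    for row in rows:
        buf.write("\t".join("\\N" if v is None else str(v) for v in row) + "\n")
    buf.seek(0)
    return buf

def copy_upsert(connection, rows):
    """COPY the batch into a staging table, then upsert in one statement."""
    buf = rows_to_buffer(rows)
    with connection.cursor() as cursor:
        cursor.execute("CREATE TEMP TABLE staging (LIKE my_table) ON COMMIT DROP;")
        cursor.copy_from(buf, "staging", columns=("id", "name", "value"))
        cursor.execute("""
            INSERT INTO my_table
            SELECT * FROM staging
            ON CONFLICT (id) DO UPDATE
                SET name  = EXCLUDED.name,
                    value = EXCLUDED.value;
        """)
    connection.commit()
```

Error logging then happens per batch rather than per row; if a batch fails you can fall back to your existing row-by-row path for just that batch.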