I'm trying to execute some long-running SQL queries using SQLAlchemy against a Postgres database hosted on AWS RDS.
from sqlalchemy import create_engine
conn_str = 'postgresql://user:password@db-primary.cluster-cxf.us-west-2.rds.amazonaws.com:5432/dev'
engine = create_engine(conn_str)
sql = 'UPDATE "Clients" SET "Name" = NULL'
#this takes about 4 hrs to execute if run in pgAdmin
with engine.begin() as conn:
conn.execute(sql)
After running for exactly 2 hours, the script errors out with
OperationalError: server closed the connection unexpectedly
This probably means the server terminated abnormally
before or while processing the request.
(Background on this error at: https://sqlalche.me/e/14/e3q8)
I have tested setting connection timeouts in SQLAlchemy (based on How to set connection timeout in SQLAlchemy ). This did not make a difference.
I have looked up the connection settings in the Postgres settings (based on https://dba.stackexchange.com/questions/164419/is-it-possible-to-limit-timeout-on-postgres-server ), but both statement_timeout
and idle_in_transaction_session_timeout
are set to 0, meaning there are no set limits.
I agree with @jjanes. This smells like a TCP connection timeout issue. Might be that somewhere in the.network layer something, be it a NAT or a firewall, dropped your TCP connection, leaving the code to wait for the full TCP keepalive timeout until it sees the connection as closed. This could happen usually when the.network topology between the client and the database is complicated. For example there may be a company firewall, or some sort of interconnection. pgAdmin may come with a pre-configured setting for TCP keepalive, therefore it was not impacted, but I'm not sure.
Other timeouts didn't kick in because, in my understanding, TCP timeout is in the L4 layer, which overshadows other timeouts that are in L7 application layer.
You could try adding the keepalive parameters into your connection string and see if it can resolve the issue. For example:
postgresql://user:password@db-primary.cluster-cxf.us-west-2.rds.amazonaws.com:5432/dev?keepalives_idle=1&keepalives_count=1&tcp_user_timeout=1000
Note the keepalive parameters at the end. For your reference, here's the explanation to those parameters: https://www.postgresql.org/docs/current/runtime-config-connection.html
The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.