PostgresOperator in Airflow getting timeout
I created a function in Postgres that contains the following statements:
FUNCTION
SET statement_timeout TO "3600s"
SELECT * FROM schema.table_name
END
FUNCTION
In Airflow I use the PostgresOperator to execute this function, but I receive the message [2018-06-01 00:00:01,066] {models.py:1595} ERROR - canceling statement due to statement timeout.
I saw that PostgresOperator uses the postgres_hook, and postgres_hook uses psycopg2 as its connector.
As far as I can tell, this could be a timeout imposed by the client application rather than a timeout from the database.
I would like to know how to solve this. Do I need to configure psycopg2 in Airflow, or can I use environment variables to set the timeout and avoid this problem?
You can pass connection arguments to the psycopg2 library through the Extras property of the Airflow connection. At the time of writing, the postgres_hook supports the following arguments:
['sslmode', 'sslcert', 'sslkey','sslrootcert', 'sslcrl', 'application_name', 'keepalives_idle']
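To make the effect of this list concrete, here is a minimal sketch of how such an allow-list treats the Extras. The helper name filter_extras is hypothetical and not part of Airflow; it only mirrors the kind of loop the hook runs over the Extras.

```python
import json

# Keys copied from the list of supported arguments above.
SUPPORTED_EXTRAS = ['sslmode', 'sslcert', 'sslkey', 'sslrootcert',
                    'sslcrl', 'application_name', 'keepalives_idle']


def filter_extras(extra_json):
    """Hypothetical helper: keep only the Extras keys on the allow-list."""
    extras = json.loads(extra_json) if extra_json else {}
    return {key: val for key, val in extras.items() if key in SUPPORTED_EXTRAS}


extras = '{"sslmode": "require", "statement_timeout": "3600s"}'
print(filter_extras(extras))  # {'sslmode': 'require'}
```

Any key that is not on the list, statement_timeout included, is dropped without an error, which is why setting it in the Extras alone has no effect.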
In order to pass the statement_timeout argument to the PostgresHook, you will need to override the get_conn method of the PostgresHook so that it accepts your desired argument.
Ex. Class Method Override
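import psycopg2
# Airflow 1.x import path; in Airflow 2+ the hook lives in
# airflow.providers.postgres.hooks.postgres
from airflow.hooks.postgres_hook import PostgresHook


class NewPostgresHook(PostgresHook):
    def __init__(self, *args, **kwargs):
        super(NewPostgresHook, self).__init__(*args, **kwargs)

    def get_conn(self):
        conn = self.get_connection(self.postgres_conn_id)
        conn_args = dict(
            host=conn.host,
            user=conn.login,
            password=conn.password,
            dbname=self.schema or conn.schema,
            port=conn.port)
        # copy the supported parameters from conn.extra
        for arg_name, arg_val in conn.extra_dejson.items():
            if arg_name in ['sslmode', 'sslcert', 'sslkey',
                            'sslrootcert', 'sslcrl', 'application_name',
                            'keepalives_idle']:
                conn_args[arg_name] = arg_val
            elif arg_name == 'statement_timeout':
                # statement_timeout is a server setting, not a libpq
                # connection keyword, so send it via the "options" parameter
                conn_args['options'] = '-c statement_timeout={}'.format(arg_val)
        self.conn = psycopg2.connect(**conn_args)
        return self.conn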
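One caveat worth checking before relying on the override: as far as I know, statement_timeout is a server setting rather than a libpq connection keyword, so psycopg2.connect may reject it if it is passed as a plain keyword argument. A common way around this is to send it through the libpq options parameter. The sketch below isolates that translation so it can be checked without a database; to_libpq_options is a hypothetical helper, and the host and dbname values are placeholders.

```python
def to_libpq_options(settings):
    """Hypothetical helper: render server settings as a libpq options string.

    Values containing spaces would need escaping, which this sketch ignores.
    """
    return ' '.join('-c {}={}'.format(name, value)
                    for name, value in sorted(settings.items()))


conn_args = {'host': 'localhost', 'dbname': 'postgres'}  # placeholder values
conn_args['options'] = to_libpq_options({'statement_timeout': '3600s'})
print(conn_args['options'])  # -c statement_timeout=3600s
# psycopg2.connect(**conn_args) would then open the session with that setting.
```

libpq also reads the same string from the PGOPTIONS environment variable, which may be an option if overriding the hook is not practical in your deployment.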
You can then specify this argument in the connection's Extras field in the form of a JSON string.
Ex. JSON String in Connection Extras Field
{"statement_timeout": "3600s"}
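Since the Extras field is decoded as JSON (as far as I know, extra_dejson uses the standard json module), keys and string values need double quotes. A Python-style dict written with single quotes does not parse. A quick check with the standard library, using illustrative variable names:

```python
import json

valid = '{"statement_timeout": "3600s"}'
invalid = "{'statement_timeout': '3600s'}"  # Python dict syntax, not JSON

print(json.loads(valid)["statement_timeout"])  # 3600s

try:
    json.loads(invalid)
except ValueError as exc:  # json.JSONDecodeError is a subclass of ValueError
    print("not valid JSON:", exc)
```

Running this kind of check on the string before pasting it into the connection form can save a debugging round trip.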