
How to circumvent error pq_flush: could not send data to client: Broken pipe found

I am trying to create an asynchronous process using Lambda functions that would look somewhat like this:

  • Lambda 1 fires a query on my redshift cluster and ends
  • Lambda 2 polls the cluster for the query status and would end/succeed based on results

I have tried several different options, but they all seem to fail at the same point. I can create a query, fire it, and have the lambda end, but when the query completes execution, instead of succeeding it complains that the client connection no longer exists:

error pq_flush: could not send data to client: Broken pipe found in xyz

The problem is that this is completely expected for my use case. I don't want the client (Lambda 1) to wait around, because my query could take an hour to run (exaggerating, but possible), which is why I created a second lambda. Is there a way I can communicate this to Redshift/PostgreSQL and circumvent this issue?

Here is my triggering code (it will eventually go to lambda, but I am testing on my local machine):

import select
import psycopg2

# Busy-wait helper from the psycopg2 docs: drive the connection's
# state machine until the current asynchronous operation completes.
def wait(conn):
  while True:
    state = conn.poll()
    if state == psycopg2.extensions.POLL_OK:
      break
    elif state == psycopg2.extensions.POLL_WRITE:
      select.select([], [conn.fileno()], [])
    elif state == psycopg2.extensions.POLL_READ:
      select.select([conn.fileno()], [], [])
    else:
      raise psycopg2.OperationalError("poll() returned %s" % state)

conn = psycopg2.connect(
  user='someuser',
  dbname='somedb',
  host='myredshiftcluster',
  port=5432,
  password='somepassword',
  async_=1,  # open the connection in asynchronous mode
  sslmode="require"
)
wait(conn)  # wait for the connection itself to be established
acurs = conn.cursor()
# Fire the stored procedure; in async mode execute() returns immediately,
# and the script ends without ever waiting for the result.
acurs.execute("call public.test_sp('xyz')")
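For the polling side (Lambda 2), one approach I considered is checking Redshift's STV_RECENTS system table for the backend PID that Lambda 1 recorded (available via `conn.get_backend_pid()` in psycopg2). A minimal sketch, where `query_status` is a hypothetical helper that works against any DB-API cursor:

```python
def query_status(cursor, pid):
    """Look up the status ('Running' / 'Done') of a query in STV_RECENTS
    by its backend PID. Returns None if the query is no longer listed.
    Hypothetical helper -- the table and column names are Redshift's,
    but the function itself is illustrative."""
    cursor.execute("select status from stv_recents where pid = %s", (pid,))
    row = cursor.fetchone()
    return row[0] if row else None
```

Lambda 2 would open its own (synchronous) connection, call this on a schedule, and succeed once the status is no longer `Running`.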

ODBC and JDBC connections are synchronous, so building an asynchronous process around them will not work. Luckily, AWS recently announced the Redshift Data API, which is an asynchronous REST interface. You can achieve what you are looking for through that method.

See: https://docs.aws.amazon.com/redshift/latest/mgmt/data-api.html
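With the Data API, the two lambdas map cleanly onto `execute_statement` (submit and return immediately) and `describe_statement` (poll for status). A sketch, assuming a boto3 `redshift-data` client; the two helper functions and all cluster/database names are illustrative:

```python
def run_async(client, cluster, database, db_user, sql):
    """Lambda 1: submit a statement through the Redshift Data API and
    return its statement id without waiting for the query to finish.
    `client` is a boto3 'redshift-data' client (or anything with the
    same call shape)."""
    resp = client.execute_statement(
        ClusterIdentifier=cluster,
        Database=database,
        DbUser=db_user,
        Sql=sql,
    )
    return resp["Id"]

def poll_status(client, statement_id):
    """Lambda 2: one poll of describe_statement; returns the status
    string (SUBMITTED, STARTED, FINISHED, FAILED, ...)."""
    return client.describe_statement(Id=statement_id)["Status"]
```

In practice Lambda 1 would create the client with `boto3.client("redshift-data")`, call `run_async`, store the returned id somewhere (e.g. pass it to Lambda 2's trigger), and exit; Lambda 2 calls `poll_status` on a schedule until the status is `FINISHED` or `FAILED`.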
