Extracting JSON object from BigQuery Client in AWS Lambda using Python

I am running a SQL query via the google.cloud.bigquery.Client.query method in AWS Lambda (Python 2.7 runtime). The native BQ object extracted from a query is the BigQuery Row(), i.e.,

Row((u'exampleEmail@gmail.com', u'XXX1234XXX'), {u'email': 0, u'email_id': 1})

I need to convert this to JSON, i.e.,

[{'email_id': 'XXX1234XXX', 'email': 'exampleEmail@gmail.com'}]

When running locally, I am able to just call Python's dict() function on the row to transform it, i.e.,

from google.cloud import bigquery

client = bigquery.Client()
query_job = client.query(sql)  # sql holds the query string
rows = []
for row in query_job.result():
    # at this point row is the sample BQ Row object shown above
    tmp = dict(row)
    rows.append(tmp)

but when I load this into AWS Lambda it throws the error:

ValueError: dictionary update sequence element #0 has length 22; 2 is required

I have tried forcing it in different ways, breaking it out into sections, etc., but cannot get this into the desired JSON format.

I took a brief dive into the rabbit hole of transforming the QueryJob into a Pandas dataframe and then from there into a JSON object. That also works locally, but it runs into numpy package errors in AWS Lambda, which seems to be a bit of a known issue.

I feel like this should have an easy solution but I just haven't found it yet.
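A note on the error message: the "length 22" it reports matches the number of characters in the sample email address. That suggests dict() was iterating over the row's values rather than its column names. One plausible reading, which the question does not confirm, is that the Row class bundled into the Lambda package does not expose the mapping methods that dict() relies on. The sketch below reproduces the same message with a plain tuple standing in for such a row; the tuple is an assumption for illustration, not the real BigQuery object.

```python
# A plain tuple stands in for a Row that dict() can only iterate over
# (assumption: the Row class in the Lambda bundle offers no keys() method).
values = (u'exampleEmail@gmail.com', u'XXX1234XXX')

print(len(values[0]))  # 22, the length reported in the error

try:
    # dict() expects an iterable of (key, value) pairs; here the first
    # element is a 22-character string instead of a 2-item pair.
    dict(values)
except ValueError as exc:
    message = str(exc)

print(message)
```

If this reading is right, comparing the installed google-cloud-bigquery version locally and in the Lambda bundle would be a sensible first check.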

Try doing it like this:

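from google.cloud import bigquery

client = bigquery.Client()
L = []
sql = "..."  # replace with your SQL statement
query_job = client.query(sql)  # API request
for row in query_job.result():  # waits for the query to finish
    # read each column by name instead of calling dict(row)
    email_id = row.get('email_id')
    email = row.get('email')
    L.append([email_id, email])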
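The loop above collects lists rather than the list of dicts shown in the question. As a minimal sketch of getting all the way to JSON without calling dict(row) directly, the helper below pairs column names with each row's values. `rows_to_json` is a hypothetical name, and the sample data are stand-ins shaped like the row in the question. With BigQuery, the names could presumably be taken from the result's schema, for example `[field.name for field in result.schema]`. The sketch assumes each row can be iterated over its values in column order.

```python
import json


def rows_to_json(field_names, rows):
    """Pair each row's values with the column names and serialize to JSON.

    Hypothetical helper: it only assumes every row is iterable over its
    values in column order, so it does not depend on dict(row) working.
    """
    records = [dict(zip(field_names, row)) for row in rows]
    return json.dumps(records, sort_keys=True)


# Stand-in data shaped like the sample row from the question.
sample_names = [u'email', u'email_id']
sample_rows = [(u'exampleEmail@gmail.com', u'XXX1234XXX')]

output = rows_to_json(sample_names, sample_rows)
print(output)
```

Columns holding dates, timestamps, or decimals are not JSON-serializable by default, so a real query result may also need a `default=` handler passed to `json.dumps`.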
