
How to append different Pub/Sub objects and flatten them to write them together into BigQuery as a single JSON?

I want to write three attributes of a Pub/Sub message (data, attributes, and publish time) to BigQuery, flattened so that all elements are written in a single row, for example:

data[0]    data[1]    attr[0]    attr[1]    key    publishTime
data       data       attr       attr      key    publishTime

I'm currently using the following piece of code to decode and parse the message, but it applies only to the data part of the Pub/Sub message:

    import json

    class decodeMessage:
        def decode_base64(self, element):
            """Decode base64, padding being optional."""
            return json.dumps(element.data.decode("utf-8"))

    class parseMessage:
        def parseJsonMessage(self, element):
            return json.loads(element)
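For context, this is roughly how I wire those classes into a streaming Beam pipeline. A minimal sketch, assuming messages are read with with_attributes=True so each element is a PubsubMessage; the topic path is a placeholder:

    import apache_beam as beam
    from apache_beam.options.pipeline_options import PipelineOptions

    options = PipelineOptions(streaming=True)
    with beam.Pipeline(options=options) as p:
        (p
         # Placeholder topic path, not the real one.
         | "Read" >> beam.io.ReadFromPubSub(
               topic="projects/<project>/topics/<topic>",
               with_attributes=True)
         | "Decode" >> beam.Map(decodeMessage().decode_base64)
         | "Parse" >> beam.Map(parseMessage().parseJsonMessage))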

I've also tried merging the two JSONs after dumping them from JSON objects to JSON strings, but it didn't go as planned. My ultimate goal is to bring all the columns into a single JSON with the schema retained.

I hope my question is clear! Thanks!

The solution to this problem is to simply build a Python dictionary and add all the data to that new dictionary.

Example:

    # Build one flat dict per Pub/Sub message; each key becomes a BigQuery column.
    payload = dict()
    data = json.dumps(element.data.decode('utf-8'))    # message body as a JSON string
    attributes = json.dumps(element.attributes)        # attributes dict as a JSON string
    messageKey = element.message_id
    publish_time = (element.publish_time).timestamp() * 1000  # epoch milliseconds

    payload['et'] = publish_time
    payload['data'] = data
    payload['attributes'] = attributes
    payload['key'] = messageKey

    return payload
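For completeness, here is one way this dictionary could be produced inside a Beam DoFn and routed to BigQuery. This is a sketch, not part of the original answer: the DoFn name, table reference, and schema are assumptions.

    import json
    import apache_beam as beam

    class FlattenPubSubMessage(beam.DoFn):  # hypothetical name
        """Turn each PubsubMessage into one flat dict (one BigQuery row)."""
        def process(self, element):
            payload = dict()
            payload['et'] = (element.publish_time).timestamp() * 1000
            payload['data'] = json.dumps(element.data.decode('utf-8'))
            payload['attributes'] = json.dumps(element.attributes)
            payload['key'] = element.message_id
            yield payload

    # Downstream, the flat dicts map directly onto BigQuery columns.
    # Table reference and schema below are placeholders:
    # messages | 'Flatten' >> beam.ParDo(FlattenPubSubMessage())
    #          | 'Write' >> beam.io.WriteToBigQuery(
    #                '<project>:<dataset>.<table>',
    #                schema='et:FLOAT,data:STRING,attributes:STRING,key:STRING')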
