How to combine the different parts of a Pub/Sub message and flatten them to write them together into BigQuery as a single JSON?

I want to write three parts of a Pub/Sub message (data, attributes and publish time) to BigQuery, flattened so that all elements are written to a single row, for example:

data[0] data[1] attr[0] attr[1] key publishTime
data data attr attr key publishTime

I'm currently using the following code to decode and parse the message, but it only handles the data part of the Pub/Sub message:

    class decodeMessage:
        def decode_base64(self, element):
            """Decode base64, padding being optional."""
            return json.dumps(element.data.decode("utf-8"))

    class parseMessage:
        def parseJsonMessage(self, element):
            return json.loads(element)

I've also tried merging the two JSON objects after dumping them to JSON strings, but it didn't work as planned. My ultimate goal is to bring all columns into a single JSON with the schema retained.

I hope my question is clear. Thanks!

The solution is simply to create a Python dictionary and add each part of the message to it.

Example:

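    import json

    # The function name is illustrative; the original snippet was a bare function body.
    def build_payload(element):
        payload = dict()
        # Note: json.dumps on a str yields a quoted JSON string literal.
        data = json.dumps(element.data.decode('utf-8'))
        attributes = json.dumps(element.attributes)
        messageKey = element.message_id
        # Assumes publish_time is a datetime; converts it to epoch milliseconds.
        publish_time = (element.publish_time).timestamp()*1000

        payload['et'] = publish_time
        payload['data'] = data
        payload['attributes'] = attributes
        payload['key'] = messageKey

        return payload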
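
If the goal is one column per field rather than JSON strings stored in `data` and `attributes`, the same dictionary approach can be pushed one step further. The following is only a sketch under some assumptions: the message body is a JSON object, and `FakeMessage`, `flatten_message` and the `attr_` prefix are hypothetical names introduced here for illustration. `FakeMessage` merely stands in for a message object exposing the same four fields the answer reads.

```python
import json
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class FakeMessage:
    """Hypothetical stand-in with the same fields the answer reads."""
    data: bytes
    attributes: dict = field(default_factory=dict)
    message_id: str = ""
    publish_time: datetime = None


def flatten_message(element):
    """Merge body fields, attributes, key and publish time into one flat dict."""
    row = {}
    # Parse the body instead of re-dumping it, so its keys become columns.
    body = json.loads(element.data.decode("utf-8"))
    if isinstance(body, dict):
        row.update(body)
    else:
        # Non-object bodies (lists, scalars) are kept under a single key.
        row["data"] = body
    # Prefix attribute names (an assumed convention) to avoid clashing with body keys.
    for name, value in (element.attributes or {}).items():
        row["attr_" + name] = value
    row["key"] = element.message_id
    # Epoch milliseconds, as in the answer above.
    row["publishTime"] = int(element.publish_time.timestamp() * 1000)
    return row


msg = FakeMessage(
    data=b'{"user": "alice", "score": 7}',
    attributes={"source": "web"},
    message_id="123",
    publish_time=datetime(2021, 1, 1, tzinfo=timezone.utc),
)
row = flatten_message(msg)
print(json.dumps(row))
```

Each key of the resulting dictionary would then need a matching column in the BigQuery table schema; nested values inside the body are left as they are in this sketch.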
