
Pub/Sub to BigQuery Without Using Dataflow

I would like to insert data into BigQuery tables using Pub/Sub. The data has already been processed, so I do not need Dataflow. How can I do this? Thanks in advance.

Cloud Pub/Sub is a queueing service; think of it as a buffer that stores messages rather than a system that processes them.

You still need something between Cloud Pub/Sub and BigQuery to process the messages waiting in the queue. People often use Dataflow for this, but you can implement your own worker that reads from Pub/Sub and writes to BigQuery.
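A minimal sketch of such a worker, assuming the google-cloud-pubsub and google-cloud-bigquery client libraries are installed, that a pull subscription and a matching BigQuery table already exist, and that the project, subscription, and table names below are placeholders you would replace:

```python
import json

from google.cloud import bigquery, pubsub_v1

PROJECT_ID = "my-project"                    # assumption: your project ID
SUBSCRIPTION_ID = "my-subscription"          # assumption: existing pull subscription
TABLE_ID = "my-project.my_dataset.my_table"  # assumption: existing BigQuery table

bq_client = bigquery.Client(project=PROJECT_ID)
subscriber = pubsub_v1.SubscriberClient()
subscription_path = subscriber.subscription_path(PROJECT_ID, SUBSCRIPTION_ID)


def callback(message):
    # Expect each message body to be a JSON object whose keys match the table columns.
    row = json.loads(message.data.decode("utf-8"))
    errors = bq_client.insert_rows_json(TABLE_ID, [row])  # streaming insert
    if errors:
        print(f"Insert failed, message will be redelivered: {errors}")
        message.nack()
    else:
        message.ack()


streaming_pull = subscriber.subscribe(subscription_path, callback=callback)
print(f"Listening on {subscription_path}...")
try:
    streaming_pull.result()  # block while messages are processed in the background
except KeyboardInterrupt:
    streaming_pull.cancel()
```

This keeps the "processing" step entirely in your own code, which is exactly what Dataflow would otherwise do for you.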

A newer feature called BigQuery subscriptions allows a Pub/Sub subscription to write to BigQuery directly, without a Dataflow job. You won't be billed any additional BigQuery charges for writing the data.
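A rough sketch of creating such a subscription with the Pub/Sub client library (google-cloud-pubsub 2.13 or newer is assumed to be installed); the project, topic, subscription, and table names are placeholders, and the destination table is assumed to already exist:

```python
from google.cloud import pubsub_v1

PROJECT_ID = "my-project"
TOPIC_ID = "my-topic"
SUBSCRIPTION_ID = "my-bq-subscription"
TABLE = "my-project.my_dataset.my_table"  # assumption: existing table, project.dataset.table

publisher = pubsub_v1.PublisherClient()
subscriber = pubsub_v1.SubscriberClient()
topic_path = publisher.topic_path(PROJECT_ID, TOPIC_ID)
subscription_path = subscriber.subscription_path(PROJECT_ID, SUBSCRIPTION_ID)

# Point the subscription at the BigQuery table; write_metadata also stores
# message metadata (message_id, publish_time, ...) in extra columns.
bigquery_config = pubsub_v1.types.BigQueryConfig(
    table=TABLE,
    write_metadata=True,
)

with subscriber:
    subscription = subscriber.create_subscription(
        request={
            "name": subscription_path,
            "topic": topic_path,
            "bigquery_config": bigquery_config,
        }
    )

print(f"Created BigQuery subscription: {subscription.name}")
```

Once created, every message published to the topic is written to the table by Pub/Sub itself, with no worker or Dataflow job to run.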


Pub/Sub stores messages that are meant to be read and processed by clients. Since there is nothing to process in your messages, you can skip Pub/Sub and modify your application to insert directly into BigQuery. This link contains examples for all the available client libraries for streaming loads; for batch loads, check this one.
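For illustration, a minimal sketch of a direct streaming insert with the google-cloud-bigquery client; the table name is a placeholder and the table is assumed to already exist with matching columns:

```python
from google.cloud import bigquery

client = bigquery.Client()
table_id = "my-project.my_dataset.my_table"  # assumption: existing table

rows_to_insert = [
    {"k1": "v1", "k2": "v2"},
    {"k1": "v3", "k2": "v4"},
]

# Streaming insert: rows become queryable almost immediately.
errors = client.insert_rows_json(table_id, rows_to_insert)
if errors:
    print(f"Encountered errors while inserting rows: {errors}")
else:
    print("Rows streamed into BigQuery.")
```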

However, if you still need to dump your messages somewhere first, there is a beta Pub/Sub to BigQuery template that can do that. Be aware that it is an intermediate process dedicated to transferring structured data. Basically, there are two requirements:

  • Pub/Sub messages have to be in simple JSON format, for example {"k1":"v1", "k2":"v2"}. Your data will then be inserted into columns k1 and k2.
  • The table must exist in BigQuery prior to execution (see the sketch below).
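A small sketch of satisfying the second requirement, assuming the hypothetical message format {"k1":"v1", "k2":"v2"} from the list above: create the destination table with matching columns before launching the template (the table name is a placeholder).

```python
from google.cloud import bigquery

client = bigquery.Client()
table_id = "my-project.my_dataset.my_table"  # assumption: your target table

# Columns mirror the keys of the JSON messages the template will receive.
schema = [
    bigquery.SchemaField("k1", "STRING"),
    bigquery.SchemaField("k2", "STRING"),
]

table = client.create_table(bigquery.Table(table_id, schema=schema))
print(f"Created table {table.full_table_id} for the template to write into.")
```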

Check the template link for complete instructions.

Hope this information answers your question.
