
How to use existing PubSub Subscription with Google-Provided PubSub to BigQuery Dataflow Template

I am trying to set up a Dataflow job using the Google-provided template PubSub to BigQuery. I see an option to specify the Cloud Pub/Sub input topic, but I don't see any option to specify a Pub/Sub input subscription in the GCP console UI.

If I provide the topic, the job automatically creates a subscription to read messages from that topic. The problem with this is that the job only sees messages published to the topic after the Dataflow job has started; anything published to the same topic before that is ignored.

I don't have any complex transformations to do in my job, so the Google-provided template works for me out of the box. But the lack of an option to specify my own subscription is bothering me, and I don't want to set up a custom job pipeline just for this reason. Does anybody know if there is a workaround for this?
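For context, launching the topic-based template typically looks like the sketch below. The template path and the inputTopic parameter follow the documented classic PubSub_to_BigQuery template; $jobname, $project, $location, $bucket, $topic, $dataset, and $table are placeholders:

gcloud dataflow jobs run $jobname \
  --project=$project \
  --gcs-location gs://dataflow-templates-$location/latest/PubSub_to_BigQuery \
  --region $location \
  --staging-location gs://$bucket/ps-to-bq \
  --parameters inputTopic=projects/$project/topics/$topic,outputTableSpec=$project:$dataset.$table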

That's not currently supported. However, it's a great use case and is on the Google Cloud team's radar.

If you can email me at bookman@google.com, I'll be sure to keep you posted on development to enable this.

Appreciate the feedback,

Colin

As an update, there is now a separate PubSub Subscription to BigQuery template.

https://cloud.google.com/dataflow/docs/guides/templates/provided-streaming#pubsub-subscription-to-bigquery

gcloud dataflow jobs run $jobname \
  --project=$project \
  --disable-public-ips \
  --gcs-location gs://dataflow-templates-$location/latest/PubSub_Subscription_to_BigQuery \
  --worker-machine-type n1-standard-1 \
  --region $location \
  --staging-location gs://$bucket/pss-to-bq \
  --parameters inputSubscription=projects/$project/subscriptions/$subscription,outputTableSpec=$dataset.$table
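
Note that the subscription referenced by inputSubscription must already exist when the job is launched, and Pub/Sub only retains messages published after the subscription was created. A minimal sketch of creating it ahead of time, where $subscription and $topic are placeholders:

gcloud pubsub subscriptions create $subscription \
  --project=$project \
  --topic=$topic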
