Google Cloud Datastream to BigQuery template not able to sync data to BigQuery
I am trying to design a CDC pipeline to stream data from Cloud SQL to BigQuery using Datastream and Dataflow on GCP. The Datastream part is working fine, and I can see data being transferred to Cloud Storage successfully in Avro format.
For Dataflow, I am using the Dataflow template "Datastream to BigQuery" with the configuration shown in the screenshot.
I can see that the Dataflow job has started and is running with no errors in the log, yet I can't see any data being transferred from Cloud Storage to BigQuery.
It looks to me like something is missing, namely the link between Cloud Storage and Pub/Sub. I think there should be a link that streams the data from GCS to Pub/Sub, so that Dataflow can eventually stream it from Pub/Sub to BQ, no?
What am I missing here?
The missing piece was on my side: I had not set up the link between GCS and Pub/Sub. I created it using the command below:
gsutil notification create -f "none" -p "db/" -t "datastream" "gs://my-buk"
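
For anyone hitting the same problem, here is a minimal sketch of the whole wiring, not a definitive setup. It assumes the topic and subscription do not exist yet. The bucket, prefix, and topic names come from the command above, while the subscription name `datastream-sub` and the `run` dry-run helper are hypothetical and only there for illustration. By default the script just prints the commands; set `DRY_RUN=0` to actually execute them.

```shell
#!/bin/sh
# Sketch: connect a Cloud Storage bucket to Pub/Sub so the Dataflow job
# is notified when Datastream writes new files.

BUCKET="my-buk"                # bucket Datastream writes to (from the answer)
PREFIX="db/"                   # object prefix to watch (from the answer)
TOPIC="datastream"             # Pub/Sub topic (from the answer)
SUBSCRIPTION="datastream-sub"  # hypothetical subscription name

# Print each command; execute it only when DRY_RUN=0.
run() {
  echo "+ $*"
  if [ "${DRY_RUN:-1}" = "0" ]; then
    "$@"
  fi
}

setup_notifications() {
  # 1. Topic that receives the bucket notifications.
  run gcloud pubsub topics create "$TOPIC"
  # 2. Subscription that the Dataflow job reads from.
  run gcloud pubsub subscriptions create "$SUBSCRIPTION" --topic="$TOPIC"
  # 3. The bucket notification itself; -f accepts "json" or "none".
  run gsutil notification create -f "none" -p "$PREFIX" -t "$TOPIC" "gs://$BUCKET"
  # 4. List the bucket's notification configs to confirm the link exists.
  run gsutil notification list "gs://$BUCKET"
}

setup_notifications
```

The subscription from step 2 is what I would expect to pass to the template as its Pub/Sub subscription parameter (documented as `gcsPubSubSubscription`, as far as I can tell), in the form `projects/<PROJECT_ID>/subscriptions/<SUBSCRIPTION_NAME>`. If step 4 prints nothing for the bucket, the job has no way of learning about new files, which matches the symptom in the question.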