
Schedule loading data from GCS to BigQuery periodically

I've researched this and so far have come up with a strategy using Apache Airflow, but I'm still not sure how to implement it. Most of the blogs and answers I find are just code rather than material that helps me understand the approach. Also, please suggest a better way to do this if there is one.

I was also pointed to another option: using a background Cloud Function with a Cloud Storage trigger.
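For reference, the Cloud Function option could look roughly like the sketch below. It assumes the Python runtime, the `google-cloud-bigquery` client library, CSV files with a header row, and a hypothetical destination table `my-project.my_dataset.my_table`; none of these names come from the question.

```python
"""Sketch of a background Cloud Function triggered by a Cloud Storage upload."""

# Hypothetical destination table; replace with your own project.dataset.table.
DESTINATION_TABLE = "my-project.my_dataset.my_table"


def gcs_uri(event):
    """Build the gs:// URI of the uploaded object from the trigger payload."""
    return f"gs://{event['bucket']}/{event['name']}"


def load_to_bigquery(event, context):
    """Entry point: runs once for each object finalized in the bucket."""
    # Imported lazily so the helper above works without the client library.
    from google.cloud import bigquery

    client = bigquery.Client()
    job_config = bigquery.LoadJobConfig(
        source_format=bigquery.SourceFormat.CSV,
        skip_leading_rows=1,  # assumes each file starts with a header row
        autodetect=True,      # let BigQuery infer the schema
        write_disposition=bigquery.WriteDisposition.WRITE_APPEND,
    )
    load_job = client.load_table_from_uri(
        gcs_uri(event), DESTINATION_TABLE, job_config=job_config
    )
    load_job.result()  # wait for the load job; raises if it failed
```

Note that this runs whenever a file lands in the bucket rather than on a fixed schedule, so it only fits if "periodically" can mean "as soon as new data arrives".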

You can use BigQuery's Cloud Storage transfers, but note that the feature is still in beta.

It gives you the option to schedule transfers from Cloud Storage to BigQuery with certain limitations.
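A scheduled transfer can also be created from code instead of the console. The following is a minimal sketch, assuming the `google-cloud-bigquery-datatransfer` Python client; the display name, CSV format, file layout, and the 24-hour schedule are illustrative assumptions, not values from the question.

```python
def build_transfer_params(bucket, prefix, table):
    """Parameters for the Cloud Storage data source (all values are examples)."""
    return {
        "data_path_template": f"gs://{bucket}/{prefix}/*.csv",
        "destination_table_name_template": table,
        "file_format": "CSV",
        "skip_leading_rows": "1",  # assumes a header row in every file
    }


def create_daily_transfer(project_id, dataset_id, bucket, prefix, table):
    """Create a transfer config that loads matching files once a day."""
    # Imported lazily so the parameter helper works without the client library.
    from google.cloud import bigquery_datatransfer

    client = bigquery_datatransfer.DataTransferServiceClient()
    transfer_config = bigquery_datatransfer.TransferConfig(
        destination_dataset_id=dataset_id,
        display_name="Daily GCS load",  # hypothetical name
        data_source_id="google_cloud_storage",
        params=build_transfer_params(bucket, prefix, table),
        schedule="every 24 hours",
    )
    return client.create_transfer_config(
        parent=client.common_project_path(project_id),
        transfer_config=transfer_config,
    )
```

Calling `create_daily_transfer("my-project", "my_dataset", "my-bucket", "exports", "my_table")` (all hypothetical names) would register the transfer once; after that BigQuery runs it on the schedule without any further code.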


"Most of the blogs and answers I find are just code"

Apache Airflow comes with a rich UI for many tasks, but that doesn't mean you can avoid writing code to get your task done.

For your case, you need to use the BigQuery command line operator for Apache Airflow.
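To give an idea of how little code is involved, here is a sketch of a daily DAG. It uses `GCSToBigQueryOperator` from the `apache-airflow-providers-google` package, which may not be the exact operator meant above, and it assumes Airflow 2.4 or later (for the `schedule` argument). The bucket, object pattern, and table names are hypothetical.

```python
from datetime import datetime

# Arguments for the load task; every value here is a placeholder.
LOAD_TASK_ARGS = {
    "task_id": "gcs_to_bigquery",
    "bucket": "my-bucket",
    "source_objects": ["exports/*.csv"],
    "destination_project_dataset_table": "my-project.my_dataset.my_table",
    "source_format": "CSV",
    "skip_leading_rows": 1,               # assumes a header row
    "autodetect": True,                   # let BigQuery infer the schema
    "write_disposition": "WRITE_APPEND",  # append to the existing table
}


def build_dag():
    """Build a DAG that loads the matching GCS objects into BigQuery daily."""
    # Imported lazily so this module can be read without Airflow installed.
    from airflow import DAG
    from airflow.providers.google.cloud.transfers.gcs_to_bigquery import (
        GCSToBigQueryOperator,
    )

    with DAG(
        dag_id="gcs_to_bigquery_daily",
        start_date=datetime(2024, 1, 1),
        schedule="@daily",
        catchup=False,  # do not backfill runs for past dates
    ) as dag:
        GCSToBigQueryOperator(**LOAD_TASK_ARGS)
    return dag
```

In an actual DAG file you would add `dag = build_dag()` at module level so the scheduler can discover it. The schedule lives in the DAG definition, and the operator only describes what to load and where.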


A good walkthrough of how to do this can be found at this link.
