
Any way to deliver data from Google Spreadsheets to ClickHouse?

Our staff enter data into Google Spreadsheets every day, and I need to load these tables into ClickHouse on a regular schedule (for example, once a day).

(ClickHouse is hosted on our AWS servers.)

It doesn't matter whether ClickHouse receives only the new rows or a full reload of every table each time.

Please suggest a working method for doing this.

The tool I have available is Python, and in theory I can also work with SQLAlchemy and an Airflow DAG. For the Airflow route, though, I have not yet found a guide on writing a Python script that transfers data out of a Google Spreadsheet.

The second option is the OWOX extension for Google Spreadsheets, but it requires going through Google BigQuery. That would turn our stack into a zoo of tools, and I would rather not pay for BigQuery yet.

Do you have any ideas on how to use scripts to upload tables from Google Spreadsheets to ClickHouse?

I found the Python library pygsheets. With it, accessing spreadsheets through the API is easier than accessing them directly.

Official pygsheets docs: https://pygsheets.readthedocs.io/en/stable/

In addition, I found two more libraries, gspread and oauth2client, which can also be used to work with the API from Python.

Step-by-step guide: https://towardsdatascience.com/accessing-google-spreadsheet-data-using-python-90a5bc214fd2

Official gspread documentation: https://gspread.readthedocs.io/en/latest/
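As a rough illustration of the gspread route, here is a minimal sketch that reads one worksheet and reloads a ClickHouse table from it. Several things here are assumptions that do not come from the post: authentication uses a Google service-account key file, the insert goes through the clickhouse-connect package (one of several possible clients), the target table already exists with String columns named after the sheet's header row, and a full truncate-and-reload is acceptable. The function names are hypothetical.

```python
from typing import List, Sequence, Tuple


def rows_to_table(values: Sequence[Sequence[str]]) -> Tuple[List[str], List[List[str]]]:
    """Split raw sheet values into a header and data rows of equal width."""
    if not values:
        return [], []
    header = [name.strip() for name in values[0]]
    width = len(header)
    rows = []
    for raw in values[1:]:
        if not any(cell.strip() for cell in raw):
            continue  # skip rows that are completely empty
        # Cut extra cells and pad short rows so every row matches the header.
        rows.append(list(raw[:width]) + [""] * (width - len(raw)))
    return header, rows


def fetch_sheet_values(spreadsheet_key: str, worksheet_name: str, key_file: str) -> List[List[str]]:
    """Read every cell of one worksheet as strings (needs network access)."""
    import gspread  # third-party: pip install gspread

    client = gspread.service_account(filename=key_file)
    worksheet = client.open_by_key(spreadsheet_key).worksheet(worksheet_name)
    return worksheet.get_all_values()


def load_into_clickhouse(table: str, header: List[str], rows: List[List[str]],
                         host: str, username: str = "default", password: str = "") -> None:
    """Replace the contents of `table` with `rows` (assumes String columns)."""
    if not rows:
        return  # do not wipe the table when the sheet came back empty
    import clickhouse_connect  # third-party: pip install clickhouse-connect

    client = clickhouse_connect.get_client(host=host, username=username, password=password)
    client.command(f"TRUNCATE TABLE {table}")
    client.insert(table, rows, column_names=header)
```

A daily run would call `fetch_sheet_values`, pass the result through `rows_to_table`, and hand the header and rows to `load_into_clickhouse`. Typed columns (dates, numbers) would need the string cells converted first.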

Then I can build a DAG in Airflow and manage the ETL process.
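For the Airflow step, a daily DAG could look roughly like the sketch below. It assumes Airflow 2.4 or a later 2.x release (for the `schedule` argument); the DAG id, task id, and the two stub functions are hypothetical placeholders to be replaced with a real sheet read and a real ClickHouse insert. The import guard is only there so the file can also be run where Airflow is not installed.

```python
from datetime import datetime
from typing import Callable, List


def sync_sheet_to_clickhouse(
    fetch_rows: Callable[[], List[list]],
    load_rows: Callable[[List[list]], None],
) -> int:
    """Read rows with fetch_rows, pass them to load_rows, return the row count."""
    rows = fetch_rows()
    if rows:
        load_rows(rows)
    return len(rows)


def fetch_rows_stub() -> List[list]:
    """Hypothetical placeholder: replace with a gspread or pygsheets read."""
    return []


def load_rows_stub(rows: List[list]) -> None:
    """Hypothetical placeholder: replace with a ClickHouse insert."""


try:
    from airflow import DAG
    from airflow.operators.python import PythonOperator
except ImportError:
    dag = None  # Airflow is not installed; the plain functions above still work
else:
    with DAG(
        dag_id="sheets_to_clickhouse",
        start_date=datetime(2024, 1, 1),
        schedule="@daily",  # once a day, as the question asks
        catchup=False,      # do not backfill runs for past dates
    ) as dag:
        PythonOperator(
            task_id="sync_sheet_to_clickhouse",
            python_callable=sync_sheet_to_clickhouse,
            op_kwargs={"fetch_rows": fetch_rows_stub, "load_rows": load_rows_stub},
        )
```

Keeping the read and load steps as plain callables means the sync logic can be checked without Airflow, Google credentials, or a ClickHouse server.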
