Is it possible to run a custom Python script in Apache Beam or Google Cloud Dataflow?
I want to run one of my Python scripts on GCP. I am fairly new to GCP, so I don't know much about it.
My Python script pulls data from BigQuery and performs these tasks:
Several data processing operations
Building an ML model using KDTree and a few clustering algorithms
Writing the final result to a BigQuery table
This script needs to run every night.
So far I know I can use VMs, Cloud Run, or Cloud Functions (not a good option for me, as it will take about an hour to finish everything). What would be the best choice for running this?
I came across Dataflow, but I am curious whether it's possible to run a custom Python script that does all of these things in Google Cloud Dataflow. I assume I would have to convert everything into a map-reduce format, which doesn't seem easy with my code, especially the ML part.
Do you just need a Python script to run on a single instance for a couple of hours and then terminate?
You could set up a 'basic scaling' App Engine microservice within your GCP project. The maximum run time for task queue tasks is 24 hours when using 'basic scaling'.
Requests can run for up to 24 hours. A basic-scaled instance can choose to handle /_ah/start and execute a program or script for many hours without returning an HTTP response code. Task queue tasks can run up to 24 hours.
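As a rough illustration of the /_ah/start idea, here is a minimal sketch of a WSGI handler that runs a long job when that path is requested. It assumes the service is served by a WSGI server and that the service's app.yaml enables basic scaling; `run_nightly_job` is a hypothetical stand-in for the real script (the BigQuery read, the model building, and the BigQuery write), and none of the names below come from the original post.

```python
"""Sketch of a handler for an App Engine basic-scaling service.

Assumptions (not from the original post): the app is served by a WSGI
server, and run_nightly_job() is a hypothetical placeholder.
"""


def run_nightly_job():
    """Hypothetical placeholder for the real script; may take a long time."""
    # The actual BigQuery query, clustering, and result upload would go here.
    return "done"


def app(environ, start_response):
    """WSGI entry point: run the job on /_ah/start, return 404 otherwise."""
    path = environ.get("PATH_INFO", "")
    if path == "/_ah/start":
        # This call blocks until the job finishes; only then is a response sent.
        result = run_nightly_job()
        status = "200 OK"
        body = f"job finished: {result}".encode("utf-8")
    else:
        status = "404 Not Found"
        body = b"not found"
    headers = [
        ("Content-Type", "text/plain"),
        ("Content-Length", str(len(body))),
    ]
    start_response(status, headers)
    return [body]
```

The job runs synchronously inside the request here to keep the sketch short. Whether that fits a given workload, and how the nightly trigger is wired up, would still need to be checked against the App Engine documentation linked below.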
https://cloud.google.com/appengine/docs/standard/python/how-instances-are-managed