
How to run existing EMR serverless job with boto3?

From the boto3 documentation for start_job_run, it seems I have to create a new job run every time I want to trigger a job. Does it really have to work that way? Can't I take the ID of an existing job, which has already been defined with all the configuration it needs, and run it?

What I have tried: reading the documentation and searching the internet.

Yes, that's the way it needs to be executed with boto3. You can call get_job_run (the EMR Serverless counterpart of describe-job-run) to retrieve the configuration of an earlier run and then pass it to start_job_run.

This assumes that you have a short-lived cluster. Executing jobs on a long-lived cluster would be different.

EMR Serverless has no "jobs" or templates (like the job templates in EMR on EKS) where you can define all parameters once and then reuse them for job runs; it only has "job runs" themselves. So yes, you have to specify all parameters every time. You should be able to copy the job run configuration from another job run: use the GetJobRun API for that.
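The copy-from-a-previous-run approach described in the answers could be sketched roughly as follows. This is only an illustration, not an official pattern: the helper names build_start_args and rerun_job_run are made up for this example, and the list of fields worth copying is an assumption about what you want to carry over. It relies on the boto3 emr-serverless client methods get_job_run and start_job_run.

```python
import uuid

# Fields that GetJobRun returns and StartJobRun accepts under the same name.
# Which of them you actually want to carry over is an assumption of this sketch.
_COPYABLE_FIELDS = (
    "jobDriver",
    "configurationOverrides",
    "tags",
    "executionTimeoutMinutes",
    "name",
)


def build_start_args(job_run):
    """Map the 'jobRun' dict from get_job_run onto start_job_run keyword arguments."""
    args = {
        "applicationId": job_run["applicationId"],
        # get_job_run reports the role as 'executionRole',
        # while start_job_run expects 'executionRoleArn'.
        "executionRoleArn": job_run["executionRole"],
        # Fresh idempotency token so the request is treated as a new run.
        "clientToken": str(uuid.uuid4()),
    }
    for field in _COPYABLE_FIELDS:
        if job_run.get(field) is not None:
            args[field] = job_run[field]
    return args


def rerun_job_run(client, application_id, job_run_id):
    """Start a new job run that reuses the configuration of an earlier one."""
    previous = client.get_job_run(
        applicationId=application_id, jobRunId=job_run_id
    )["jobRun"]
    response = client.start_job_run(**build_start_args(previous))
    return response["jobRunId"]
```

With a real client this would be called as rerun_job_run(boto3.client("emr-serverless"), "<application-id>", "<job-run-id>"), where both IDs are placeholders for your own values. Read-only fields of the old run, such as its state or timestamps, are deliberately not forwarded, since start_job_run does not accept them.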
