How to run a Spark job on EMR via CloudFormation

I am just getting started with AWS and have been playing around with EMR and CloudFormation. My goal is to write a CloudFormation template that will:

1. Create an EMR cluster with Spark and Hadoop installed
2. Run Spark jobs on the EMR cluster. Jobs will be submitted as JAR or PySpark files.

I have been able to complete Step 1, but I am not sure how Step 2 is supposed to be done via CloudFormation.

I have looked at a couple of examples in the AWS documentation and on other sites, but I could not find one where a Spark job is deployed via a CloudFormation template.

Any examples or pointers in the right direction would be very helpful. Thanks in advance!

Change your EMR CloudFormation template as follows. First, add these entries to the Parameters section:

StepScriptFilePath:
  Type: String
  Description: S3 path of the bash script to run as a step
  Default: 's3://s3-bucket/steps/step1.sh'
StepScriptFilePython:
  Type: String
  Description: S3 path of the Python file to run with spark-submit
  Default: 's3://s3-bucket/steps/step2.py'
StepJar:
  Type: String
  Description: JAR used to launch the script step (script-runner)
  # Outside us-east-1 this path may need a region prefix,
  # e.g. s3://<region>.elasticmapreduce/libs/script-runner/script-runner.jar
  Default: 's3://elasticmapreduce/libs/script-runner/script-runner.jar'

Then add this under the Properties of the EMR cluster resource:

  Steps:
    # Step 1: script-runner.jar downloads the script from S3 and runs it
    - ActionOnFailure: CONTINUE
      HadoopJarStep:
        Args:
          - Ref: StepScriptFilePath
        Jar:
          Ref: StepJar
        MainClass: ''
      Name: Run a bash script
    # Step 2: command-runner.jar runs spark-submit on the cluster
    - ActionOnFailure: CONTINUE
      HadoopJarStep:
        Args:
          - "spark-submit"
          - Ref: StepScriptFilePython
        Jar: command-runner.jar
      Name: Run a Python script job
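
The question also mentions submitting a job as a JAR. The same command-runner.jar pattern should cover that case. The fragment below is a minimal sketch rather than a tested template: it declares the step as a separate AWS::EMR::Step resource instead of inline under the cluster's Steps property. The parameter SparkAppJar, its S3 path, the main class com.example.Main, and the cluster logical ID EMRCluster are all hypothetical placeholders to replace with your own values.

```yaml
Parameters:
  SparkAppJar:                     # hypothetical parameter name
    Type: String
    Description: S3 path of the Spark application JAR
    Default: 's3://s3-bucket/jars/my-spark-app.jar'   # placeholder path

Resources:
  SparkJarStep:
    Type: AWS::EMR::Step
    Properties:
      Name: Run a Spark JAR job
      ActionOnFailure: CONTINUE
      JobFlowId:
        Ref: EMRCluster            # assumed logical ID of the AWS::EMR::Cluster resource
      HadoopJarStep:
        Jar: command-runner.jar
        Args:
          - "spark-submit"
          - "--deploy-mode"
          - "cluster"
          - "--class"
          - "com.example.Main"     # placeholder main class
          - Ref: SparkAppJar
```

Keeping the step in its own resource lets you add steps to the stack without editing the cluster definition. The inline Steps list shown above works just as well if you prefer a single resource.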
