
Run Spark on AWS EMR by passing credentials

I am new to EMR and tried to launch a Spark job as a step using a command like: command-runner.jar spark-submit --deploy-mode cluster --class com.xx.xx.className s3n://mybuckets/spark-jobs.jar
However, the Spark job needs credentials as environment variables. What is the best way to pass the credentials as environment variables to the Spark job?

Have a look at "AWS EMR 4.0 - How can I add a custom JAR step to run shell commands" and at http://docs.aws.amazon.com/emr/latest/ReleaseGuide/emr-hadoop-script.html. Try running the step with arguments like this: /usr/bin/spark-submit --deploy-mode cluster --class
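For the environment-variable part of the question, one option (not mentioned in the answer above, so treat it as an assumption about what fits your setup) is Spark's own configuration: on YARN, spark.yarn.appMasterEnv.NAME sets a variable for the driver in cluster mode and spark.executorEnv.NAME sets it for executors. Here is a minimal Python sketch that builds the step argument list. The helper name build_spark_submit_args and the variable MY_SERVICE_KEY are hypothetical, and note that values passed this way will show up in the step definition.

```python
def build_spark_submit_args(jar, main_class, env):
    """Build the argument list for a command-runner.jar step (hypothetical helper)."""
    args = ["spark-submit", "--deploy-mode", "cluster"]
    for name, value in sorted(env.items()):
        # In YARN cluster mode the driver runs inside the application master,
        # so appMasterEnv covers the driver and executorEnv covers the executors.
        args += ["--conf", f"spark.yarn.appMasterEnv.{name}={value}"]
        args += ["--conf", f"spark.executorEnv.{name}={value}"]
    args += ["--class", main_class, jar]
    return args


args = build_spark_submit_args(
    jar="s3n://mybuckets/spark-jobs.jar",
    main_class="com.xx.xx.className",
    env={"MY_SERVICE_KEY": "example-key"},  # hypothetical variable name and value
)
print(" ".join(args))
```

The printed string is what you would paste as the step arguments after command-runner.jar; the job can then read the variable from its environment as usual.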

I came across your question while searching for a solution myself. For now, as a temporary workaround, I am passing the credentials as command-line parameters. In the future I plan to add a custom bootstrap script that fetches the data from a service and creates the ~/.aws/credentials and config files. I hope this helps; if you have found another option, please post it here.
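The bootstrap idea could look roughly like the Python sketch below. It only covers writing the credentials file; how the values are fetched from a service is left out, and the function name write_aws_credentials and the example values are hypothetical. The demo writes to a temporary directory, whereas a real bootstrap script would presumably target ~/.aws/credentials.

```python
import configparser
import os
import tempfile


def write_aws_credentials(path, access_key_id, secret_access_key, profile="default"):
    """Write an AWS-style credentials file readable only by the owner (hypothetical helper)."""
    parser = configparser.ConfigParser(interpolation=None)
    parser[profile] = {
        "aws_access_key_id": access_key_id,
        "aws_secret_access_key": secret_access_key,
    }
    os.makedirs(os.path.dirname(path), exist_ok=True)
    # Create the file with mode 0600 so other users on the node cannot read it.
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    with os.fdopen(fd, "w") as handle:
        parser.write(handle)


# Demo with placeholder values, written to a temporary directory.
demo_path = os.path.join(tempfile.mkdtemp(), ".aws", "credentials")
write_aws_credentials(demo_path, "EXAMPLEKEYID", "example-secret")
print(open(demo_path).read())
```

Compared with command-line parameters, this keeps the secret out of the step arguments, at the cost of having to run the script on every node that needs the file.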

