
Run Spark on AWS EMR by passing credentials

I am new to EMR and tried to launch a Spark job as a step using something like: command-runner.jar spark-submit --deploy-mode cluster --class com.xx.xx.className s3n://mybuckets/spark-jobs.jar
However, the Spark job needs credentials as environment variables. My question is: what is the best way to pass those credentials as environment variables to the Spark job?
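For reference, the full step I am adding looks roughly like this (the cluster ID is a placeholder, and the class and bucket names are the dummy ones from above):

    # Add the Spark job as an EMR step via command-runner.jar
    aws emr add-steps --cluster-id j-XXXXXXXXXXXXX --steps \
      'Type=CUSTOM_JAR,Name=SparkJob,Jar=command-runner.jar,ActionOnFailure=CONTINUE,Args=[spark-submit,--deploy-mode,cluster,--class,com.xx.xx.className,s3n://mybuckets/spark-jobs.jar]'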
Thanks!

Have a look here: AWS EMR 4.0 - How can I add a custom JAR step to run shell commands, and here: http://docs.aws.amazon.com/emr/latest/ReleaseGuide/emr-hadoop-script.html. Try running the step with arguments like this: /usr/bin/spark-submit --deploy-mode cluster --class
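To get the credentials into the job as environment variables, one option is Spark's YARN environment properties at submit time. This is only a sketch (the variable values are placeholders, and putting secrets on the command line will expose them in the step logs):

    # In cluster mode the driver runs in the YARN application master,
    # so appMasterEnv covers the driver and executorEnv covers the executors.
    /usr/bin/spark-submit --deploy-mode cluster \
      --conf spark.yarn.appMasterEnv.AWS_ACCESS_KEY_ID=YOUR_KEY \
      --conf spark.yarn.appMasterEnv.AWS_SECRET_ACCESS_KEY=YOUR_SECRET \
      --conf spark.executorEnv.AWS_ACCESS_KEY_ID=YOUR_KEY \
      --conf spark.executorEnv.AWS_SECRET_ACCESS_KEY=YOUR_SECRET \
      --class com.xx.xx.className s3n://mybuckets/spark-jobs.jar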

I came to your question while googling for a solution myself. Right now, as a temporary workaround, I am passing the credentials as command-line parameters. In the future I am thinking of adding a custom bootstrap script that fetches the data from a service and creates the ~/.aws/credentials and config files. I hope this helps; if you have discovered any other option, do post it here.
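For the bootstrap idea, a minimal sketch of what I have in mind is below. It is untested; the secrets-service URL and the region are placeholders, and bootstrap actions run as the hadoop user, which is why it writes under /home/hadoop:

    #!/bin/bash
    # Hypothetical EMR bootstrap action: fetch credentials from an
    # internal secrets service and write ~/.aws/credentials and config
    # on every node before the cluster starts running steps.
    set -euo pipefail

    mkdir -p /home/hadoop/.aws

    # Placeholder endpoints -- replace with your own secrets service.
    cat > /home/hadoop/.aws/credentials <<EOF
    [default]
    aws_access_key_id = $(curl -sf https://secrets.example.com/access-key)
    aws_secret_access_key = $(curl -sf https://secrets.example.com/secret-key)
    EOF

    cat > /home/hadoop/.aws/config <<EOF
    [default]
    region = us-east-1
    EOF

    # Keep the credentials file readable only by the hadoop user.
    chmod 600 /home/hadoop/.aws/credentials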
