
How is running a script using aws emr script-runner different from running it from bash?

I have used script-runner on AWS EMR. This may look like a very basic (maybe stupid) question, but I have read many documents and no one answers why we need a script runner in EMR when all it does is execute a script on the master node. Can't the same script be run using bash?

The script runner is needed when you simply want to execute a script but the entry point expects a jar. For example, submitting an EMR Step executes a "hadoop jar blah ..." command, and if "blah" is a script, this fails. Script runner becomes the jar that the Step expects and then uses its argument (the path to the script) to execute the shell script.
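As a rough illustration of "script runner is the jar, the script is its argument", here is a minimal sketch that only builds a step definition in the dictionary shape that boto3's EMR `add_job_flow_steps` call accepts; it does not contact AWS. The helper name `build_script_runner_step`, the bucket `my-bucket`, and the script path are hypothetical, and the jar location assumes the regional `s3://<region>.elasticmapreduce/libs/script-runner/script-runner.jar` path.

```python
import json


def build_script_runner_step(region, script_s3_uri, script_args=()):
    """Build a step dict in which script-runner.jar is the jar the Step
    expects and the shell script is passed as its first argument.

    Hypothetical helper for illustration; it makes no AWS calls.
    """
    return {
        "Name": "Run shell script",
        "ActionOnFailure": "CONTINUE",
        "HadoopJarStep": {
            # Assumed regional location of the script-runner jar.
            "Jar": f"s3://{region}.elasticmapreduce/libs/script-runner/script-runner.jar",
            # First argument is the script path; the rest go to the script.
            "Args": [script_s3_uri, *script_args],
        },
    }


# Hypothetical bucket and script name.
step = build_script_runner_step(
    "us-east-1", "s3://my-bucket/scripts/setup.sh", ["--verbose"]
)
print(json.dumps(step, indent=2))
```

A dictionary like this could then be passed in the `Steps` list of an EMR client call; the cluster would effectively run "hadoop jar script-runner.jar s3://my-bucket/scripts/setup.sh --verbose".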

When you run your script in bash, you need to have the script locally, and you also need to set up all the configuration yourself for it to work as you expect.

With script-runner you have more options: for example, you can run it as part of your cluster launch command, and you can execute a script that is hosted remotely in S3. See the example in the EMR documentation: http://docs.aws.amazon.com/ElasticMapReduce/latest/DeveloperGuide/emr-hadoop-script.html
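For contrast, here is a minimal sketch of what the plain-bash route would involve for a script hosted in S3: copy it to the node first, then run it. The code only assembles the commands and prints them; nothing is executed. The helper name `manual_bash_commands`, the bucket, and the `/tmp` working directory are assumptions for illustration, and the sketch assumes the AWS CLI (`aws s3 cp`) is available on the node.

```python
import posixpath
import shlex


def manual_bash_commands(script_s3_uri, script_args=(), workdir="/tmp"):
    """Return the commands you would run by hand on the master node:
    download the script from S3, then execute it with bash.

    Hypothetical helper for illustration; it does not run anything.
    """
    local_path = posixpath.join(workdir, posixpath.basename(script_s3_uri))
    return [
        ["aws", "s3", "cp", script_s3_uri, local_path],  # fetch the script locally
        ["bash", local_path, *script_args],              # then run it
    ]


# Hypothetical bucket and script name.
commands = manual_bash_commands("s3://my-bucket/scripts/setup.sh", ["--verbose"])
for cmd in commands:
    print(shlex.join(cmd))
```

With script-runner, the download-and-execute sequence is handled by the step itself, so there is no need to log in to the master node to do it.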
