How to specify a Python script with an option parser as the mapper in Hadoop Streaming
How do I specify a Python script that uses an option parser (and accepts multiple arguments) as the mapper in Hadoop Streaming?
For example:
$HADOOP_HOME/bin/hadoop jar $HADOOP_HOME/hadoop-streaming.jar \
-input myInputDirs \
-output myOutputDir \
-mapper myPythonScript.py \
-reducer /bin/wc \
-file myPythonScript.py
This is the normal command for running a Python script with Hadoop Streaming. How should it be written if myPythonScript.py uses an option parser? For example, the script is normally invoked like this:
python myPythonScript.py -g --inputfile=Inputfilename --output=Ouputfilename -r
How do I specify this as the mapper?
$HADOOP_HOME/bin/hadoop jar $HADOOP_HOME/hadoop-streaming.jar \
-input myInputDirs \
-output myOutputDir \
-mapper "python myPythonScript.py -g --inputfile=Inputfilename --output=Ouputfilename -r" \
-reducer /bin/wc \
-file myPythonScript.py
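The command above passes the whole command line, in quotes, as the -mapper value, so the script sees its options in sys.argv as usual. As a minimal sketch of what such a mapper might look like, assuming the script is built on Python's optparse module: the option names are taken from the command above, but their meanings are not given in the post, so the behaviour here (emitting each record with a count of 1, and reversing the record when -r is set) is purely hypothetical, as are the helper names build_parser and run_mapper.

```python
import sys
from optparse import OptionParser


def build_parser():
    """Build a parser for the options used in the command above.

    The option names come from the post; their meanings are assumptions.
    """
    parser = OptionParser(usage="%prog [options] < records")
    parser.add_option("-g", action="store_true", dest="g", default=False,
                      help="hypothetical boolean flag")
    parser.add_option("-r", action="store_true", dest="r", default=False,
                      help="hypothetical flag: reverse each record")
    parser.add_option("--inputfile", dest="inputfile", default=None,
                      help="name of a side file (not the job input)")
    parser.add_option("--output", dest="output", default=None,
                      help="hypothetical output name")
    return parser


def run_mapper(argv, lines, out):
    """Parse argv, then map each input line to 'record<TAB>1'."""
    options, _args = build_parser().parse_args(argv)
    for line in lines:
        record = line.rstrip("\n")
        if not record:
            continue  # skip blank lines
        if options.r:
            record = record[::-1]  # illustrative use of -r only
        out.write(record + "\t1\n")
    return options


if __name__ == "__main__":
    # Hadoop Streaming feeds input records on stdin and collects stdout.
    run_mapper(sys.argv[1:], sys.stdin, sys.stdout)
```

Because Hadoop Streaming delivers the job input on stdin and reads the mapper's results from stdout, a value such as --inputfile would have to name a side file that is available on the task nodes (for example, one shipped with an additional -file option), not the data under -input.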