I am unable to execute a MapReduce job in a Hadoop cluster

Question

$ hadoop jar /usr/lib/hadoop/hadoop-streaming-2.6.0-cdh5.13.0.jar -file mapper.py -mapper mapper.py -file reducer.py -reducer reducer.py -input /user/cloudera/test.txt -output /user/cloudera/result

I am using this command to run a MapReduce program with mapper.py as the mapper and reducer.py as the reducer.

It throws an error: Not a valid JAR: /usr/lib/hadoop/hadoop-streaming-2.6.0-cdh5.13.0.jar

I am using MobaXterm and VMBox. My home directory is /user/cloudera, the mapper program is at /user/cloudera/mapper.py, and the reducer is at /user/cloudera/reducer.py.

Answer 1

If you are using the Cloudera distribution for practice, the JAR is not available at the location you mentioned, /usr/lib/hadoop. The hadoop-streaming JAR is in /usr/lib/hadoop-mapreduce/.

Run the command with the updated location of the JAR and it should work fine.
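
As a rough sketch of that fix, the snippet below looks for a hadoop-streaming JAR in /usr/lib/hadoop-mapreduce/ and reruns the command from the question against it. The exact JAR file name in that directory is an assumption (the answer only names the directory), so the script globs for it instead of hard-coding it. STREAMING_DIR and find_streaming_jar are names made up for this illustration.

```shell
# Directory named in the answer. The JAR file name inside it is NOT given
# in the answer, so we search for it rather than assume it.
STREAMING_DIR="${STREAMING_DIR:-/usr/lib/hadoop-mapreduce}"

# Hypothetical helper: print the first hadoop-streaming JAR found in the
# given directory, or nothing if there is none.
find_streaming_jar() {
  ls "$1"/hadoop-streaming*.jar 2>/dev/null | head -n 1
}

STREAMING_JAR="$(find_streaming_jar "$STREAMING_DIR")"

if [ -z "$STREAMING_JAR" ]; then
  echo "No hadoop-streaming JAR found in $STREAMING_DIR" >&2
else
  echo "Using $STREAMING_JAR"
  # Same arguments as in the question; only the JAR path changes.
  hadoop jar "$STREAMING_JAR" \
    -file mapper.py -mapper mapper.py \
    -file reducer.py -reducer reducer.py \
    -input /user/cloudera/test.txt \
    -output /user/cloudera/result
fi
```

If the script reports that no JAR was found, listing the directory by hand (ls /usr/lib/hadoop-mapreduce/) should show whether the streaming JAR is installed under a different name.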

Answer 1: accepted, 2019-11-21 13:50:56
