
hadoop jar command points to local filesystem

I have a valid jar that runs perfectly on another system with the same version of Hadoop (hadoop-1.2.1) and the same settings.

I am able to put the jar file into HDFS and create the input and output directories.

But when I run the command 'hadoop jar HelloWorld.jar classname(main method) input output', it throws an 'Invalid jar' error. After searching for possible solutions for a long time, I found out that the command looks for the jar in the local filesystem instead of in HDFS.

I even tried adding the scheme to the command: hadoop jar hdfs://HelloWorld.jar classname(main method) input output

What are the possible solutions to this?

PS: I am able to run hadoop-examples-1.2.1.jar using 'hadoop jar' when my PWD is /home/user/hadoop-1.2.1, which is in my local filesystem.

hadoop jar only runs jar files that you can access locally [1]. Just for the sake of curiosity, here is the relevant source that looks for the jar in the hadoop jar command.

public static void main(String[] args) throws Throwable {
  String usage = "RunJar jarFile [mainClass] args...";

  if (args.length < 1) {
    System.err.println(usage);
    System.exit(-1);
  }

  int firstArg = 0;
  String fileName = args[firstArg++];
  File file = new File(fileName);
  if (!file.exists() || !file.isFile()) {
    System.err.println("Not a valid JAR: " + file.getCanonicalPath());
    System.exit(-1);
  }
  ...
}

[1] This is true for every Hadoop version I've come across. Your results may vary.

This code in my $HADOOP_HOME/bin/hadoop script

elif [ "$COMMAND" = "jar" ] ; then
  CLASS=org.apache.hadoop.util.RunJar

shows that the jar command maps to the RunJar class.

And RunJar contains this:

/** Run a Hadoop job jar.  If the main class is not in the jar's manifest,
   * then it must be provided on the command line. */
  public static void main(String[] args) throws Throwable {
    String usage = "RunJar jarFile [mainClass] args...";

    if (args.length < 1) {
      System.err.println(usage);
      System.exit(-1);
    }

    int firstArg = 0;
    String fileName = args[firstArg++];
    File file = new File(fileName);
    String mainClassName = null;

    JarFile jarFile;
    try {
      jarFile = new JarFile(fileName);
    } catch(IOException io) {
      throw new IOException("Error opening job jar: " + fileName)
        .initCause(io);
    }

    // ... other code ...
}

So I'm not sure whether File file = new File(fileName); can actually point to an HDFS path.

Maybe the MapR distribution of Hadoop could do that.
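On the question of whether new File(fileName) can point to an HDFS path: java.io.File only addresses the local filesystem and does not interpret URI schemes, so a string such as hdfs://HelloWorld.jar is treated as an ordinary relative local path. The sketch below is not Hadoop code. It only mirrors the exists()/isFile() check from the RunJar excerpt quoted earlier, and the class and method names (LocalJarCheck, isValidLocalJar) are made up for illustration.

```java
import java.io.File;
import java.io.IOException;

/** Mirrors the local-file check from the quoted RunJar source (illustrative only). */
final class LocalJarCheck {

  /** Returns true only if the path names an existing regular file on the local disk. */
  static boolean isValidLocalJar(String fileName) {
    File file = new File(fileName);
    return file.exists() && file.isFile();
  }

  public static void main(String[] args) throws IOException {
    // java.io.File knows nothing about the hdfs:// scheme, so this string is
    // looked up as a relative path on the local disk.
    System.out.println(isValidLocalJar("hdfs://HelloWorld.jar"));

    // A file that really exists locally passes the same check.
    File local = File.createTempFile("HelloWorld", ".jar");
    local.deleteOnExit();
    System.out.println(isValidLocalJar(local.getPath()));
  }
}
```

Under that assumption, the practical route is to keep (or copy) the jar on the local disk of the machine where hadoop jar is run, and to use HDFS only for the input and output directories.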

It is probably too late to reply to this discussion, but I did not see an accepted answer, so I thought I would reply. Today I faced the same problem, and after a couple of hours of effort I was able to resolve it. I found two reasons for the "Not a valid Jar" problem.

  1. When the jar is referenced from HDFS, the command gives this error. I changed the reference to the jar file in the local file system, and it worked properly. As I understand it, the jar file does not need to be put in HDFS at all. 'hadoop jar HelloWorld.jar (referenced from your local file system) classname(main method) input output'

  2. If you defined Main-Class when creating the jar file, you don't need to give the classname in the command.

'hadoop jar HelloWorld.jar classname(main method; not required if you already defined Main-Class when creating the jar file) input output'

The command then becomes: 'hadoop jar HelloWorld.jar input output'
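To decide which of the two command forms applies, it can help to check whether the jar's manifest actually declares a Main-Class. The following is a minimal sketch using only the standard java.util.jar API. The class name MainClassCheck, the helper writeDemoJar, and the main class name "HelloWorld" are assumptions made for illustration; the demo builds a throwaway jar so that it runs without a real job jar.

```java
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.util.jar.Attributes;
import java.util.jar.JarFile;
import java.util.jar.JarOutputStream;
import java.util.jar.Manifest;

final class MainClassCheck {

  /** Returns the Main-Class manifest attribute of a local jar, or null if absent. */
  static String mainClassOf(File jar) throws IOException {
    try (JarFile jarFile = new JarFile(jar)) {
      Manifest manifest = jarFile.getManifest();
      if (manifest == null) {
        return null;
      }
      return manifest.getMainAttributes().getValue(Attributes.Name.MAIN_CLASS);
    }
  }

  /** Writes a temporary jar holding only a manifest, optionally naming a main class. */
  static File writeDemoJar(String mainClass) throws IOException {
    Manifest manifest = new Manifest();
    // Manifest-Version must be set, otherwise the manifest is written out empty.
    manifest.getMainAttributes().put(Attributes.Name.MANIFEST_VERSION, "1.0");
    if (mainClass != null) {
      manifest.getMainAttributes().put(Attributes.Name.MAIN_CLASS, mainClass);
    }
    File jar = File.createTempFile("HelloWorld", ".jar");
    jar.deleteOnExit();
    try (JarOutputStream out = new JarOutputStream(new FileOutputStream(jar), manifest)) {
      // No further entries; the constructor already wrote the manifest entry.
    }
    return jar;
  }

  public static void main(String[] args) throws IOException {
    // Hypothetical main class name; a real job jar would carry its own.
    System.out.println(mainClassOf(writeDemoJar("HelloWorld")));
    System.out.println(mainClassOf(writeDemoJar(null)));
  }
}
```

If mainClassOf returns a class name for your jar, the shorter form 'hadoop jar HelloWorld.jar input output' should be the one to use. As far as I can tell from RunJar, a classname passed anyway is handed to the program as its first argument, which shifts input and output by one position.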
