
hadoop jar command points to local filesystem

I have a valid jar that runs perfectly on another system with the same version of Hadoop (hadoop-1.2.1) and the same settings.

I am able to put the jar file into HDFS and create the input and output directories.

But when I use the command 'hadoop jar HelloWorld.jar classname(main method) input output', it throws an 'Invalid jar' error. After searching for possible solutions for a long time, I found out that the command looks for the jar in the local filesystem instead of in HDFS.

I even tried adding a scheme to the command: 'hadoop jar hdfs://HelloWorld.jar classname(main method) input output'

What are the possible solutions to this?

PS: I am able to run hadoop-examples-1.2.1.jar using 'hadoop jar' when my PWD is /home/user/hadoop-1.2.1, which is on my local filesystem.

hadoop jar only runs jar files that are accessible on the local filesystem.[1] Just for the sake of curiosity, here is the relevant source that looks for the jar in the hadoop jar command:

public static void main(String[] args) throws Throwable {
  String usage = "RunJar jarFile [mainClass] args...";

  if (args.length < 1) {
    System.err.println(usage);
    System.exit(-1);
  }

  int firstArg = 0;
  String fileName = args[firstArg++];
  File file = new File(fileName);
  if (!file.exists() || !file.isFile()) {
    System.err.println("Not a valid JAR: " + file.getCanonicalPath());
    System.exit(-1);
  }
  // ... rest of RunJar.main ...
}

[1] This is true for every Hadoop version I've come across. Your results may vary.

This code in my $HADOOP_HOME/bin/hadoop script:

elif [ "$COMMAND" = "jar" ] ; then
  CLASS=org.apache.hadoop.util.RunJar

shows that the jar subcommand dispatches to the RunJar class.

And in RunJar you have this:

/** Run a Hadoop job jar.  If the main class is not in the jar's manifest,
   * then it must be provided on the command line. */
  public static void main(String[] args) throws Throwable {
    String usage = "RunJar jarFile [mainClass] args...";

    if (args.length < 1) {
      System.err.println(usage);
      System.exit(-1);
    }

    int firstArg = 0;
    String fileName = args[firstArg++];
    File file = new File(fileName);
    String mainClassName = null;

    JarFile jarFile;
    try {
      jarFile = new JarFile(fileName);
    } catch(IOException io) {
      throw new IOException("Error opening job jar: " + fileName)
        .initCause(io);
    }

    // ... other code ...
}

So I doubt that File file = new File(fileName); can actually point to an HDFS path; java.io.File resolves names against the local filesystem only.

Maybe the MapR distribution of Hadoop could do that.
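A quick way to see why adding the hdfs:// scheme doesn't help is the minimal sketch below (the namenode host, port, and jar path are made up): java.io.File treats the entire URI string as a local path and never contacts HDFS, so the exists() check in RunJar fails.

```java
import java.io.File;

public class HdfsPathCheck {
    public static void main(String[] args) {
        // Hypothetical hdfs:// URI. java.io.File interprets the whole
        // string as a path on the LOCAL filesystem; it has no notion
        // of URI schemes and never talks to the namenode.
        File jar = new File("hdfs://namenode:9000/user/me/HelloWorld.jar");
        System.out.println(jar.exists());  // prints false
    }
}
```

This is exactly the check that produces the "Not a valid JAR" message in RunJar.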

It is probably too late to reply to this discussion, but since I did not see any accepted answer, I thought of replying. Today I faced the same problem, and after a couple of hours of effort I was able to resolve it. I found two reasons for the "Not a valid JAR" problem.

  1. When we refer to the jar from HDFS, it gives this error. I changed the reference to a jar file on the local filesystem and it worked properly. What I understood is that it is not required to put the jar file in HDFS: 'hadoop jar HelloWorld.jar (refer to it from your local filesystem) classname(main method) input output'

  2. When you create the jar file and define Main-Class in its manifest, you don't need to specify the class name in the command.

'hadoop jar HelloWorld.jar classname(main method - not required if you already defined Main-Class when creating the jar) input output'

So the command becomes: 'hadoop jar HelloWorld.jar input output'
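The manifest lookup that point 2 relies on can be sketched as follows. This mimics what RunJar does when no class name is given on the command line (it reads Main-Class from the jar's manifest) rather than quoting Hadoop's exact code; HelloWorld is a hypothetical class name.

```java
import java.io.File;
import java.io.FileOutputStream;
import java.util.jar.Attributes;
import java.util.jar.JarFile;
import java.util.jar.JarOutputStream;
import java.util.jar.Manifest;

public class ManifestDemo {
    public static void main(String[] args) throws Exception {
        // Build a tiny jar whose manifest declares a Main-Class,
        // as 'jar cfe HelloWorld.jar HelloWorld ...' would.
        Manifest mf = new Manifest();
        mf.getMainAttributes().put(Attributes.Name.MANIFEST_VERSION, "1.0");
        mf.getMainAttributes().put(Attributes.Name.MAIN_CLASS, "HelloWorld");
        File f = File.createTempFile("demo", ".jar");
        try (JarOutputStream out = new JarOutputStream(new FileOutputStream(f), mf)) {
            // no entries needed for this demo
        }

        // Read the Main-Class back, the way RunJar does when the
        // class name is omitted from the command line.
        try (JarFile jar = new JarFile(f)) {
            String mainClass = jar.getManifest().getMainAttributes()
                                  .getValue(Attributes.Name.MAIN_CLASS);
            System.out.println(mainClass);  // prints HelloWorld
        }
        f.delete();
    }
}
```

If the manifest has no Main-Class entry, that lookup returns null, which is why RunJar then requires the class name as an argument.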

