
Why is a jar file needed when executing MapReduce code in Hadoop, but not when executing other non-MapReduce Java code in Hadoop?

I would like to know why a jar file, and not a .class file, is needed to execute MapReduce code in Hadoop. And if a jar file is required there, why isn't the same required when executing any other non-MapReduce Java code in Hadoop? Also, when executing non-MapReduce Java code in Hadoop, why is the compiled class named directly after the hadoop keyword on the command line? For example, suppose I have a program that displays a file in Hadoop given a URI, i.e. the class FileSystemCat:

    import java.io.InputStream;
    import java.net.URI;

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.io.IOUtils;

    // Prints the contents of the file at the given URI to standard output.
    public class FileSystemCat {
        public static void main(String[] args) throws Exception {
            String uri = args[0];
            Configuration conf = new Configuration();
            FileSystem fs = FileSystem.get(URI.create(uri), conf);
            InputStream in = null;
            try {
                in = fs.open(new Path(uri));
                IOUtils.copyBytes(in, System.out, 4096, false);
            } finally {
                IOUtils.closeStream(in);
            }
        }
    }

The command to execute the program after compiling it is "hadoop FileSystemCat", not "hadoop java FileSystemCat". In an ordinary environment the steps to execute the program would have been:

    javac FileSystemCat.java
    java FileSystemCat
 hadoop jar <jar> [mainClass] args... 

Runs a jar file. Users can bundle their MapReduce code in a jar file and execute it using this command.
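As a sketch, bundling and submitting usually looks like the session below. The WordCount class and all file names are purely illustrative, and the commented commands assume a working Hadoop installation on the PATH:

```shell
# Illustrative only: compile, bundle, and submit a MapReduce driver class.
#
#   javac -cp "$(hadoop classpath)" WordCount.java   # compile against Hadoop's jars
#   jar cf wordcount.jar WordCount*.class            # bundle the classes into a jar
#   hadoop jar wordcount.jar WordCount input output  # submit via the hadoop wrapper

# The submit command follows the shape "hadoop jar <jar> [mainClass] args...":
submit="hadoop jar wordcount.jar WordCount input output"
echo "$submit"
```

The jar is needed here because the job's classes must be shipped as a single artifact to the nodes that run the map and reduce tasks; a lone .class file on the client machine is not enough.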

 hadoop CLASSNAME 

The hadoop script can be used to invoke any class.

Both of the commands above do two things:

1) Add all the jars in the Hadoop installation's lib directory to the classpath of the running jar or class.

2) Add the Hadoop installation's configuration directory to the classpath.

Thus the running jar or class sees every class on the Hadoop installation's classpath as well as all of the installation's configuration files.

If instead you run a jar or class with a plain "java CLASSNAME" command, you have to add those two components to the Java classpath yourself.
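As a sketch of what that means in practice — the install location and the exact jar directories below are assumptions; real layouts vary by Hadoop version:

```shell
# Assumption: HADOOP_HOME points at a Hadoop installation; adjust as needed.
HADOOP_HOME="${HADOOP_HOME:-/usr/local/hadoop}"

# 2) start the classpath with the configuration directory
#    (core-site.xml, hdfs-site.xml, ...)
CP="$HADOOP_HOME/etc/hadoop"

# 1) then append every jar shipped with the installation
for j in "$HADOOP_HOME"/share/hadoop/common/*.jar \
         "$HADOOP_HOME"/share/hadoop/common/lib/*.jar \
         "$HADOOP_HOME"/share/hadoop/hdfs/*.jar; do
  CP="$CP:$j"
done

# Roughly what "hadoop FileSystemCat <uri>" runs on your behalf:
echo java -cp ".:$CP" FileSystemCat "hdfs://localhost/user/sample.txt"
```

On a real installation you can skip building this by hand: `hadoop classpath` prints the classpath the wrapper would use, so `java -cp "$(hadoop classpath):." FileSystemCat <uri>` is the manual equivalent of `hadoop FileSystemCat <uri>`.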
