
Hadoop MR job on Windows: "Cannot initialize Cluster" from the command line, but it starts from IntelliJ IDEA

Windows 10, Hadoop 3.0.0 from the winutils project. The MapReduce job works fine from the IDE (IntelliJ IDEA), but fails when run as a fat jar from the Windows command line:

java -jar target/app1-1.0-SNAPSHOT-jar-with-dependencies.jar "E://folderin" "E://folderout" -Xmx8g

It returns this error:

Exception in thread "main" java.io.IOException: Cannot initialize Cluster. Please check your configuration for mapreduce.framework.name and the correspond server addresses.
    at org.apache.hadoop.mapreduce.Cluster.initialize(Cluster.java:116)

The environment variable HADOOP_HOME is set to c:\Hadoop. The pom file:

    <dependencies>
        <dependency>
            <groupId>org.apache.hadoop</groupId>
            <artifactId>hadoop-client</artifactId>
            <version>${hadoop.version}</version>
        </dependency>
    </dependencies>
    <build>
        <plugins>
            <plugin>
                <groupId>org.apache.maven.plugins</groupId>
                <artifactId>maven-assembly-plugin</artifactId>
                <version>3.1.1</version>
                <configuration>
                    <descriptorRefs>
                        <descriptorRef>jar-with-dependencies</descriptorRef>
                    </descriptorRefs>
                    <archive>
                        <manifest>
                            <mainClass>com.sample.app1.Starter</mainClass>
                            <addClasspath>true</addClasspath>
                        </manifest>
                    </archive>
                </configuration>

                <executions>
                    <execution>
                        <id>make-assembly</id>
                        <phase>install</phase>
                        <goals>
                            <goal>single</goal>
                        </goals>
                    </execution>
                </executions>

            </plugin>
        </plugins>
    </build>

</project>

IntelliJ IDEA puts the jars from the .m2 folder on the classpath, which is why the job worked when started from IDEA. The fat jar built from the pom above did not contain everything the job needs at runtime. The issue was resolved by adding the Hadoop MapReduce client libraries to pom.xml:

    <dependency>
        <groupId>org.apache.hadoop</groupId>
        <artifactId>hadoop-client</artifactId>
        <version>${hadoop.version}</version>
    </dependency>
    <dependency>
        <groupId>org.apache.hadoop</groupId>
        <artifactId>hadoop-mapreduce-client-common</artifactId>
        <version>${hadoop.version}</version>
    </dependency>
    <dependency>
        <groupId>org.apache.hadoop</groupId>
        <artifactId>hadoop-mapreduce-client-app</artifactId>
        <version>${hadoop.version}</version>
    </dependency>
    <dependency>
        <groupId>org.apache.hadoop</groupId>
        <artifactId>hadoop-mapreduce-client-jobclient</artifactId>
        <version>${hadoop.version}</version>
    </dependency>
    <dependency>
        <groupId>org.apache.hadoop</groupId>
        <artifactId>hadoop-mapreduce-client-shuffle</artifactId>
        <version>${hadoop.version}</version>
    </dependency>
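
To check whether a given classpath (the fat jar or the IDE run configuration) actually carries the pieces Hadoop looks for, a small diagnostic can help. As far as I know, `Cluster.initialize` discovers its backends through `java.util.ServiceLoader`, using registration files named `META-INF/services/org.apache.hadoop.mapreduce.protocol.ClientProtocolProvider`; if none is visible, it fails with "Cannot initialize Cluster". The sketch below uses only the JDK and lists every such registration file it can see. The class name `ProviderCheck` and its helper methods are hypothetical, introduced here only for illustration.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.net.URL;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical helper, not part of Hadoop: lists ServiceLoader registrations on the classpath.
public class ProviderCheck {
    // Assumption: this is the service file Hadoop's Cluster class relies on.
    static final String SERVICE_FILE =
        "META-INF/services/org.apache.hadoop.mapreduce.protocol.ClientProtocolProvider";

    // Returns every copy of the given resource visible to the class loader.
    static List<URL> findServiceFiles(ClassLoader loader, String resource) throws IOException {
        return Collections.list(loader.getResources(resource));
    }

    // Extracts provider class names, skipping blank lines and '#' comments.
    static List<String> parseProviderNames(List<String> lines) {
        List<String> names = new ArrayList<>();
        for (String line : lines) {
            int hash = line.indexOf('#');
            String name = (hash >= 0 ? line.substring(0, hash) : line).trim();
            if (!name.isEmpty()) {
                names.add(name);
            }
        }
        return names;
    }

    // Reads a registration file line by line.
    static List<String> readLines(URL url) throws IOException {
        List<String> lines = new ArrayList<>();
        try (BufferedReader reader = new BufferedReader(
                new InputStreamReader(url.openStream(), StandardCharsets.UTF_8))) {
            String line;
            while ((line = reader.readLine()) != null) {
                lines.add(line);
            }
        }
        return lines;
    }

    public static void main(String[] args) throws IOException {
        List<URL> files = findServiceFiles(ProviderCheck.class.getClassLoader(), SERVICE_FILE);
        if (files.isEmpty()) {
            System.out.println("No ClientProtocolProvider registration found on the classpath.");
            return;
        }
        for (URL file : files) {
            System.out.println(file);
            for (String name : parseProviderNames(readLines(file))) {
                System.out.println("  " + name);
            }
        }
    }
}
```

If this class is compiled into the project, running it once from IDEA and once with `java -cp target/app1-1.0-SNAPSHOT-jar-with-dependencies.jar ProviderCheck` should show whether the two classpaths differ. An empty result for the fat jar would be consistent with the error above.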
