
Issue with org.apache.hadoop.mapreduce imports in Apache Hadoop 2.2

I recently installed the new Hadoop 2.2. I had previously written a simple word-count MapReduce program that worked without trouble on CDH4. But now I have problems with all of the org.apache.hadoop.mapreduce imports. Can someone tell me exactly which JAR to add to the build path to fix these imports? The code is below, in case someone needs to point out changes I should make to ensure it runs on Hadoop 2.2.

import java.io.IOException;
import java.lang.InterruptedException;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class MapRWordCount {
    private final static IntWritable ONE = new IntWritable(1);
    private final static Pattern WORD = Pattern.compile("\\w+");

    public static class WordCountMapper 
            extends Mapper<LongWritable, Text, Text, IntWritable> {
        private final Text word = new Text();

        @Override
        public void map(LongWritable key, Text value, Context context) 
                throws IOException, InterruptedException {

            String valueString = value.toString();
            Matcher matcher = WORD.matcher(valueString);
            while (matcher.find()) {
                word.set(matcher.group().toLowerCase());
                context.write(word, ONE);
            }
        }
    }

    public static class WordCountReducer 
            extends Reducer<Text, IntWritable, Text, IntWritable> {
        private final IntWritable totalCount = new IntWritable();

        @Override
        public void reduce(Text key, Iterable<IntWritable> values, Context context) 
                throws IOException, InterruptedException {
            int sum = 0;
            for (IntWritable value : values) {
                sum += value.get();
            }
            totalCount.set(sum);
            context.write(key, totalCount);
        }
    }

    public static void main(String[] args) 
            throws IOException, ClassNotFoundException, InterruptedException {

        if (args.length != 2) {
            System.err.println("Usage: MapRWordCount <input_path> <output_path>");
            System.exit(-1);
        }

        Job job = new Job();
        job.setJarByClass(MapRWordCount.class);
        job.setJobName("MapReduce Word Count");

        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));

        job.setMapperClass(WordCountMapper.class);
        job.setCombinerClass(WordCountReducer.class);
        job.setReducerClass(WordCountReducer.class);

        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);

        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }

}
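As a side note, the mapper's tokenization (the `\\w+` pattern plus lower-casing) can be sanity-checked without any Hadoop JARs on the classpath. Here is a minimal plain-Java sketch of the same map-then-sum logic; the class name `WordCountLocal` is mine, not part of the original program:

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class WordCountLocal {
    // Same tokenization rule as the mapper above.
    private static final Pattern WORD = Pattern.compile("\\w+");

    // Counts lower-cased word occurrences, mirroring map + reduce in memory.
    static Map<String, Integer> count(String text) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        Matcher m = WORD.matcher(text);
        while (m.find()) {
            counts.merge(m.group().toLowerCase(), 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        System.out.println(count("Hello hello, Hadoop!")); // prints {hello=2, hadoop=1}
    }
}
```

If this runs as expected, any remaining problem is purely one of missing Hadoop JARs, not of the program logic.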

I found the JARs in the following locations:

$HADOOP_HOME/share/hadoop/common/hadoop-common-2.2.0.jar
$HADOOP_HOME/share/hadoop/mapreduce/hadoop-mapreduce-client-core-2.2.0.jar

In Maven, I had to add the following to the pom.xml and then do a clean build to be able to resolve the Mapper and Reducer classes in Java:

<dependency>
    <groupId>org.apache.hadoop</groupId>
    <artifactId>hadoop-common</artifactId>
    <version>2.2.0</version>
</dependency>
<dependency>
    <groupId>org.apache.hadoop</groupId>
    <artifactId>hadoop-mapreduce-client-core</artifactId>
    <version>2.2.0</version>
</dependency>
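For context, here is a minimal pom.xml sketch with those two dependencies in place; the project coordinates (`com.example:wordcount`) are placeholders, not from the original question:

```xml
<project xmlns="http://maven.apache.org/POM/4.0.0">
  <modelVersion>4.0.0</modelVersion>
  <!-- Placeholder coordinates for the example project. -->
  <groupId>com.example</groupId>
  <artifactId>wordcount</artifactId>
  <version>1.0</version>
  <packaging>jar</packaging>

  <dependencies>
    <dependency>
      <groupId>org.apache.hadoop</groupId>
      <artifactId>hadoop-common</artifactId>
      <version>2.2.0</version>
    </dependency>
    <dependency>
      <groupId>org.apache.hadoop</groupId>
      <artifactId>hadoop-mapreduce-client-core</artifactId>
      <version>2.2.0</version>
    </dependency>
  </dependencies>
</project>
```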

Now the following imports no longer produce errors:

import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;

If you're just looking for the location of the appropriate JARs in Hadoop 2.2, look under share/hadoop/{common,hdfs,mapreduce}. The files ending in -2.2.0.jar are likely what you need.

This layout should be the same as in CDH4, unless you installed the "MR1" version, which matches the Hadoop 1.x structure.

Use this link to find whatever JAR files you need.

Download them, then right-click on your project and go to Build Path > Configure Build Path > Add External JARs.

