Reducer in Hadoop MapReduce Java Implementation

I am writing a Java program for the Hadoop MapReduce framework, including a class called CombinePatternReduce. To debug the reducer in Eclipse, I wrote a main() function as follows:

@SuppressWarnings("unchecked")
public static void main(String[] args) throws IOException, InterruptedException{
    Text key = new Text("key2:::key1:::_ performs better than _");
    IntWritable count5 = new IntWritable(5);
    IntWritable count3 = new IntWritable(3);
    IntWritable count8 = new IntWritable(8);
    List<IntWritable> values = new ArrayList<IntWritable>();
    values.add(count5);
    values.add(count3);
    values.add(count8);
    CombinePatternReduce reducer = new CombinePatternReduce();
    Context dcontext = new DebugTools.DebugReducerContext<Text, IntWritable, KeyPairWritableComparable, WrapperDoubleOrPatternWithWeightWritable>(reducer, key, count3); // here is the problem
    reducer.reduce(key, values, dcontext);      
}

DebugTools.DebugReducerContext is a class that I wrote to make debugging easier. It looks like this:

public static class DebugReducerContext<KIN, VIN, KOUT, VOUT> extends Reducer<KIN, VIN, KOUT, VOUT>.Context {
    DebugTools dtools = new DebugTools();
    DataOutput out = dtools.new DebugDataOutputStream(System.out);

    public DebugReducerContext(Reducer<KIN, VIN, KOUT, VOUT> reducer, Class<KIN> keyClass, Class<VIN> valueClass) throws IOException, InterruptedException{
        reducer.super(new Configuration(), new TaskAttemptID(), new DebugRawKeyValueIterator(), null, null, 
                null, null, null, null, keyClass, valueClass);
    }

    @Override
    public void write(Object key, Object value) throws IOException, InterruptedException {
        writeKeyValue(key, value, out);
    }

    @Override
    public void setStatus(String status) {
        System.err.println(status);
    }
}

The problem is in the first block of code, namely main(). When I write

Context dcontext = new DebugTools.DebugReducerContext<Text, IntWritable, KeyPairWritableComparable, WrapperDoubleOrPatternWithWeightWritable>(reducer, key, count3);

I get this error:

The constructor DebugTools.DebugReducerContext<Text,IntWritable,KeyPairWritableComparable,WrapperDoubleOrPatternWithWeightWritable>(CombinePatternReduce, Text, IntWritable) is undefined.

When I write

Context dcontext = new DebugTools.DebugReducerContext<Text, IntWritable, KeyPairWritableComparable, WrapperDoubleOrPatternWithWeightWritable>(reducer, key, values);

I get this error:

The constructor DebugTools.DebugReducerContext<Text,IntWritable,KeyPairWritableComparable,WrapperDoubleOrPatternWithWeightWritable>(CombinePatternReduce, Text, List<IntWritable>) is undefined.

The documentation of the Reducer.Context constructor is:

public Reducer.Context(Configuration conf,
                       TaskAttemptID taskid,
                       RawKeyValueIterator input,
                       Counter inputKeyCounter,
                       Counter inputValueCounter,
                       RecordWriter<KEYOUT,VALUEOUT> output,
                       OutputCommitter committer,
                       StatusReporter reporter,
                       RawComparator<KEYIN> comparator,
                       Class<KEYIN> keyClass,
                       Class<VALUEIN> valueClass)
                throws IOException,
                       InterruptedException

So I need to pass in a Class<KEYIN> keyClass and a Class<VALUEIN> valueClass. How should I write the main() function (especially the statement with the error) to debug the reducer class?

The DebugReducerContext constructor takes three parameters: an instance of the reducer, the Class of the key, and the Class of the value.

Instead of passing the actual key and value objects, you need to pass the corresponding class objects:

Context dcontext = new DebugTools.DebugReducerContext<Text, IntWritable, KeyPairWritableComparable, WrapperDoubleOrPatternWithWeightWritable>(reducer, Text.class, IntWritable.class);
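To see why the two original calls were rejected, here is a minimal sketch that runs without Hadoop on the classpath. DebugContext is a hypothetical stand-in that only mirrors the (Class<KIN>, Class<VIN>) part of the constructor in the question, with String and Integer playing the roles of Text and IntWritable. It also shows that key.getClass() is not a drop-in replacement, because its static type is Class<? extends String> rather than Class<String>.

```java
// Stand-in sketch: DebugContext is hypothetical and only mirrors the
// (Class<KIN>, Class<VIN>) tail of the constructor from the question.
class ClassTokenDemo {

    /** Hypothetical stand-in for DebugReducerContext; no Hadoop required. */
    static class DebugContext<KIN, VIN> {
        private final Class<KIN> keyClass;
        private final Class<VIN> valueClass;

        DebugContext(Class<KIN> keyClass, Class<VIN> valueClass) {
            this.keyClass = keyClass;
            this.valueClass = valueClass;
        }

        /** Reports which key and value types this context was built for. */
        String describe() {
            return keyClass.getSimpleName() + " -> " + valueClass.getSimpleName();
        }
    }

    public static void main(String[] args) {
        String key = "key2:::key1";   // plays the role of the Text key
        Integer count = 3;            // plays the role of the IntWritable value

        // Does not compile: instances are not Class objects.
        // new DebugContext<String, Integer>(key, count);

        // Does not compile either: key.getClass() is Class<? extends String>.
        // new DebugContext<String, Integer>(key.getClass(), count.getClass());

        // Compiles: the literals have exactly the types Class<String> and Class<Integer>.
        DebugContext<String, Integer> ctx =
                new DebugContext<String, Integer>(String.class, Integer.class);
        System.out.println(ctx.describe());   // prints "String -> Integer"
    }
}
```

The same reasoning applies to the real call: Text.class has type Class<Text> and IntWritable.class has type Class<IntWritable>, which is what the constructor declares.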

Passing the class objects essentially tells the context which key and value types it should be able to handle during the reduce.
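A likely reason the constructor wants Class objects at all is type erasure: generic parameters such as KEYIN are not available at runtime, so code that has to create or decode keys and values needs an explicit class token. The sketch below illustrates that general Java idiom only. ValueFactory is a hypothetical helper, not part of Hadoop, and StringBuilder is just a convenient stand-in type with a public no-argument constructor.

```java
// Sketch of the class-token idiom; ValueFactory is hypothetical, not a Hadoop class.
class ErasureDemo {

    /** Creates fresh values of type V from a runtime class token. */
    static class ValueFactory<V> {
        private final Class<V> valueClass;

        ValueFactory(Class<V> valueClass) {
            this.valueClass = valueClass;
        }

        V newValue() {
            try {
                // "new V()" is illegal because V is erased; the token survives at runtime.
                return valueClass.getDeclaredConstructor().newInstance();
            } catch (ReflectiveOperationException e) {
                throw new IllegalStateException("Cannot instantiate " + valueClass, e);
            }
        }
    }

    public static void main(String[] args) {
        ValueFactory<StringBuilder> factory =
                new ValueFactory<StringBuilder>(StringBuilder.class);
        StringBuilder sb = factory.newValue();
        sb.append("created reflectively");
        System.out.println(sb.getClass().getName() + ": " + sb);
        // prints "java.lang.StringBuilder: created reflectively"
    }
}
```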
