Reducer in Hadoop MapReduce Java Implementation
I am writing a Java program for the Hadoop MapReduce framework, including a class called CombinePatternReduce. To debug the reducer in Eclipse, I wrote a main() function as follows:
@SuppressWarnings("unchecked")
public static void main(String[] args) throws IOException, InterruptedException{
Text key = new Text("key2:::key1:::_ performs better than _");
IntWritable count5 = new IntWritable(5);
IntWritable count3 = new IntWritable(3);
IntWritable count8 = new IntWritable(8);
List<IntWritable> values = new ArrayList<IntWritable>();
values.add(count5);
values.add(count3);
values.add(count8);
CombinePatternReduce reducer = new CombinePatternReduce();
Context dcontext = new DebugTools.DebugReducerContext<Text, IntWritable, KeyPairWritableComparable, WrapperDoubleOrPatternWithWeightWritable>(reducer, key, count3); // here is the problem
reducer.reduce(key, values, dcontext);
}
DebugTools.DebugReducerContext is a class I wrote to make debugging easier. It looks like this:
public static class DebugReducerContext<KIN, VIN, KOUT, VOUT> extends Reducer<KIN, VIN, KOUT, VOUT>.Context {
DebugTools dtools = new DebugTools();
DataOutput out = dtools.new DebugDataOutputStream(System.out);
public DebugReducerContext(Reducer<KIN, VIN, KOUT, VOUT> reducer, Class<KIN> keyClass, Class<VIN> valueClass) throws IOException, InterruptedException{
reducer.super(new Configuration(), new TaskAttemptID(), new DebugRawKeyValueIterator(), null, null,
null, null, null, null, keyClass, valueClass);
}
@Override
public void write(Object key, Object value) throws IOException, InterruptedException {
writeKeyValue(key, value, out);
}
@Override
public void setStatus(String status) {
System.err.println(status);
}
}
The problem is in the first piece of code, namely main(). When I write
Context dcontext = new DebugTools.DebugReducerContext<Text, IntWritable, KeyPairWritableComparable, WrapperDoubleOrPatternWithWeightWritable>(reducer, key, count3);
the compiler reports this error:
The constructor DebugTools.DebugReducerContext<Text,IntWritable,KeyPairWritableComparable,WrapperDoubleOrPatternWithWeightWritable>(CombinePatternReduce, Text, IntWritable) is undefined.
When I instead write
Context dcontext = new DebugTools.DebugReducerContext<Text, IntWritable, KeyPairWritableComparable, WrapperDoubleOrPatternWithWeightWritable>(reducer, key, values);
the compiler reports this error:
The constructor DebugTools.DebugReducerContext<Text,IntWritable,KeyPairWritableComparable,WrapperDoubleOrPatternWithWeightWritable>(CombinePatternReduce, Text, List<IntWritable>) is undefined.
Since the documentation of Reducer.Context gives the constructor as
public Reducer.Context(Configuration conf,
TaskAttemptID taskid,
RawKeyValueIterator input,
Counter inputKeyCounter,
Counter inputValueCounter,
RecordWriter<KEYOUT,VALUEOUT> output,
OutputCommitter committer,
StatusReporter reporter,
RawComparator<KEYIN> comparator,
Class<KEYIN> keyClass,
Class<VALUEIN> valueClass)
throws IOException,
InterruptedException
I need to pass in a Class<KEYIN> keyClass and a Class<VALUEIN> valueClass. So how should I write the main() function (especially the line with the error) to debug the reducer class?
The constructor of your DebugReducerContext class clearly takes three parameters: an instance of the reducer, the class of the key, and the class of the value. Instead of passing the actual key and value, you need to pass the corresponding class objects:
Context dcontext = new DebugTools.DebugReducerContext<Text, IntWritable, KeyPairWritableComparable, WrapperDoubleOrPatternWithWeightWritable>(reducer, Text.class, IntWritable.class);
Essentially, this tells the context which key and value types it should handle when reducing.
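To see the difference between passing an instance and passing a class object without pulling in Hadoop, here is a minimal plain-Java sketch. TypedContext and ClassTokenDemo are hypothetical stand-ins invented for this illustration, not Hadoop classes; the only thing they share with DebugReducerContext is a constructor that takes Class<K> and Class<V> parameters.

```java
import java.util.ArrayList;
import java.util.List;

public class ClassTokenDemo {

    // Hypothetical stand-in (not a Hadoop class) for a context that is told
    // its key and value types through Class objects.
    static class TypedContext<K, V> {
        private final Class<K> keyClass;
        private final Class<V> valueClass;
        private final List<String> written = new ArrayList<>();

        TypedContext(Class<K> keyClass, Class<V> valueClass) {
            this.keyClass = keyClass;
            this.valueClass = valueClass;
        }

        // Records a pair after checking both objects against the declared types.
        void write(Object key, Object value) {
            K k = keyClass.cast(key);       // throws ClassCastException on a mismatch
            V v = valueClass.cast(value);
            written.add(k + "\t" + v);
        }

        List<String> written() {
            return written;
        }

        String describe() {
            return keyClass.getSimpleName() + " -> " + valueClass.getSimpleName();
        }
    }

    public static void main(String[] args) {
        String key = "key1";
        Integer count = 3;

        // Does not compile: the constructor wants a Class<String>, not a String.
        // TypedContext<String, Integer> bad = new TypedContext<>(key, count);

        // Compiles: the class literals have the types Class<String> and Class<Integer>.
        TypedContext<String, Integer> ctx =
                new TypedContext<>(String.class, Integer.class);
        ctx.write(key, count);

        System.out.println(ctx.describe());
        System.out.println(ctx.written());
    }
}
```

The commented-out line fails for the same reason as the two attempts in the question: an instance is supplied where a Class object is expected. Note that key.getClass() is not a drop-in substitute either, because its static type is Class<? extends String> rather than Class<String>, so a class literal such as Text.class is the simplest fix.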