Composite key getting changed, Hadoop Map-Reduce?
I have just started learning Hadoop and am running a Hadoop map-reduce program with a custom partitioner and comparator. The problem I am facing is that the primary and secondary sort are not being applied to the composite key. Moreover, part of one composite key is being replaced by the corresponding part of another composite key.
For example, I am creating the following keys inside the mapper:
key1 -> tagA,1
key2 -> tagA,1
key3 -> tagA,1
key4 -> tagA,1
key5 -> tagA,2
key6 -> tagA,2
key7 -> tagB,1
key8 -> tagB,1
key9 -> tagB,1
key10 -> tagB,1
key11 -> tagB,2
key12 -> tagB,2
The partitioner and comparator are as follows:
// Partitioner: routes a record by the tag (the part of the key before the comma)
public static class TaggedJoiningPartitioner implements Partitioner<Text, Text> {
    @Override
    public int getPartition(Text key, Text value, int numPartitions) {
        String line = key.toString();
        String[] tokens = line.split(",");
        return (tokens[0].hashCode() & Integer.MAX_VALUE) % numPartitions;
    }

    @Override
    public void configure(JobConf arg0) {
        // Intentionally empty: no configuration is needed
    }
}
// Comparator: compares only the tag, ignoring the part after the comma
public static class TaggedJoiningGroupingComparator extends WritableComparator {
    public TaggedJoiningGroupingComparator() {
        super(Text.class, true);
    }

    @Override
    public int compare(WritableComparable a, WritableComparable b) {
        String[] taggedKey1 = ((Text) a).toString().split(",");
        String[] taggedKey2 = ((Text) b).toString().split(",");
        return taggedKey1[0].compareTo(taggedKey2[0]);
    }
}
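To make the partitioning rule easier to check in isolation, here is a minimal plain-Java sketch of the same arithmetic, with no Hadoop classes involved. The class name `TagPartitionSketch` and the method `partitionFor` are hypothetical names introduced only for this illustration; the expression itself is the one used in `getPartition` above.

```java
public class TagPartitionSketch {
    // Same rule as getPartition above: hash only the tag (the text before
    // the comma), clear the sign bit, and take it modulo the partition count.
    static int partitionFor(String compositeKey, int numPartitions) {
        String tag = compositeKey.split(",")[0];
        return (tag.hashCode() & Integer.MAX_VALUE) % numPartitions;
    }

    public static void main(String[] args) {
        String[] keys = {"tagA,1", "tagA,2", "tagB,1", "tagB,2"};
        for (String key : keys) {
            System.out.println(key + " -> partition " + partitionFor(key, 2));
        }
    }
}
```

Because only the tag is hashed, `tagA,1` and `tagA,2` always land in the same partition, so every record for a tag reaches the same reducer regardless of the second key part.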
In the reducer, these keys are grouped correctly by tag, but they are not sorted correctly. The order and content of the keys in the reducers is as follows:
//REDUCER 1
key1 -> tagA,1
key2 -> tagA,1
key3 -> tagA,1
key5 -> tagA,1 // 2 was replaced by 1 here
key6 -> tagA,1 // 2 was replaced by 1 here
key4 -> tagA,1
//REDUCER 2
key7 -> tagB,1
key11 -> tagB,1 // 2 was replaced by 1 here
key12 -> tagB,1 // 2 was replaced by 1 here
key8 -> tagB,1
key9 -> tagB,1
key10 -> tagB,1
I have been trying to resolve this for a long time without success. Any help is appreciated.
I finally got it working. I changed
conf.setOutputKeyComparatorClass(TaggedJoiningGroupingComparator.class);
to
conf.setOutputValueGroupingComparator(TaggedJoiningGroupingComparator.class);
Also, from the Hadoop API docs:
setOutputValueGroupingComparator(Class<? extends RawComparator> theClass)
Set the user defined RawComparator comparator for grouping keys in the input to the reduce.
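The difference between the two settings is that `setOutputKeyComparatorClass` controls how keys are sorted, while `setOutputValueGroupingComparator` controls only which adjacent sorted keys are handed to the same `reduce` call. The following is a plain-Java sketch of that split, not Hadoop code: the class `SecondarySortSketch`, its fields, and its methods are hypothetical names for this illustration, and `String` natural ordering is used as a stand-in for the default ordering of `Text` keys (an assumption that holds for ASCII keys like the ones in the question).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

public class SecondarySortSketch {
    // Stand-in for the default sort order of the whole composite key.
    static final Comparator<String> FULL_KEY_ORDER = Comparator.naturalOrder();

    // Tag-only comparison, the same logic as TaggedJoiningGroupingComparator.
    static final Comparator<String> GROUP_BY_TAG =
            (a, b) -> a.split(",")[0].compareTo(b.split(",")[0]);

    // Return a sorted copy of the keys using the given comparator.
    static List<String> sortKeys(List<String> keys, Comparator<String> order) {
        List<String> sorted = new ArrayList<>(keys);
        sorted.sort(order);
        return sorted;
    }

    // Walk the sorted keys and start a new group whenever the grouping
    // comparator says the current key differs from the previous one.
    static List<List<String>> group(List<String> sortedKeys, Comparator<String> grouping) {
        List<List<String>> groups = new ArrayList<>();
        String previous = null;
        for (String key : sortedKeys) {
            if (previous == null || grouping.compare(previous, key) != 0) {
                groups.add(new ArrayList<>());
            }
            groups.get(groups.size() - 1).add(key);
            previous = key;
        }
        return groups;
    }

    public static void main(String[] args) {
        List<String> keys = Arrays.asList("tagB,2", "tagA,2", "tagA,1", "tagB,1", "tagA,1");

        // Sort on the full key, then group on the tag only.
        List<String> sorted = sortKeys(keys, FULL_KEY_ORDER);
        System.out.println(group(sorted, GROUP_BY_TAG));
        // prints [[tagA,1, tagA,1, tagA,2], [tagB,1, tagB,2]]

        // Using the tag-only comparator for sorting leaves the second
        // key part in whatever order the keys happened to arrive.
        System.out.println(sortKeys(keys, GROUP_BY_TAG));
    }
}
```

Note that this stand-in ordering is lexicographic, so a multi-digit second part such as `10` would sort before `2`; a numeric secondary sort would need its own sort comparator.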