
Composite key getting changed, Hadoop Map-Reduce?

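I have just started learning Hadoop and am running a Hadoop map-reduce program with a custom partitioner and comparator. The problem I am facing is that the primary and secondary sort on the composite key is not happening, and in addition part of one composite key is being replaced by the corresponding part of another composite key.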

For example, I create the following keys in the mapper:

key1 -> tagA,1 
key2 -> tagA,1 
key3 -> tagA,1
key4 -> tagA,1 
key5 -> tagA,2 
key6 -> tagA,2
key7 -> tagB,1 
key8 -> tagB,1 
key9 -> tagB,1
key10 -> tagB,1 
key11 -> tagB,2 
key12 -> tagB,2

The partitioner and grouping comparator are as follows:

// Partitioner: route every key with the same tag to the same reducer
public static class TaggedJoiningPartitioner implements Partitioner<Text, Text> {
    @Override
    public int getPartition(Text key, Text value, int numPartitions) {
        // Composite key format is "tag,number"; only the tag picks the partition
        String[] tokens = key.toString().split(",");
        return (tokens[0].hashCode() & Integer.MAX_VALUE) % numPartitions;
    }

    @Override
    public void configure(JobConf conf) {
        // Intentionally left empty: no configuration is needed
    }
}

// Comparator: compares composite keys by tag only
public static class TaggedJoiningGroupingComparator extends WritableComparator {

    public TaggedJoiningGroupingComparator() {
        super(Text.class, true);
    }

    @Override
    public int compare(WritableComparable a, WritableComparable b) {
        String[] taggedKey1 = ((Text) a).toString().split(",");
        String[] taggedKey2 = ((Text) b).toString().split(",");
        return taggedKey1[0].compareTo(taggedKey2[0]);
    }
}
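
To make the partitioning arithmetic easier to check in isolation, here is a minimal plain-Java sketch of the same expression used in getPartition above. It does not use any Hadoop classes; the class name TagPartitionDemo, the helper partitionFor, and the reducer count of 2 are hypothetical and chosen only for illustration.

```java
// Plain-Java stand-in for the getPartition logic; no Hadoop classes involved.
class TagPartitionDemo {

    // Same arithmetic as getPartition: hash only the tag (the text before the
    // first comma), clear the sign bit so the value is non-negative, then take
    // the remainder by the number of partitions.
    static int partitionFor(String compositeKey, int numPartitions) {
        String tag = compositeKey.split(",")[0];
        return (tag.hashCode() & Integer.MAX_VALUE) % numPartitions;
    }

    public static void main(String[] args) {
        int numPartitions = 2; // assumed reducer count, for illustration only
        String[] keys = {"tagA,1", "tagA,2", "tagB,1", "tagB,2"};
        for (String key : keys) {
            System.out.println(key + " -> partition " + partitionFor(key, numPartitions));
        }
    }
}
```

Because only the tag is hashed, "tagA,1" and "tagA,2" always land in the same partition. Which partition each tag maps to depends on its hash code and on the number of reducers, so two different tags may still share a partition.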

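In the reducer, the keys are grouped correctly by tag but are not sorted correctly. The order and content of the keys in the reducers are as follows: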

//REDUCER 1
key1 -> tagA,1 
key2 -> tagA,1 
key3 -> tagA,1
key5 -> tagA,1 // second field was 2 in the mapper, shows 1 here
key6 -> tagA,1 // second field was 2 in the mapper, shows 1 here
key4 -> tagA,1 

//REDUCER 2
key7 ->  tagB,1 
key11 -> tagB,1 // second field was 2 in the mapper, shows 1 here
key12 -> tagB,1 // second field was 2 in the mapper, shows 1 here
key8 ->  tagB,1 
key9 ->  tagB,1
key10 -> tagB,1  

I have been trying to solve this for a long time without success. Any help?

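Finally got it working. I changed

conf.setOutputKeyComparatorClass(TaggedJoiningGroupingComparator.class);

to

conf.setOutputValueGroupingComparator(TaggedJoiningGroupingComparator.class);

Registered as the output key (sort) comparator, the tag-only comparator treated all keys with the same tag as equal, so the second field was never sorted. From the Hadoop API documentation: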

setOutputValueGroupingComparator(Class<? extends RawComparator> theClass)
Set the user defined RawComparator comparator for grouping keys in the input to the reduce.
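
The difference between the two roles can be sketched in plain Java without any Hadoop classes. This is only a simulation under stated assumptions: the class SortVsGroupDemo and its members are hypothetical, the second key field is assumed to be an integer, and the real shuffle works on serialized keys rather than on an in-memory list. In an actual job, a full composite comparator like the one below would be the one passed to conf.setOutputKeyComparatorClass, while the tag-only comparator stays with conf.setOutputValueGroupingComparator.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Plain-Java simulation of "sort by full key, group by tag"; not Hadoop code.
class SortVsGroupDemo {

    // Role of the sort comparator: order by tag, then by the numeric second field.
    static final Comparator<String> SORT_BY_TAG_THEN_NUMBER = (a, b) -> {
        String[] x = a.split(",");
        String[] y = b.split(",");
        int byTag = x[0].compareTo(y[0]);
        if (byTag != 0) {
            return byTag;
        }
        return Integer.compare(Integer.parseInt(x[1].trim()), Integer.parseInt(y[1].trim()));
    };

    // Role of the grouping comparator: only the tag decides which keys end up
    // in the same group (one reduce call per group in a real job).
    static Map<String, List<String>> sortThenGroup(List<String> keys) {
        List<String> sorted = new ArrayList<>(keys);
        sorted.sort(SORT_BY_TAG_THEN_NUMBER);
        Map<String, List<String>> groups = new LinkedHashMap<>();
        for (String key : sorted) {
            groups.computeIfAbsent(key.split(",")[0], tag -> new ArrayList<>()).add(key);
        }
        return groups;
    }

    public static void main(String[] args) {
        List<String> mapperKeys = Arrays.asList("tagB,2", "tagA,2", "tagA,1", "tagB,1", "tagA,1");
        // prints {tagA=[tagA,1, tagA,1, tagA,2], tagB=[tagB,1, tagB,2]}
        System.out.println(sortThenGroup(mapperKeys));
    }
}
```

Keeping the two comparators separate is what makes a secondary sort possible: the sort step sees the whole composite key, and the grouping step then ignores the second field so that all keys of one tag arrive together and in order.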

