How to count a specific word using MapReduce?

Question

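I am modifying the usual word count program, which counts every word, so that it counts only a specific word.

The reducer and map classes are otherwise the same as in the normal word count. The count comes out wrong: the specific word appears many times in my file, but the count is one.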

public class wordcountmapper extends MapReduceBase implements Mapper<LongWritable, Text, Text, IntWritable> // mapper implementation
{
    private final static IntWritable one = new IntWritable(1); // constant count of 1 emitted per match
    private Text word = new Text();

    public void map(LongWritable key, Text value, OutputCollector<Text, IntWritable> output, Reporter reporter) throws IOException {
        String line = value.toString();      // convert the input line to a String
        StringTokenizer tokenizer = new StringTokenizer(line);
        while (tokenizer.hasMoreTokens()) {
            word.set(tokenizer.nextToken());
            if (line.compareTo("Cold") == 0) {  // "Cold" is the specific word to count
                output.collect(word, one);      // I get a count of 1 for "Cold", as if only the first "Cold" is counted and later lines are ignored
            }
        }
    }
}

Answer 1

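First of all, your if statement compares the whole line with "Cold", which is wrong. It should compare the token with "Cold": if (tokenizer.nextToken().equals("Cold")). Note that each nextToken() call advances the tokenizer, so read the token once per loop iteration and reuse it instead of calling nextToken() a second time.

I am not sure how you get a count of 1 for "Cold" with the current logic. Probably the input contains one line that consists of the single word "Cold".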
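
The difference between the two comparisons can be tried out without a cluster. The following is a minimal sketch in plain Java, not the asker's actual job: the class SpecificWordCount, its two methods, and the sample lines are hypothetical, and the integer counter stands in for output.collect(word, one) in the mapper.

```java
import java.util.Arrays;
import java.util.List;
import java.util.StringTokenizer;

// Hadoop-free sketch: count++ stands in for output.collect(word, one).
class SpecificWordCount {

    // Logic from the question: the whole line is compared with the target,
    // so a match only happens when the line is exactly the target word.
    static int countByLine(List<String> lines, String target) {
        int count = 0;
        for (String line : lines) {
            StringTokenizer tokenizer = new StringTokenizer(line);
            while (tokenizer.hasMoreTokens()) {
                tokenizer.nextToken();
                if (line.compareTo(target) == 0) {
                    count++;
                }
            }
        }
        return count;
    }

    // Logic from the answer: each token is read once and compared itself.
    static int countByToken(List<String> lines, String target) {
        int count = 0;
        for (String line : lines) {
            StringTokenizer tokenizer = new StringTokenizer(line);
            while (tokenizer.hasMoreTokens()) {
                String token = tokenizer.nextToken();
                if (token.equals(target)) {
                    count++;
                }
            }
        }
        return count;
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList("Cold", "Cold and wet", "a Cold day");
        System.out.println(countByLine(lines, "Cold"));  // prints 1
        System.out.println(countByToken(lines, "Cold")); // prints 3
    }
}
```

With this sample input, only the first line is exactly "Cold", which would explain a count of 1 from the line comparison. In the mapper itself the same idea would be to store the token in a local variable, compare it with equals, and only then call word.set and output.collect.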

Solution 1
1 Accepted 2015-10-05 07:42:23

如何使用mapreduce計數特定單詞？

問題描述

1 個解決方案

解決方案1 1 已采納 2015-10-05 07:42:23

解決方案1
1 已采納 2015-10-05 07:42:23