Scala / Hadoop: Specifying the Context for a Reducer

Question

Before I start playing with Scoobi or Scrunch, I thought I would try porting WordCount to Scala (2.9.1) using Hadoop's (0.20.1) Java bindings.

Originally, I had:

class Map extends Mapper[LongWritable, Text, Text, IntWritable] {
  @throws(classOf[IOException])
  @throws(classOf[InterruptedException])
  def map(key : LongWritable, value : Text, context : Context) {
    //...

This compiled fine, but gave me a runtime error:

java.io.IOException: Type mismatch in key from map: expected org.apache.hadoop.io.Text, recieved org.apache.hadoop.io.LongWritable

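After looking around, I found that this was because my method did not actually override Mapper's map method (the missing override keyword should have tipped me off), so I fixed it to: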

override def map(key : LongWritable, value : Text, 
  context : Mapper[LongWritable, Text, Text, IntWritable]#Context) {

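And voila, no runtime error.

But then I looked at the job output and realized that my reducer was not running.

So I looked at my reducer and noticed that the reduce signature had the same problem as my mapper: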

class Reduce extends Reducer[Text, IntWritable, Text, IntWritable] {
  @throws(classOf[IOException])
  @throws(classOf[InterruptedException])
  def reduce(key : Text, value : Iterable[IntWritable], context : Context) {
    //...

So I guessed that the identity reduce was being used because of the signature mismatch.
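To illustrate why a mismatched signature silently falls back to the default behavior, here is a minimal sketch that does not use Hadoop at all. BaseReducer, Overloading, Overriding, and run are hypothetical names standing in for Hadoop's Reducer and its framework entry point; the only assumption is that the framework always calls the java.lang.Iterable version of reduce.

```scala
// Hypothetical stand-in for Hadoop's Reducer; not the real API.
class BaseReducer {
  // Default behavior, analogous to the identity reduce.
  def reduce(values: java.lang.Iterable[String]): String = "identity"

  // Stand-in for the framework entry point: it always calls the
  // java.lang.Iterable version of reduce.
  def run(values: java.lang.Iterable[String]): String = reduce(values)
}

class Overloading extends BaseReducer {
  // Scala's Iterable is a different type from java.lang.Iterable, so this
  // method is a new overload. The base reduce is left untouched.
  def reduce(values: Iterable[String]): String = "custom"
}

class Overriding extends BaseReducer {
  // Same parameter type as the base method, so the override keyword is accepted.
  override def reduce(values: java.lang.Iterable[String]): String = "custom"
}

object OverloadDemo {
  def main(args: Array[String]): Unit = {
    val input: java.lang.Iterable[String] = java.util.Arrays.asList("a", "b")
    println(new Overloading().run(input)) // prints "identity"
    println(new Overriding().run(input))  // prints "custom"
  }
}
```

Adding override to the method in Overloading would produce the same "overrides nothing" compile error, which is why writing override on every intended override is a cheap safeguard.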

But when I tried to correct the signature of reduce:

override def reduce(key: Text, values : Iterable[IntWritable], 
  context : Reducer[Text, IntWritable, Text, IntWritable]#Context) {

I now get a compiler error:

[ERROR] /path/to/src/main/scala/WordCount.scala:32: error: method reduce overrides nothing
[INFO]     override def reduce(key: Text, values : Iterable[IntWritable],

So I'm not sure what I'm doing wrong.

Answer 1

At first glance, make sure that values is a java.lang.Iterable, not a Scala Iterable. Either import java.lang.Iterable, or write:

override def reduce(key: Text, values : java.lang.Iterable[IntWritable], context : Reducer[Text, IntWritable, Text, IntWritable]#Context)
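For completeness, here is a sketch of what the whole reducer might look like with that fix applied. It assumes the new-API org.apache.hadoop.mapreduce.Reducer, which the Reducer[...]#Context type in the question suggests. The summing body is the usual WordCount logic rather than the asker's elided code, and the JIterable alias is just a naming choice for this example.

```scala
import java.io.IOException
// Alias java.lang.Iterable so it cannot be confused with Scala's Iterable.
import java.lang.{Iterable => JIterable}

import org.apache.hadoop.io.{IntWritable, Text}
import org.apache.hadoop.mapreduce.Reducer

class Reduce extends Reducer[Text, IntWritable, Text, IntWritable] {
  @throws(classOf[IOException])
  @throws(classOf[InterruptedException])
  override def reduce(key: Text, values: JIterable[IntWritable],
      context: Reducer[Text, IntWritable, Text, IntWritable]#Context): Unit = {
    // Walk the Java iterator directly, so no Scala/Java collection
    // conversions are needed.
    var sum = 0
    val it = values.iterator()
    while (it.hasNext) {
      sum += it.next().get()
    }
    // Emit (word, total count).
    context.write(key, new IntWritable(sum))
  }
}
```

Because override is present, the compiler rejects the class if the parameter types ever drift away from what Reducer declares, instead of silently falling back to the identity reduce.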

Answer 1 (11, accepted, 2012-03-25 02:26:25)