简体   繁体   中英

WrongValueClass in apache Mahout

I have written a mapreduce programm using mahout. the map output value is ClusterWritable .when i run the code in eclipse, it is run with no error, but when i run rhe jar file in terminal, it shows the exception:

java.io.IOException: wrong value class: org.apache.mahout.math.VectorWritable is not class org.apache.mahout.clustering.iterator.ClusterWritable
at org.apache.hadoop.io.SequenceFile$Writer.append(SequenceFile.java:988)
at org.apache.hadoop.mapreduce.lib.output.SequenceFileOutputFormat$1.write(SequenceFileOutputFormat.java:74)
at org.apache.hadoop.mapred.MapTask$NewDirectOutputCollector.write(MapTask.java:498)
at org.apache.hadoop.mapreduce.TaskInputOutputContext.write(TaskInputOutputContext.java:80)
at org.apache.mahout.clustering.canopy.CanopyMapper.cleanup(CanopyMapper.java:59)
at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:146)
at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:621)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:305)
at org.apache.hadoop.mapred.Child.main(Child.java:170)

The output code in map is:

context.write(new Text(), new ClusterWritable());

but i don't know why it says that the value type is VectorWritable .

Mapper being run, resulting in stacktrace above is Mahout's CanopyMapper, and not custom one you've written. CanopyMapper.cleanup method is outputting (key: Text, value: VectorWritable). See CanopyMapper.java

See also CanopyDriver.java and its buildClustersMR method, where MR job is configured, mapper, reducer, and appropriate output key/value classes.

You didn't state, so I'm guessing that you're using more than one MR job in a data flow pipeline. Check that outputs of each job in pipeline are valid/expected input for next job in pipeline. Consider using cascading/scalding to define your data flow (see http://www.slideshare.net/melrief/scalding-programming-model-for-hadoop )

Consider using Mahout user mailing list to post Mahout related questions.

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM