
Mahout: java.lang.NumberFormatException: For input string:

I am trying to get Mahout working and I am getting the following error:

13/05/16 22:48:53 INFO mapred.MapTask: record buffer = 262144/327680
13/05/16 22:48:53 WARN mapred.LocalJobRunner: job_local_0001
java.lang.NumberFormatException: For input string: "1119"
    at java.lang.NumberFormatException.forInputString(NumberFormatException.java:65)
    at java.lang.Long.parseLong(Long.java:430)
    at java.lang.Long.parseLong(Long.java:483)
    at org.apache.mahout.cf.taste.hadoop.item.ItemIDIndexMapper.map(ItemIDIndexMapper.java:47)
    at org.apache.mahout.cf.taste.hadoop.item.ItemIDIndexMapper.map(ItemIDIndexMapper.java:31)
    at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144)
    at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:764)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:370)
    at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:212)
13/05/16 22:48:54 INFO mapred.JobClient:  map 0% reduce 0%
13/05/16 22:48:54 INFO mapred.JobClient: Job complete: job_local_0001
13/05/16 22:48:54 INFO mapred.JobClient: Counters: 0
Exception in thread "main" java.io.FileNotFoundException: File does not exist: /user/eric.waite/temp/preparePreferenceMatrix/numUsers.bin
    at org.apache.hadoop.hdfs.DFSClient$DFSInputStream.openInfo(DFSClient.java:1843)
    at org.apache.hadoop.hdfs.DFSClient$DFSInputStream.<init>(DFSClient.java:1834)
    at org.apache.hadoop.hdfs.DFSClient.open(DFSClient.java:578)

My input file is very simple. Each line is userId, storyId, rating (1-5). A sample:

2840281,1119,2
2840321,1170,3
2840323,1124,5
2840371,1170,5
2840347,1157,3
2840371,1172,5
2840347,1157,5
2840358,1333,5
2840371,1172,5
2840347,1157,5

I am trying to run a basic example using the following command:

hadoop jar /sourcecode/mahout/mahout-distribution-0.7/mahout-core-0.7-job.jar org.apache.mahout.cf.taste.hadoop.item.RecommenderJob -s SIMILARITY_COOCCURRENCE --input ratings.dat --output output

Java information:

java version "1.7.0_13"
Java(TM) SE Runtime Environment (build 1.7.0_13-b20)
Java HotSpot(TM) 64-Bit Server VM (build 23.7-b01, mixed mode)

I am on Mac OS X 10.8.2.

Does anyone have any suggestions on why the integer is being read as a string and is generating the NumberFormatException?

Thank you.

You likely have a non-printing character hiding somewhere in the input. The string shown in the error message, of course, parses just fine as a long on its own. (The quotes are only part of the error message.)

To see what I mean, try:

    System.out.println(Long.parseLong("\u00001119"));

It fails with the same error, which is puzzling on its face because the NUL character in front of the digits does not show up in the message.

I'm not sure how to debug this easily, short of opening the file in a hex editor.
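If a hex editor is not handy, a small throwaway program can make hidden characters visible. The following is only a sketch: the class name `HiddenCharFinder` and the method `reveal` are made up for illustration, and the default file name `ratings.dat` is taken from the command in the question. It assumes every legitimate character in the file is an ASCII digit or a comma, and prints any line that contains something else, with the offending characters written as escapes.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.List;

// Hypothetical helper, not part of Mahout: makes invisible characters visible.
class HiddenCharFinder {

    // Returns the line with every character that is not an ASCII digit or a
    // comma replaced by a four-digit hex escape, so hidden characters stand out.
    static String reveal(String line) {
        StringBuilder sb = new StringBuilder();
        for (char c : line.toCharArray()) {
            if ((c >= '0' && c <= '9') || c == ',') {
                sb.append(c);
            } else {
                sb.append(String.format("\\u%04X", (int) c));
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) throws IOException {
        // The path is an assumption; pass your own ratings file as the first argument.
        String path = args.length > 0 ? args[0] : "ratings.dat";
        // ISO-8859-1 maps each byte to exactly one char, so the file is never
        // rejected as malformed and the escapes printed are the raw byte values.
        List<String> lines = Files.readAllLines(Paths.get(path), StandardCharsets.ISO_8859_1);
        for (int i = 0; i < lines.size(); i++) {
            String shown = reveal(lines.get(i));
            if (!shown.equals(lines.get(i))) {
                System.out.println("line " + (i + 1) + ": " + shown);
            }
        }
    }
}
```

If the program prints nothing, every line consists only of digits and commas and the problem lies elsewhere. Note that line terminators are consumed when the file is split into lines, so they are not reported.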

You can debug the RecommenderJob, check where the exception occurs, and inspect the actual string value; there may be a blank or otherwise stray character in the input file. I also hit this exception, and in my case it occurs here:

String[] tokens = TasteHadoopUtils.splitPrefTokens(value.toString());
long itemID = Long.parseLong(tokens[transpose ? 0 : 1]);
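If inspection does confirm stray characters, one option is to clean each line before handing the file to the job. The sketch below is not Mahout code; `RatingsCleaner` and `clean` are hypothetical names, and it assumes that only digits, commas, dots, and minus signs are meaningful in a preference line.

```java
// Hypothetical pre-processing helper, not part of Mahout.
class RatingsCleaner {

    // Drops every character that should not appear in a "userID,itemID,rating"
    // line: anything other than ASCII digits, commas, dots and minus signs.
    static String clean(String line) {
        return line.replaceAll("[^0-9,.\\-]", "");
    }

    public static void main(String[] args) {
        // A NUL character glued to the item ID, similar to the parseLong example
        // in the earlier answer.
        String dirty = "2840281," + (char) 0 + "1119,2";
        String[] tokens = clean(dirty).split(",");
        // After cleaning, the item ID parses without a NumberFormatException.
        System.out.println(Long.parseLong(tokens[1]));
    }
}
```

Applying `clean` to every line and writing the result to a new file before uploading it to HDFS would keep the stray characters away from `Long.parseLong` without touching Mahout itself.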
