I'm running Mahout 0.7 on Hadoop 1.0.4 and want to see the topic modeling results for the Reuters dataset. However, the output I get from Mahout's vectordump tool is not very useful. I followed the instructions for this example: Run cvb in mahout 0.8.
After running vectordump, I get a huge output file containing lines like the following: {0.01:5.726429339702471E-12,0.05:6.196569958376538E-9,...} I'm not sure whether this is the output we are supposed to see for the Reuters dataset.
The same thing happened to me, and the solution is simple: get the latest version from the Mahout SVN server: http://svn.apache.org/repos/asf/mahout/trunk
This happens because of a bug involving vectorSize in Mahout 0.7.
I don't think they provide the kind of output you are looking for: https://issues.apache.org/jira/browse/MAHOUT-1470
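If the dump cannot be made more readable on the Mahout side, one workaround is to post-process it yourself. The following is only a minimal sketch in Python, not part of Mahout: it assumes each line of the dump has the `{term:weight,term:weight,...}` shape shown in the question, that every line is one topic, and that terms contain no commas. The function names `parse_topic_line` and `top_terms` are hypothetical helpers made up for this illustration.

```python
import heapq


def parse_topic_line(line):
    """Parse one '{term:weight,...}' line into a list of (term, weight) pairs.

    Assumption: the line looks like the vectordump output quoted in the
    question. Entries that do not parse as 'term:number' are skipped.
    """
    start, end = line.find("{"), line.rfind("}")
    if start == -1 or end == -1:
        return []
    pairs = []
    for entry in line[start + 1:end].split(","):
        # Split on the last ':' so a term containing ':' is kept intact.
        term, sep, weight = entry.rpartition(":")
        if not sep:
            continue
        try:
            pairs.append((term.strip(), float(weight)))
        except ValueError:
            continue
    return pairs


def top_terms(line, n=10):
    """Return the n highest-weighted (term, weight) pairs of one line."""
    return heapq.nlargest(n, parse_topic_line(line), key=lambda pair: pair[1])
```

To use it, read the dump file line by line and call `top_terms(line)` on each one; the highest-weighted terms are usually far easier to interpret than the full vector, where most weights are close to zero.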