
How to read SEQ files in Pig

I have M, U, and userRatings part-files as the intermediate result of an ALS matrix factorization run.

The file header is:

SEQ. org.apache.hadoop.io.IntWritable%org.apache.mahout.math.VectorWritable

I need to work with those feature vectors to find an explanation for the ALS recommendations (that they can explain them is a guess on my part). It needs to be done in Pig.

Thanks, Er

Try the link below; it has many examples of how to load, store, and process SEQ files using Elephant Bird.

For example:

     pair = LOAD '$data' USING com.twitter.elephantbird.pig.load.SequenceFileLoader (
       '-c com.twitter.elephantbird.pig.util.IntWritableConverter', 
       '-c com.twitter.elephantbird.pig.mahout.VectorWritableConverter'
     ) AS (key: int, val: (f1: double, f2: double, f3: double));
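Once the vectors are loaded as tuples, they can be used like any other Pig fields. The following is only a sketch, not a tested script: the jar paths, the `$u_path`, `$m_path`, and `$user` parameters, and the relation and field names are all hypothetical, and the three-field schema assumes the factorization was run with three features, as in the example above. The exact set of jars to register depends on your Elephant Bird and Mahout versions. It computes the ALS predicted rating as the dot product of one user's feature vector with every item's feature vector:

     -- Hypothetical jar locations; adjust names and versions to your installation.
     REGISTER '/path/to/elephant-bird-core.jar';
     REGISTER '/path/to/elephant-bird-pig.jar';
     REGISTER '/path/to/elephant-bird-mahout.jar';
     REGISTER '/path/to/mahout-math.jar';

     -- User feature vectors (U), assuming 3 features per vector.
     U = LOAD '$u_path' USING com.twitter.elephantbird.pig.load.SequenceFileLoader (
       '-c com.twitter.elephantbird.pig.util.IntWritableConverter',
       '-c com.twitter.elephantbird.pig.mahout.VectorWritableConverter'
     ) AS (user_id: int, uf: (f1: double, f2: double, f3: double));

     -- Item feature vectors (M), same assumed layout.
     M = LOAD '$m_path' USING com.twitter.elephantbird.pig.load.SequenceFileLoader (
       '-c com.twitter.elephantbird.pig.util.IntWritableConverter',
       '-c com.twitter.elephantbird.pig.mahout.VectorWritableConverter'
     ) AS (item_id: int, mf: (f1: double, f2: double, f3: double));

     -- Keep a single user so the CROSS below stays small.
     one_user = FILTER U BY user_id == $user;
     pairs = CROSS one_user, M;

     -- Predicted rating = dot product of the user and item feature vectors.
     scores = FOREACH pairs GENERATE
       user_id, item_id,
       uf.f1 * mf.f1 + uf.f2 * mf.f2 + uf.f3 * mf.f3 AS score;

Looking at the individual products (for instance `uf.f1 * mf.f1`) rather than only their sum shows which feature contributes most to a given recommendation, which may help with the explanation the question asks about.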

http://grepcode.com/file/repo1.maven.org/maven2/com.twitter.elephantbird/elephant-bird-mahout/3.0.1/com/twitter/elephantbird/pig/mahout/VectorWritableConverter.java
