Java Spark: how to save a JavaPairRDD<HashSet<String>, HashMap<String, Double>> to file?
After some complicated aggregations I ended up with a JavaPairRDD<HashSet<String>, HashMap<String, Double>> and want to save the result to a file. I believe saveAsHadoopFile is a good API for this, but I am having trouble filling in the parameters for saveAsHadoopFile(path, keyClass, valueClass, outputFormatClass, CompressionCodec). Can anyone help?
You can save the RDD as text with the following function and parse the output back into the desired structure later.
rdd.saveAsTextFile("hdfs:///complete_path_to_hdfs_file/");
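Since the text output has to be parsed back later, it may help to control the line format rather than rely on the default toString() of the tuple. The following is a minimal sketch in plain Java, not part of the original answer: the class name PairLineCodec and the delimiter scheme (tab between key and value, comma between set elements, semicolon and equals sign inside the map) are hypothetical choices, and it assumes none of the strings contain those delimiter characters.

```java
import java.util.AbstractMap;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.TreeMap;
import java.util.TreeSet;
import java.util.stream.Collectors;

// Hypothetical helper: turns one (HashSet, HashMap) pair into a single text line and back.
// Assumes the strings contain no tab, comma, semicolon, or equals sign.
final class PairLineCodec {
    private PairLineCodec() {}

    // Sorts the elements so the same pair always produces the same line.
    static String format(HashSet<String> key, HashMap<String, Double> value) {
        String keyPart = String.join(",", new TreeSet<>(key));
        Map<String, Double> sorted = new TreeMap<>(value);
        String valuePart = sorted.entrySet().stream()
                .map(e -> e.getKey() + "=" + e.getValue())
                .collect(Collectors.joining(";"));
        return keyPart + "\t" + valuePart;
    }

    // Reverses format(); empty key or value parts become empty collections.
    static Map.Entry<HashSet<String>, HashMap<String, Double>> parse(String line) {
        String[] parts = line.split("\t", -1);
        HashSet<String> key = new HashSet<>();
        if (!parts[0].isEmpty()) {
            for (String element : parts[0].split(",")) {
                key.add(element);
            }
        }
        HashMap<String, Double> value = new HashMap<>();
        if (parts.length > 1 && !parts[1].isEmpty()) {
            for (String entry : parts[1].split(";")) {
                int eq = entry.indexOf('=');
                value.put(entry.substring(0, eq), Double.parseDouble(entry.substring(eq + 1)));
            }
        }
        return new AbstractMap.SimpleEntry<>(key, value);
    }
}
```

With Spark, one way to apply this would be rdd.map(t -> PairLineCodec.format(t._1(), t._2())).saveAsTextFile(path), so that every output line follows the known format and can be read back with PairLineCodec.parse.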
If you want to use the saveAsHadoopFile API instead, the following call can be used. Note that TextOutputFormat writes each key and value using its toString() representation.
rdd.saveAsHadoopFile(complete_path_to_file, HashSet.class, HashMap.class, TextOutputFormat.class);
You can also use HadoopOutputFormat.class as the last parameter.
For more information, you can refer to this link: HadoopFile.