
hadoop mongodb connector - output data not as MongoDB but HDFS

Is it possible to read MongoDB data through the hadoop mongodb connector, process the data with Hadoop MapReduce, and then, instead of writing the results back through the connector, keep the MapReduce output in HDFS as-is?

I think this previous answer on SO answers your question, with a minor change:

Is it possible to read MongoDB data, process it with Hadoop, and output it into a RDBS (MySQL)?

The main difference is that you would set the OutputFormatClass to something like:

job.setOutputFormatClass(SequenceFileOutputFormat.class);

You'll also need to set the output path on HDFS you want to save the data to. See their WordCount example for a full code example, but use the above as the output format instead of MongoOutputFormat.
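To make the idea concrete, here is a minimal driver sketch. It assumes the mongo-hadoop connector's MongoInputFormat and MongoConfigUtil (as used in its WordCount example); the MongoDB URI, the HDFS output path, the key/value types, and the commented-out mapper/reducer names are placeholders you would replace with your own.

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.output.SequenceFileOutputFormat;

import com.mongodb.hadoop.MongoInputFormat;
import com.mongodb.hadoop.util.MongoConfigUtil;

public class MongoToHdfsJob {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();

        // Read input from MongoDB via the mongo-hadoop connector.
        // Placeholder URI: point it at your own database.collection.
        MongoConfigUtil.setInputURI(conf, "mongodb://localhost:27017/mydb.mycollection");

        Job job = Job.getInstance(conf, "mongo-to-hdfs");
        job.setJarByClass(MongoToHdfsJob.class);

        // Mapper/Reducer classes are application-specific and omitted here;
        // plug in the same classes you would otherwise use with MongoOutputFormat.
        // job.setMapperClass(MyMapper.class);
        // job.setReducerClass(MyReducer.class);

        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);

        // Input comes from MongoDB ...
        job.setInputFormatClass(MongoInputFormat.class);
        // ... but output is written to HDFS as a SequenceFile instead of back to MongoDB.
        job.setOutputFormatClass(SequenceFileOutputFormat.class);
        SequenceFileOutputFormat.setOutputPath(job, new Path("/user/hadoop/mongo-output"));

        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}

The only parts that change relative to the connector's own WordCount example are the output format and the output path; the input side (MongoInputFormat plus the input URI) stays exactly as the connector documents it.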
