
Reading JSON objects in Hadoop MapReduce for processing data

I am a beginner in Hadoop. Can anyone help me read JSON in a MapReduce job?

I have googled and found that Jaql is suitable for reading JSON, but I did not find any documentation on how it could be used in a MapReduce job.

Is there any other framework that supports reading JSON in MapReduce?

Any suggestions on this?

Thanks in advance.

I would rather trust the MapReduce framework itself to handle this. MapReduce lets us write custom Input/Output Formats to handle data that it does not support out of the box (OOTB), such as JSON. See this question for an example. I prefer this approach because it does not require any third-party components; it is just a matter of extending the MapReduce API. (That is only my preference, though. Others may find something else more suitable.)
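To give a feel for what such a custom input format has to do: its main job is deciding where one JSON record ends and the next begins, since a JSON object may span several lines. The sketch below shows only that record-splitting logic in plain Java. The class name `JsonRecordSplitter` and its `split` method are hypothetical and deliberately avoid Hadoop classes so the code runs standalone; in a real job, I would expect similar logic to live inside a `RecordReader` returned by the custom `InputFormat`. It assumes the input is a sequence of top-level JSON objects and silently drops a trailing incomplete object.

```java
import java.util.ArrayList;
import java.util.List;

/**
 * Hypothetical helper (not part of Hadoop): splits raw text into
 * top-level JSON object strings by tracking brace depth.
 */
public class JsonRecordSplitter {

    public static List<String> split(String text) {
        List<String> records = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        int depth = 0;            // nesting level of '{' ... '}'
        boolean inString = false; // true while inside a JSON string literal
        boolean escaped = false;  // true right after a backslash in a string

        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);

            // Between records: skip everything until the next object starts.
            if (depth == 0 && c != '{') {
                continue;
            }
            current.append(c);

            if (inString) {
                // Braces inside string literals must not change the depth.
                if (escaped) {
                    escaped = false;
                } else if (c == '\\') {
                    escaped = true;
                } else if (c == '"') {
                    inString = false;
                }
                continue;
            }

            if (c == '"') {
                inString = true;
            } else if (c == '{') {
                depth++;
            } else if (c == '}') {
                depth--;
                if (depth == 0) {
                    // A complete top-level object has been read.
                    records.add(current.toString());
                    current.setLength(0);
                }
            }
        }
        return records;
    }

    public static void main(String[] args) {
        // Two records; the first contains a '}' inside a string value.
        String input = "{\"id\": 1, \"tags\": {\"a\": \"x}\"}}\n{\"id\": 2}";
        for (String record : split(input)) {
            System.out.println(record);
        }
    }
}
```

If every JSON object already sits on its own line, none of this is needed: the default `TextInputFormat` hands the mapper one line at a time, and each line can be parsed there directly.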

But the easiest way, IMHO, would be to use Hive or Pig to handle JSON data. You don't have to do much to make it work, as both of these projects have OOTB JSON support. See this for the Hive JSON SerDe and this for Pig's JsonLoader and JsonStorage.

HTH


 