Emails and Map Reduce Job

I'm just starting out with Hadoop and writing some Map Reduce jobs. I'm looking for help writing an MR job in Python that lets me take some emails and put them into HDFS, so that I can search the text or attachments of the emails.

Thank you!

For handling the emails, the email module from the stdlib is probably going to be handy. For the Hadoop side of things, "Using Python with Hadoop" might be handy, although there are plenty of Google results to choose from.
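As a rough illustration of the email module suggestion, here is a minimal sketch that pulls the subject, the plain-text body, and the attachment filenames out of one raw message. The function name extract_parts and the returned dictionary layout are made up for this example; only the stdlib email calls are real.

```python
from email import message_from_string, policy


def extract_parts(raw_message):
    """Return subject, plain-text body, and attachment names of one email.

    Hypothetical helper: the name and the dict layout are assumptions
    for illustration, not part of any library.
    """
    # policy.default gives the modern EmailMessage API (get_content etc.).
    msg = message_from_string(raw_message, policy=policy.default)

    body_parts = []
    attachments = []
    for part in msg.walk():
        # Multipart containers only hold other parts; skip them.
        if part.is_multipart():
            continue
        filename = part.get_filename()
        if filename:
            # Parts that carry a filename are treated as attachments here.
            attachments.append(filename)
        elif part.get_content_type() == "text/plain":
            body_parts.append(part.get_content())

    return {
        "subject": str(msg["subject"]),
        "body": "\n".join(body_parts),
        "attachments": attachments,
    }
```

Searching attachment contents (rather than just filenames) would need an extra decoding step per file type, which this sketch does not attempt.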

Yes, if you want to write Python code to run MapReduce jobs, you need to use Hadoop Streaming.
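With Hadoop Streaming, the mapper and reducer are ordinary programs that read lines from stdin and write tab-separated key/value lines to stdout. Below is a minimal sketch of a word-to-message index for searching email text. It assumes each input line has already been flattened to "message_id<TAB>body text" (emails are multi-line, so some preprocessing along those lines would be needed); that input layout and the function names are assumptions for illustration.

```python
import re
import sys
from itertools import groupby

WORD_RE = re.compile(r"[a-z0-9']+")


def map_lines(lines):
    """Mapper logic: emit 'word<TAB>message_id' for each distinct word."""
    for line in lines:
        line = line.rstrip("\n")
        if "\t" not in line:
            continue  # skip lines that do not match the assumed layout
        message_id, text = line.split("\t", 1)
        for word in sorted(set(WORD_RE.findall(text.lower()))):
            yield f"{word}\t{message_id}"


def reduce_lines(lines):
    """Reducer logic: input arrives sorted by key, so group by word."""
    pairs = (line.rstrip("\n").split("\t", 1) for line in lines)
    for word, group in groupby(pairs, key=lambda kv: kv[0]):
        message_ids = sorted({kv[1] for kv in group})
        yield f"{word}\t{','.join(message_ids)}"


def run(step, stdin=sys.stdin, stdout=sys.stdout):
    """Stream one step (map_lines or reduce_lines) from stdin to stdout."""
    for record in step(stdin):
        stdout.write(record + "\n")


# In a real mapper script, finish with:   run(map_lines)
# In a real reducer script, finish with:  run(reduce_lines)
```

The two scripts would then be handed to the Hadoop Streaming jar through its -mapper and -reducer options, along with -input and -output paths; the jar location depends on the installation.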

 