Read files from a directory to create a ZIP in Hadoop
I'm looking for Hadoop examples, something more complex than the word-count example.

What I want to do is read the files in a directory in Hadoop and produce a zip from them, so my plan is to collect all the files in the map class and create the zip file in the reduce class.

Can anyone give me a link to a tutorial or example that can help me build it?

I don't want anyone to do this for me; I'm asking for a link with better examples than the word count.

I almost got it working, if you need it: https://github.com/flopezluis/testing-hadoop
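For what it's worth, the reduce-side part of the plan (one reducer receiving every (filename, contents) pair the mappers emit, and streaming them into a single archive) boils down to plain `java.util.zip`. This is only a sketch of that core logic with the Hadoop wiring omitted; the class and method names are made up for illustration:

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.util.Map;
import java.util.zip.ZipEntry;
import java.util.zip.ZipOutputStream;

// Illustrative sketch: in a real job the reducer would receive
// (filename, contents) pairs from the mappers and write the archive
// to HDFS; here the same core logic runs over an in-memory map.
public class ZipSketch {
    public static byte[] zip(Map<String, byte[]> files) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ZipOutputStream zos = new ZipOutputStream(buffer)) {
            for (Map.Entry<String, byte[]> file : files.entrySet()) {
                zos.putNextEntry(new ZipEntry(file.getKey())); // one entry per input file
                zos.write(file.getValue());                    // raw file contents
                zos.closeEntry();
            }
        }
        return buffer.toByteArray();
    }
}
```

In a real job you would also force a single reducer (`job.setNumReduceTasks(1)`) so that all files end up in one archive rather than one zip per reducer.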
If your objective is to normalize structured data in records coming in from several inputs and then process it, I think you really need to look at this article, which helped me in the past. It covers how to normalize data using Hadoop/MapReduce and provides Java-based source code.

There is another example about a method for reading and writing general record structures using new Writable and InputFormat classes in Java. Have a look here.
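The core idea behind that record-structure approach is a class that serializes its fields to a byte stream in a fixed order and reads them back in the same order. Here is a minimal sketch under that assumption; the field names are invented, and in a real job the class would additionally declare `implements org.apache.hadoop.io.Writable` (the interface requires exactly these two method signatures):

```java
import java.io.DataInput;
import java.io.DataOutput;
import java.io.IOException;

// Illustrative record with the two methods Hadoop's Writable interface
// requires; in a real job, add "implements Writable" and use it as a
// map/reduce value type. Field names are invented for this sketch.
public class FileRecord {
    private String path;
    private long size;

    public FileRecord() {}                      // Writable types need a no-arg constructor
    public FileRecord(String path, long size) { this.path = path; this.size = size; }

    public void write(DataOutput out) throws IOException {
        out.writeUTF(path);                     // fields are written in a fixed order...
        out.writeLong(size);
    }

    public void readFields(DataInput in) throws IOException {
        path = in.readUTF();                    // ...and read back in the same order
        size = in.readLong();
    }

    public String getPath() { return path; }
    public long getSize() { return size; }
}
```

The framework deserializes values by calling the no-arg constructor and then `readFields`, which is why both must exist and why the read order must mirror the write order exactly.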