
Processing a file using MapReduce

Original post: 2016-02-03 08:56:45. Tags: hadoop, mapreduce, apache-pig, cloudera

I use a simple Pig script that reads an input .txt file and adds a new field to each line.

The output relation is then stored in Avro format.

Is there any benefit to running such a script in mapreduce mode compared to local mode?

Thank you

1 answer

In local mode, your job runs on your local machine. In mapreduce mode, the job runs on a cluster: your file is split into pieces that are processed on several machines in parallel.
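
To make the difference concrete, here is a minimal Python sketch of the idea, not Pig or Hadoop code. It assumes the per-line work is "append one field", as in the question, and uses a thread pool as a stand-in for cluster machines. The function names (`add_field`, `split_lines`, `run_local`, `run_parallel`) and the field value are hypothetical, chosen only for illustration.

```python
from concurrent.futures import ThreadPoolExecutor


def add_field(line: str, value: str = "new_field") -> str:
    """Append one extra tab-separated field to a single input line."""
    return f"{line}\t{value}"


def split_lines(lines, num_splits):
    """Cut the input into roughly equal contiguous pieces, like input splits."""
    size = max(1, -(-len(lines) // num_splits))  # ceiling division
    return [lines[i:i + size] for i in range(0, len(lines), size)]


def process_split(split):
    """Map-only work: each split is handled independently of the others."""
    return [add_field(line) for line in split]


def run_local(lines):
    """'Local mode' stand-in: one worker processes the whole input."""
    return process_split(lines)


def run_parallel(lines, num_workers=3):
    """'Cluster' stand-in: threads play the role of machines (an assumption)."""
    splits = split_lines(lines, num_workers)
    with ThreadPoolExecutor(max_workers=num_workers) as pool:
        results = pool.map(process_split, splits)  # map() keeps split order
        return [row for part in results for row in part]


if __name__ == "__main__":
    sample = [f"row{i}" for i in range(7)]
    # Both modes produce the same rows; only where the work runs differs.
    assert run_local(sample) == run_parallel(sample)
```

Both paths give the same output, which is the point: the script does not change, only where the work is executed. In Pig itself the mode is selected with the `-x` flag (`pig -x local` or `pig -x mapreduce`).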

So, in theory, if your file is big enough (or there are lots of files like this to process), you'll be able to finish your job in less time in mapreduce mode.
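
What "big enough" means can be sketched with a toy cost model. The model below is an assumption, not a measurement: it supposes a cluster job pays a fixed startup cost and then processes data at the single-machine rate multiplied by the number of workers. All numbers (50 MB/s, 10 workers, 30 s startup) are invented for illustration, and the function names are hypothetical.

```python
def local_time(total_mb, mb_per_sec=50.0):
    """Estimated seconds to process the input on one machine."""
    return total_mb / mb_per_sec


def cluster_time(total_mb, workers=10, mb_per_sec=50.0, startup_sec=30.0):
    """Estimated seconds on a cluster: fixed startup plus parallel processing."""
    return startup_sec + total_mb / (mb_per_sec * workers)


def break_even_mb(workers=10, mb_per_sec=50.0, startup_sec=30.0):
    """Input size at which both modes take the same time.

    Solves total/r = s + total/(r*w) for total, giving s*r*w/(w-1).
    Requires more than one worker.
    """
    return startup_sec * mb_per_sec * workers / (workers - 1)


if __name__ == "__main__":
    for size_mb in (100, 1_000, 10_000, 100_000):
        faster = "cluster" if cluster_time(size_mb) < local_time(size_mb) else "local"
        print(f"{size_mb} MB: faster in {faster} mode under these assumptions")
```

Under this model, inputs below the break-even size finish sooner in local mode because the startup cost dominates, while larger inputs benefit from the cluster. Real break-even points depend on the cluster, the storage layer, and the job, so they are best found by timing both modes.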



