
Does Hadoop MapReduce open temporary files in HDFS?

When a MapReduce job runs, it must create a lot of temporary files to store the intermediate results of the various mappers and reducers. Are those temporary files written to HDFS?

If so, the NameNode's edit log could grow huge in a short time, given that it records each and every namespace transaction (file open, close, etc.). Can that be avoided by writing directly to the native filesystem instead of HDFS, or is that a bad idea?
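For context, my understanding is that even a trivial sequence of HDFS operations generates several edit-log transactions. A minimal sketch using the standard FileSystem API (the opcode names in the comments are my assumption about how these calls map to edit-log entries):

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class EditLogDemo {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        FileSystem fs = FileSystem.get(conf);

        // Each namespace mutation below is recorded as a transaction
        // in the NameNode's edit log.
        Path p = new Path("/tmp/editlog-demo.txt");
        try (FSDataOutputStream out = fs.create(p)) { // OP_ADD
            out.writeUTF("hello");
        }                                             // OP_CLOSE
        fs.delete(p, false);                          // OP_DELETE
    }
}
```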

The intermediate results of MapReduce are written to the local filesystem, not HDFS, and they are automatically removed after the job completes.

I mean to say that the output from a mapper is written to the local filesystem. The specific location can be configured, but by default it goes under /tmp/hadoop-<username>.
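A quick way to check where these local scratch directories point on a given cluster is to read the relevant configuration properties. A minimal sketch (the property names are the standard Hadoop ones; which of them are actually set depends on your Hadoop version and on having the cluster's *-site.xml files on the classpath):

```java
import org.apache.hadoop.conf.Configuration;

public class ShowLocalDirs {
    public static void main(String[] args) {
        Configuration conf = new Configuration();

        // hadoop.tmp.dir defaults to /tmp/hadoop-${user.name}; other
        // scratch locations are typically derived from it.
        System.out.println("hadoop.tmp.dir = " + conf.get("hadoop.tmp.dir"));

        // In MR2/YARN the node-local scratch dirs come from the NodeManager;
        // in classic MR1 the equivalent was mapred.local.dir.
        System.out.println("yarn.nodemanager.local-dirs = "
                + conf.get("yarn.nodemanager.local-dirs"));
        System.out.println("mapreduce.cluster.local.dir = "
                + conf.get("mapreduce.cluster.local.dir"));
    }
}
```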

Do you mean that the temporary files are created each time a mapper runs? If so, you can't avoid this, because a mapper's output is written to disk rather than held in memory. The TaskTracker takes care of setting up the MR job and creating temporary disk space for the mapper's intermediate output, and it also cleans up that temporary space once the MR job completes.
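You can't eliminate the spill to local disk, but you can reduce how often it happens by giving the map-side sort buffer more memory. A hedged sketch of the relevant job configuration (the property names are the standard MR2 ones; the buffer sizes, job name, and input/output paths here are placeholders, not recommendations):

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class SpillTuning {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();

        // Map output is buffered in memory first (default 100 MB) and only
        // spilled to local disk when the buffer fills. A larger buffer
        // means fewer spill files, not zero spills.
        conf.setInt("mapreduce.task.io.sort.mb", 256);

        // Start spilling at 90% full instead of the default 80%.
        conf.setFloat("mapreduce.map.sort.spill.percent", 0.90f);

        // Identity map/reduce job, just to show where the knobs go.
        Job job = Job.getInstance(conf, "spill-tuning-sketch");
        job.setJarByClass(SpillTuning.class);
        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}
```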

This is one of the bottlenecks of the MapReduce programming paradigm.

Any comments/feedback would be appreciated.
