Number of concurrently running mappers per node drops precipitously on Elastic MapReduce w/ AMI 3.1.0 and Hadoop 2.4.0 as cluster size increases

Original 2014-08-10 13:31:01 0 1 hadoop/ amazon-web-services/ amazon-ec2/ elastic-map-reduce/ yarn

In a related question (How to set the precise max number of concurrently running tasks per node in Hadoop 2.4.0 on Elastic MapReduce), I ask for formulas relating the number of concurrently running mappers/reducers to YARN and MR2 memory parameters. It turns out that on Elastic MapReduce, when my cluster has between 2 and 10 c3.2xlarge nodes, variations of the formulas mentioned there work okay, giving me 7-9 concurrently running mappers per node; but when the number of c3.2xlarges is 20 or 40, I get cluster underutilization: only 1-4 mappers run per node. Since my job is CPU-bound, this is particularly awful: MR2 delivers half the performance of MR1 for me.
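For illustration, here is a rough sketch of the kind of memory-based formula I mean. It assumes only memory is considered when placing containers and that each request is rounded up to a multiple of yarn.scheduler.minimum-allocation-mb; it ignores vcores and the container taken by the ApplicationMaster. The function name and the numbers are hypothetical, not EMR defaults, and the sketch does not explain the drop on larger clusters.

```python
import math


def max_concurrent_mappers(node_memory_mb, map_memory_mb, min_allocation_mb):
    """Memory-only estimate of how many map containers fit on one node.

    node_memory_mb    -- yarn.nodemanager.resource.memory-mb
    map_memory_mb     -- mapreduce.map.memory.mb
    min_allocation_mb -- yarn.scheduler.minimum-allocation-mb
    """
    # Assumption: each request is rounded up to a multiple of the minimum allocation.
    granted_mb = math.ceil(map_memory_mb / min_allocation_mb) * min_allocation_mb
    # Whole containers that fit in the memory the NodeManager offers.
    return node_memory_mb // granted_mb


# Hypothetical values, chosen only to show the arithmetic.
print(max_concurrent_mappers(11520, 1440, 256))
```

With these made-up values, each 1440 MB request is rounded up to 1536 MB, so 7 mappers fit in 11520 MB.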

Why is this happening?

1 answer

You will be limited by what the NameNode can dish out. You can and should specify a larger instance type for the NameNode when you increase the number of task nodes like this. The MR1 page was never updated for c3s: http://docs.aws.amazon.com/ElasticMapReduce/latest/DeveloperGuide/TaskConfiguration.html

solution1 1 ACCEPTED 2014-08-11 01:43:05
