
On-demand slave generation in a Hadoop cluster on EC2

I am planning to use Hadoop on EC2. Since we pay per instance used, it is not ideal to keep more instances running than a job actually requires.

In our application, many jobs are executed concurrently and we do not always know the slave requirement in advance. Is it possible to start the Hadoop cluster with a minimum number of slaves and later manage their availability based on demand?

i.e., create/destroy slaves on demand.

Sub-question: can a Hadoop cluster manage multiple jobs concurrently?

Thanks

The default scheduler in Hadoop is a simple FIFO one. You can look into using the FairScheduler, which assigns each running job a share of the cluster and has extensive configuration options to control those shares.
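As a minimal sketch of how a job ends up in a share-controlled pool: assuming the FairScheduler has been enabled on the JobTracker (mapred.jobtracker.taskScheduler set to org.apache.hadoop.mapred.FairScheduler in mapred-site.xml) and an "adhoc" pool has been defined in the allocation file, a client could tag its job like this. The class name, pool name, and paths are illustrative assumptions, not something from the original answer:

```java
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.FileInputFormat;
import org.apache.hadoop.mapred.FileOutputFormat;
import org.apache.hadoop.mapred.JobClient;
import org.apache.hadoop.mapred.JobConf;

public class PoolSubmitExample {
    public static void main(String[] args) throws Exception {
        JobConf job = new JobConf(PoolSubmitExample.class);
        job.setJobName("adhoc-report");

        // Place this job in the (hypothetical) "adhoc" pool so the
        // FairScheduler gives it the share of task slots configured for
        // that pool in the allocation file.
        job.set("mapred.fairscheduler.pool", "adhoc");

        FileInputFormat.setInputPaths(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));

        // Default identity map/reduce, just to keep the sketch runnable;
        // a real job would set its own mapper and reducer classes here.
        job.setOutputKeyClass(LongWritable.class);
        job.setOutputValueClass(Text.class);

        JobClient.runJob(job);
    }
}
```

Jobs submitted without a pool setting fall into the scheduler's default pool, so concurrent jobs still share the cluster rather than queueing strictly FIFO.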

As far as EC2 is concerned, you can easily start off with some number of nodes and then, once you see that there are too many tasks in the queue and all the slots in the cluster are occupied, add more of them. You simply have to start up an instance and launch a TaskTracker on it, which will register with the JobTracker.

However, you will have to have your own system that manages the startup and shutdown of these nodes.
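Here is a minimal sketch of what such a system might look like, using the AWS SDK for Java. It assumes a pre-built slave AMI that already has Hadoop installed and configured to point at the master; the AMI ID, instance type, and Hadoop install path are placeholders:

```java
import com.amazonaws.services.ec2.AmazonEC2;
import com.amazonaws.services.ec2.AmazonEC2ClientBuilder;
import com.amazonaws.services.ec2.model.RunInstancesRequest;
import com.amazonaws.services.ec2.model.RunInstancesResult;
import com.amazonaws.services.ec2.model.TerminateInstancesRequest;
import java.util.Base64;

public class SlaveScaler {
    private final AmazonEC2 ec2 = AmazonEC2ClientBuilder.defaultClient();

    // User-data script run at boot: starts a TaskTracker that registers
    // with the JobTracker. Assumes Hadoop is installed at this path on
    // the AMI and mapred-site.xml already names the master.
    private static final String USER_DATA =
        "#!/bin/bash\n" +
        "/usr/local/hadoop/bin/hadoop-daemon.sh start tasktracker\n";

    /** Launch one more slave when the task queue is backed up. */
    public String addSlave(String amiId, String instanceType) {
        RunInstancesRequest req = new RunInstancesRequest()
            .withImageId(amiId)            // hypothetical Hadoop slave AMI
            .withInstanceType(instanceType)
            .withMinCount(1)
            .withMaxCount(1)
            .withUserData(Base64.getEncoder().encodeToString(USER_DATA.getBytes()));
        RunInstancesResult res = ec2.runInstances(req);
        return res.getReservation().getInstances().get(0).getInstanceId();
    }

    /** Terminate an idle slave once its tasks have drained. */
    public void removeSlave(String instanceId) {
        ec2.terminateInstances(new TerminateInstancesRequest().withInstanceIds(instanceId));
    }
}
```

A monitoring loop would call addSlave when the JobTracker reports no free map/reduce slots and removeSlave when a node has been idle for a while; how aggressively to scale down is a policy decision left to you.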

Just want to let you know that we are doing some work on this in Apache Whirr. We are tracking progress in WHIRR-214. Vote or join the development. :)
