How to spin up a Spark EC2 cluster with Hadoop 2.6

Question

I am trying to get a Spark EC2 cluster running Spark 1.6.1 with Hadoop 2.6.

Here is what I have tried:

./spark-ec2 -i ~/.ssh/***.pem \
--instance-profile-name *** \
-k *** \
--region=us-east-1 \
--instance-type=m3.xlarge \
-s 2 \
--copy-aws-credentials \
launch test-cluster

However, this installed Hadoop 1.0. So I added the following option to the above command:

--hadoop-major-version=2 \

However, I soon realized that my application needs Hadoop 2.6 to run correctly. I could pass --hadoop-major-version=yarn, but that only installs Hadoop 2.4.

Could anyone tell me an easy way to do this?

Answer 1

These days, it is recommended to use the AWS Command-Line Interface (CLI) to create the cluster on Amazon EMR.

See: AWS CLI documentation for EMR create-cluster

However, no EMR release combines Spark 1.6.1 with Hadoop 2.6. The closest is emr-4.7.1, which has Spark 1.6.1 and Hadoop 2.7.2.

See: AWS EMR Releases, which shows this diagram:

[Image: diagram from the AWS EMR Releases page, not reproduced here]
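As a rough sketch of what the EMR route could look like, the script below assembles an `aws emr create-cluster` call for emr-4.7.1 with Spark. The region and instance type are taken from the question. The cluster name, the key pair name `my-key-pair`, and the instance count of 3 (one master plus two core nodes, mirroring `-s 2`) are placeholders, not values from the original answer. It also assumes the default EMR roles already exist in the account (they can be created with `aws emr create-default-roles`). By default it only prints the command for review.

```shell
#!/bin/sh
# Sketch only: values below are placeholders or assumptions, adjust before use.
RELEASE_LABEL="emr-4.7.1"     # release named in the answer (Spark 1.6.1, Hadoop 2.7.2)
CLUSTER_NAME="test-cluster"   # cluster name reused from the question
KEY_NAME="my-key-pair"        # hypothetical EC2 key pair name
INSTANCE_TYPE="m3.xlarge"     # instance type from the question
INSTANCE_COUNT=3              # assumption: 1 master + 2 core nodes
REGION="us-east-1"            # region from the question

# Build the command piece by piece so each option is easy to review.
CMD="aws emr create-cluster --name $CLUSTER_NAME"
CMD="$CMD --release-label $RELEASE_LABEL"
CMD="$CMD --applications Name=Spark"
CMD="$CMD --ec2-attributes KeyName=$KEY_NAME"
CMD="$CMD --instance-type $INSTANCE_TYPE"
CMD="$CMD --instance-count $INSTANCE_COUNT"
CMD="$CMD --region $REGION"
CMD="$CMD --use-default-roles"  # assumes the default EMR roles already exist

# Dry run by default; set RUN=1 to actually launch the cluster.
if [ "${RUN:-0}" = "1" ]; then
  $CMD
else
  echo "$CMD"
fi
```

Running the script as-is prints the assembled command; running it with `RUN=1` in the environment launches the cluster and will incur AWS charges.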

Answer 1: score 2, accepted, 2016-10-06 04:39:04