
How to spin up a Spark EC2 cluster with Hadoop 2.6

I am trying to get a Spark EC2 cluster running on Spark 1.6.1 with Hadoop 2.6.

Here is what I have tried:

./spark-ec2 -i ~/.ssh/***.pem \
--instance-profile-name *** \
-k *** \
--region=us-east-1 \
--instance-type=m3.xlarge \
-s 2 \
--copy-aws-credentials \
launch test-cluster

However, this installed Hadoop 1.0. So I added the following option to the above command:

--hadoop-major-version=2 \

However, I soon realized that my application needs Hadoop 2.6 to run correctly. I could pass --hadoop-major-version=yarn, but that only installs Hadoop 2.4.

Could anyone tell me an easy way to do this?

These days, it is recommended to use the AWS Command-Line Interface (CLI) to create an EMR cluster.

See: AWS CLI documentation for EMR create-cluster

However, no EMR release combines Spark 1.6.1 with Hadoop 2.6. The closest is emr-4.7.1, which has Spark 1.6.1 and Hadoop 2.7.2.
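As a rough sketch of what that could look like, the script below assembles an `aws emr create-cluster` call for emr-4.7.1 with Spark installed. The cluster name and key pair name are placeholders, not values from the question. The instance count of 3 (one master plus two core nodes) is only an assumption meant to mirror `-s 2`. The sketch also assumes the default EMR roles already exist in the account (they can be created with `aws emr create-default-roles`) and that the release label is still offered in your region. By default the script only prints the command, and it runs it only when `RUN=1` is set.

```shell
#!/usr/bin/env bash
# Sketch: build (and optionally run) an EMR create-cluster command.
# CLUSTER_NAME and KEY_NAME are hypothetical placeholders.
set -euo pipefail

CLUSTER_NAME="test-cluster"
KEY_NAME="my-key-pair"        # placeholder: name of an existing EC2 key pair
RELEASE_LABEL="emr-4.7.1"     # Spark 1.6.1 with Hadoop 2.7.2, per the answer above
INSTANCE_TYPE="m3.xlarge"
INSTANCE_COUNT=3              # assumption: 1 master + 2 core nodes

cmd=(aws emr create-cluster
  --name "$CLUSTER_NAME"
  --release-label "$RELEASE_LABEL"
  --applications Name=Spark
  --ec2-attributes "KeyName=$KEY_NAME"
  --instance-type "$INSTANCE_TYPE"
  --instance-count "$INSTANCE_COUNT"
  --use-default-roles
  --region us-east-1)

# Print the command for review; execute it only when RUN=1 is set.
printf '%q ' "${cmd[@]}"
echo
if [ "${RUN:-0}" = "1" ]; then
  "${cmd[@]}"
fi
```

If the cluster needs a specific instance profile, as in the original spark-ec2 command, `--use-default-roles` would have to be replaced by explicit role settings; that variation is not shown here.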

See: AWS EMR Releases, which shows the application versions included in each release.
