
AWS EMR and Spark 1.0.0

I've been running into some issues recently while trying to use Spark on an AWS EMR cluster.

I am creating the cluster with something like:

./elastic-mapreduce --create --alive \
--name "ll_Spark_Cluster" \
--bootstrap-action s3://elasticmapreduce/samples/spark/1.0.0/install-spark-shark.rb \
--bootstrap-name "Spark/Shark" \
--instance-type m1.xlarge \
--instance-count 2 \
--ami-version 3.0.4

The issue is that whenever I try to read data from S3, I get an exception. So if I start the spark-shell and try something like:

val data = sc.textFile("s3n://your_s3_data")

I get the following exception:

WARN storage.BlockManager: Putting block broadcast_1 failed
java.lang.NoSuchMethodError: com.google.common.hash.HashFunction.hashInt(I)Lcom/google/common/hash/HashCode;

This issue was caused by the Guava library: the version on the AMI is 11, while Spark needs version 14.
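A minimal sketch for spotting this kind of mismatch: list every standalone Guava jar under a set of directories, with the version parsed from the file name. The function name `list_guava_versions` and the example directories are assumptions for illustration, not part of the original setup. A copy of Guava bundled inside an assembly jar would not show up this way.

```shell
# List every guava jar under the given directories together with its version,
# parsed from the file name (e.g. guava-11.0.2.jar -> 11.0.2).
list_guava_versions() {
  find "$@" -type f -name 'guava-*.jar' 2>/dev/null | sort | while read -r jar; do
    version=$(basename "$jar" .jar)   # strip directory and .jar suffix
    version=${version#guava-}         # strip the "guava-" prefix
    printf '%s\t%s\n' "$version" "$jar"
  done
}

# Example call (directories are assumptions; adjust to your cluster layout):
# list_guava_versions /home/hadoop/share /home/hadoop/lib
```

If more than one version is printed, whichever jar comes first on the classpath wins, which would be consistent with the `NoSuchMethodError` above.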

I edited the bootstrap script from AWS to install Spark 1.0.2 and update the Guava library during the bootstrap action. You can get the gist here:

https://gist.github.com/tnbredillet/867111b8e1e600fa588e

Even after updating Guava I still had an issue. When I tried to save data to S3, an exception was thrown:

lzo.GPLNativeCodeLoader - Could not load native gpl library
java.lang.UnsatisfiedLinkError: no gplcompression in java.library.path

I solved that by adding the Hadoop native library directory to java.library.path. When I run a job I add the option

 -Djava.library.path=/home/hadoop/lib/native 

or, if I run the job through spark-submit, I pass the argument

 --driver-library-path /home/hadoop/lib/native
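For context, a sketch of how a full spark-submit command line with this argument might look. The helper name `submit_cmd`, the main class `com.example.MyJob`, and the jar path are hypothetical placeholders. Only the native library path comes from the post.

```shell
NATIVE_DIR=/home/hadoop/lib/native   # path taken from the post

# Print the spark-submit command line for a given main class and jar.
# Drop the leading 'echo' to launch the job instead of printing it.
submit_cmd() {
  echo spark-submit \
    --driver-library-path "$NATIVE_DIR" \
    --class "$1" \
    "$2"
}

# Hypothetical main class and application jar:
submit_cmd com.example.MyJob /home/hadoop/my-job.jar
```

Printing the command first makes it easy to check that the library path is actually on the command line before submitting.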
