I cannot connect to S3 using PySpark from local machine (Class org.apache.hadoop.fs.s3a.auth.IAMInstanceCredentialsProvider not found)
Class org.apache.hadoop.fs.s3a.auth.IAMInstanceCredentialsProvider not found when trying to write data on S3 bucket from Spark
I am trying to write data to an S3 bucket from my local machine:
from pyspark.sql import SparkSession

# configuration, kafka_server and kafka_topic are defined elsewhere in the project
spark = SparkSession.builder \
    .appName('application') \
    .config("spark.hadoop.fs.s3a.access.key", configuration.AWS_ACCESS_KEY_ID) \
    .config("spark.hadoop.fs.s3a.secret.key", configuration.AWS_ACCESS_SECRET_KEY) \
    .config("spark.hadoop.fs.s3a.impl", "org.apache.hadoop.fs.s3a.S3AFileSystem") \
    .getOrCreate()

# Read the Kafka topic from the earliest available offset
lines = spark.readStream \
    .format('kafka') \
    .option('kafka.bootstrap.servers', kafka_server) \
    .option('subscribe', kafka_topic) \
    .option("startingOffsets", "earliest") \
    .load()

# Append the stream to the S3 path as Parquet files
streaming_query = lines.writeStream \
    .format('parquet') \
    .outputMode('append') \
    .option('path', configuration.S3_PATH) \
    .start()

streaming_query.awaitTermination()
Hadoop version: 3.2.1, Spark version: 3.2.1
I have added the following dependency jars to the pyspark jars:
spark-sql-kafka-0-10_2.12:3.2.1, aws-java-sdk-s3:1.11.375, hadoop-aws:3.2.1
When I run the code, I get the following error:
py4j.protocol.Py4JJavaError: An error occurred while calling o68.start.
: java.io.IOException: From option fs.s3a.aws.credentials.provider
java.lang.ClassNotFoundException: Class org.apache.hadoop.fs.s3a.auth.IAMInstanceCredentialsProvider not found
In my case, it finally worked after I added the following setting: .config('spark.hadoop.fs.s3a.aws.credentials.provider', 'org.apache.hadoop.fs.s3a.SimpleAWSCredentialsProvider')
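As a minimal sketch of how that setting fits with the ones from the question, the settings can be collected in a dict and applied to the builder in one pass. The helper names `s3a_static_credentials_conf` and `apply_conf` are hypothetical and exist only for this illustration; the configuration keys and class names are the ones quoted in this thread.

```python
def s3a_static_credentials_conf(access_key, secret_key):
    """Return Spark settings for s3a access with a static key pair (hypothetical helper)."""
    return {
        "spark.hadoop.fs.s3a.impl": "org.apache.hadoop.fs.s3a.S3AFileSystem",
        "spark.hadoop.fs.s3a.access.key": access_key,
        "spark.hadoop.fs.s3a.secret.key": secret_key,
        # Name the credentials provider explicitly, as the answer suggests,
        # instead of relying on whatever provider list is otherwise picked up.
        "spark.hadoop.fs.s3a.aws.credentials.provider":
            "org.apache.hadoop.fs.s3a.SimpleAWSCredentialsProvider",
    }


def apply_conf(builder, conf):
    """Call builder.config(key, value) for every entry and return the builder."""
    for key, value in conf.items():
        builder = builder.config(key, value)
    return builder


# Usage (requires pyspark; the key values are placeholders):
# from pyspark.sql import SparkSession
# conf = s3a_static_credentials_conf("<access key>", "<secret key>")
# spark = apply_conf(SparkSession.builder.appName("application"), conf).getOrCreate()
```

Keeping the settings in one dict makes it easy to print them and confirm that the provider line is really being applied before the session starts.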
Also, all the hadoop jars in site-packages/pyspark/jars must be the same version: hadoop-aws-3.2.2, hadoop-client-api-3.2.2, hadoop-client-runtime-3.2.2, hadoop-yarn-server-web-proxy-3.2.2.
For hadoop-aws version 3.2.2, the aws-java-sdk-s3:1.11.563 package is required.
I also replaced guava-14.0.jar with guava-23.0.jar.
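One way to check the "same version" requirement is to group the hadoop jars by the version in their file names. The sketch below assumes the jars follow the usual `<artifact>-<version>.jar` naming; the function name `hadoop_versions` is hypothetical, and the sample file names in the comments are only examples.

```python
import re
from collections import defaultdict

# Matches names such as "hadoop-aws-3.2.2.jar" -> ("hadoop-aws", "3.2.2").
_HADOOP_JAR = re.compile(r"^(hadoop-[a-z0-9-]+?)-(\d+(?:\.\d+)+)\.jar$")


def hadoop_versions(jar_names):
    """Group hadoop-* artifact names by the version found in the file name."""
    by_version = defaultdict(list)
    for name in jar_names:
        match = _HADOOP_JAR.match(name)
        if match:
            artifact, version = match.groups()
            by_version[version].append(artifact)
    # Sort the artifact names so the result is stable and easy to read.
    return {version: sorted(artifacts) for version, artifacts in by_version.items()}


# Usage (requires pyspark; the jars directory is assumed to sit next to the package):
# import os, pyspark
# jars_dir = os.path.join(os.path.dirname(pyspark.__file__), "jars")
# for version, artifacts in hadoop_versions(os.listdir(jars_dir)).items():
#     print(version, artifacts)
```

If more than one version shows up, that is a prompt to look closer rather than proof of a problem, since an artifact with a hadoop- prefix may follow its own version scheme.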
I used the same packages as you. In my case, when I added the line below:
.config('spark.hadoop.fs.s3a.aws.credentials.provider', 'org.apache.hadoop.fs.s3a.SimpleAWSCredentialsProvider')
I got this error:
py4j.protocol.Py4JJavaError: An error occurred while calling o56.parquet.
: java.lang.NoSuchMethodError: 'void com.google.common.base.Preconditions.checkArgument(boolean, java.lang.String, java.lang.Object, java.lang.Object)'
....
To solve this, I installed `guava-30.0`.
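To see which Guava jar is actually on the classpath before replacing it, the file names in the jars directory can be scanned for old Guava versions. This is only a sketch: the helper names are hypothetical, and the default threshold of 20 is an assumption based on my reading of the Guava Javadoc, which marks the `checkArgument(boolean, String, Object, Object)` overload as added in 20.0; please verify it for your setup.

```python
import re

# Matches "guava-14.0.jar", "guava-14.0.1.jar", "guava-30.0-jre.jar", etc.
_GUAVA_JAR = re.compile(r"^guava-(\d+)\.(\d+)(?:\.(\d+))?(?:-[a-z]+)?\.jar$")


def guava_jars(jar_names):
    """Return (file name, major version) for each Guava jar in jar_names."""
    found = []
    for name in jar_names:
        match = _GUAVA_JAR.match(name)
        if match:
            found.append((name, int(match.group(1))))
    return found


def too_old(jar_names, min_major=20):
    """List Guava jars whose major version is below min_major (assumed threshold)."""
    return [name for name, major in guava_jars(jar_names) if major < min_major]


# Usage (requires pyspark; the jars directory is assumed to sit next to the package):
# import os, pyspark
# jars_dir = os.path.join(os.path.dirname(pyspark.__file__), "jars")
# print(too_old(os.listdir(jars_dir)))
```

If an old jar is listed alongside a newer one, both may be on the classpath, which is worth ruling out when the `NoSuchMethodError` persists after installing a newer Guava.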