
AWS Error Message: Requests specifying Server Side Encryption with AWS KMS managed keys require AWS Signature Version 4

I am facing the following error while writing to an S3 bucket using PySpark.

com.amazonaws.services.s3.model.AmazonS3Exception: Status Code: 400, AWS Service: Amazon S3, AWS Request ID: A0B0C0000000DEF0, AWS Error Code: InvalidArgument, AWS Error Message: Requests specifying Server Side Encryption with AWS KMS managed keys require AWS Signature Version 4.

I have applied server-side encryption on the S3 bucket using the AWS KMS service. I am using the following spark-submit command:

spark-submit --packages com.amazonaws:aws-java-sdk-pom:1.10.34,org.apache.hadoop:hadoop-aws:2.7.2 --jars sample-jar sample_pyspark.py 

This is the sample code I am working with:

from pyspark import SparkContext
from pyspark.sql import SQLContext, SparkSession

spark_context = SparkContext()
sql_context = SQLContext(spark_context)
spark = SparkSession.builder.appName('abc').getOrCreate()
hadoopConf = spark_context._jsc.hadoopConfiguration()
hadoopConf.set("fs.s3a.impl", "org.apache.hadoop.fs.s3a.S3AFileSystem")
# 'source_data' is an existing Spark DataFrame
source_data.coalesce(1).write.mode('overwrite').parquet("s3a://sample-bucket")

Note: I tried to write the Spark DataFrame to an S3 bucket without server-side encryption enabled, and it succeeded.

The error seems to be telling you to enable V4 S3 signatures on the Amazon SDK. One way to do it is from the command line:

spark-submit --conf spark.driver.extraJavaOptions='-Dcom.amazonaws.services.s3.enableV4' \
    --conf spark.executor.extraJavaOptions='-Dcom.amazonaws.services.s3.enableV4' \
    ... (other spark options)
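If you would rather set this from inside the job, the executor option can go into the SparkConf before the context is created, while the driver JVM (already running by that point, so spark.driver.extraJavaOptions would be ignored) can be patched through py4j. A minimal sketch of that workaround; the regional endpoint value is my placeholder, not something from your setup:

from pyspark import SparkConf, SparkContext

# Executor JVMs have not started yet, so this option still takes effect.
conf = SparkConf().set("spark.executor.extraJavaOptions",
                       "-Dcom.amazonaws.services.s3.enableV4")
spark_context = SparkContext(conf=conf)

# The driver JVM is already running, so set the system property on it
# directly instead of via spark.driver.extraJavaOptions.
spark_context._jvm.java.lang.System.setProperty(
    "com.amazonaws.services.s3.enableV4", "true")

hadoopConf = spark_context._jsc.hadoopConfiguration()
# V4 signing also needs the bucket's region-specific endpoint;
# replace eu-central-1 with your bucket's actual region.
hadoopConf.set("fs.s3a.endpoint", "s3.eu-central-1.amazonaws.com")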

That said, I agree with Steve that you should use a more recent Hadoop library.
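For example, with hadoop-aws 2.8 or later the S3A connector can request SSE-KMS on the write itself through configuration. A sketch under that assumption; the key ARN below is made up for illustration, substitute your own CMK:

# Requires org.apache.hadoop:hadoop-aws 2.8 or later on the classpath.
hadoopConf.set("fs.s3a.server-side-encryption-algorithm", "SSE-KMS")
# Hypothetical key ARN, for illustration only.
hadoopConf.set("fs.s3a.server-side-encryption.key",
               "arn:aws:kms:us-east-1:111122223333:key/example-key-id")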
