Spark 2.4 - dataframe write into s3 bucket
From my local PC, I am trying to write my DataFrame into S3. Below is my code snippet.
sparkContext.hadoopConfiguration.set("fs.s3a.awsAccessKeyId", Util.AWS_ACCESS_KEY)
sparkContext.hadoopConfiguration.set("fs.s3a.awsSecretAccessKey", Util.AWS_SECRET_ACCESS_KEY)
sparkContext.hadoopConfiguration.set("fs.s3a.impl", "org.apache.hadoop.fs.s3a.S3AFileSystem")
empTableDF.coalesce(1).write
.format("csv")
.option("header", "true")
.mode(SaveMode.Overwrite)
.save("s3a://welpocstg/")
At runtime I get the following exception:
com.amazonaws.AmazonClientException: Unable to load AWS credentials from any provider in the chain
My pom.xml:
<dependency>
    <groupId>org.apache.hadoop</groupId>
    <artifactId>hadoop-common</artifactId>
    <version>2.7.7</version>
</dependency>
<dependency>
    <groupId>org.apache.hadoop</groupId>
    <artifactId>hadoop-client</artifactId>
    <version>2.7.7</version>
</dependency>
<dependency>
    <groupId>org.apache.hadoop</groupId>
    <artifactId>hadoop-aws</artifactId>
    <version>2.7.7</version>
</dependency>
<!-- https://mvnrepository.com/artifact/org.apache.httpcomponents/httpclient -->
<dependency>
    <groupId>org.apache.httpcomponents</groupId>
    <artifactId>httpclient</artifactId>
    <version>4.5.6</version>
</dependency>
You can try the following changes. The S3A connector reads its credentials from fs.s3a.access.key and fs.s3a.secret.key; the fs.s3a.awsAccessKeyId / fs.s3a.awsSecretAccessKey names used above are not recognized by S3A, so the AWS SDK falls through its credential provider chain without finding anything, which is exactly the exception you see.
sparkContext.hadoopConfiguration.set("fs.s3a.access.key", Util.AWS_ACCESS_KEY)
sparkContext.hadoopConfiguration.set("fs.s3a.secret.key", Util.AWS_SECRET_ACCESS_KEY)
Seq("1","2","3").toDF("id")
.coalesce(1)
.write
.format("csv")
.option("header", "true")
.mode(SaveMode.Overwrite)
.save("s3a://welpocstg/")
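As a follow-up, the same credentials can also be supplied when the SparkSession is built, because any option prefixed with spark.hadoop. is copied into sparkContext.hadoopConfiguration. Below is a minimal, self-contained sketch of that approach; the local master, the WriteToS3 object name, and reading the keys from the AWS_ACCESS_KEY_ID / AWS_SECRET_ACCESS_KEY environment variables are assumptions for illustration, not part of the original post.

import org.apache.spark.sql.{SaveMode, SparkSession}

object WriteToS3 {
  def main(args: Array[String]): Unit = {
    // Any "spark.hadoop.*" option is copied into sparkContext.hadoopConfiguration,
    // so the S3A keys can be set here instead of on sparkContext directly.
    // Credentials are taken from environment variables purely for illustration.
    val spark = SparkSession.builder()
      .appName("s3a-write-sketch")
      .master("local[*]")
      .config("spark.hadoop.fs.s3a.access.key", sys.env("AWS_ACCESS_KEY_ID"))
      .config("spark.hadoop.fs.s3a.secret.key", sys.env("AWS_SECRET_ACCESS_KEY"))
      .getOrCreate()

    import spark.implicits._

    // Same write as above: a single CSV part file into the bucket root.
    Seq("1", "2", "3").toDF("id")
      .coalesce(1)
      .write
      .format("csv")
      .option("header", "true")
      .mode(SaveMode.Overwrite)
      .save("s3a://welpocstg/")

    spark.stop()
  }
}

If you leave the keys out entirely, the S3A default provider chain will still pick up those same environment variables, so this sketch only makes the lookup explicit.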