Writing a file to Amazon S3

I am trying the following:

import awscala._, s3._

implicit val s3 = S3()
val bucket = s3.createBucket("acme-datascience-lab")
bucket.put("sample.txt", new java.io.File("sample.txt"))

I get the following error:

Exception in thread "main" java.lang.NoSuchFieldError: EU_CENTRAL_1
    at awscala.Region0$.<init>(Region0.scala:27)
    at awscala.Region0$.<clinit>(Region0.scala)
    at awscala.package$.<init>(package.scala:3)
    at awscala.package$.<clinit>(package.scala)
    at awscala.s3.S3$.apply$default$2(S3.scala:18)
    at com.acme.spark.FlightDelays.HistoricalFlightDelayOutput$.main(HistoricalFlightDelayOutput.scala:164)
    at com.acme.spark.FlightDelays.HistoricalFlightDelayOutput.main(HistoricalFlightDelayOutput.scala)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:731)
    at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:181)
    at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:206)
    at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:121)
    at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)

It occurs on this line of code:

implicit val s3 = S3()

Here are the contents of my build.sbt file:

import AssemblyKeys._

assemblySettings

name := "acme-get-flight-delays"
version := "0.0.1"
scalaVersion := "2.10.5"

// additional libraries
libraryDependencies ++= Seq(
  "org.apache.spark" % "spark-core_2.10" % "1.6.0" % "provided",
  "org.apache.spark" %% "spark-sql" % "1.6.0",
  "org.apache.spark" %% "spark-hive" % "1.6.0",
  "org.scalanlp" %% "breeze" % "0.11.2",
  "org.scalanlp" %% "breeze-natives" % "0.11.2",
  "net.liftweb" %% "lift-json" % "2.5+",
  "org.apache.hadoop" % "hadoop-client" % "2.6.0",
  "org.apache.hadoop" % "hadoop-aws" % "2.6.0",
  "com.amazonaws" % "aws-java-sdk" % "1.0.002",
  "com.github.seratch" %% "awscala" % "0.5.+"
)

resolvers ++= Seq(
  "Spark Packages Repo" at "http://dl.bintray.com/spark-packages/maven",
  "JBoss Repository" at "http://repository.jboss.org/nexus/content/repositories/releases/",
  "Spray Repository" at "http://repo.spray.cc/",
  "Cloudera Repository" at "https://repository.cloudera.com/artifactory/cloudera-repos/",
  "Akka Repository" at "http://repo.akka.io/releases/",
  "Twitter4J Repository" at "http://twitter4j.org/maven2/",
  "Apache HBase" at "https://repository.apache.org/content/repositories/releases",
  "Twitter Maven Repo" at "http://maven.twttr.com/",
  "scala-tools" at "https://oss.sonatype.org/content/groups/scala-tools",
  "Typesafe repository" at "http://repo.typesafe.com/typesafe/releases/",
  "Second Typesafe repo" at "http://repo.typesafe.com/typesafe/maven-releases/",
  "Mesosphere Public Repository" at "http://downloads.mesosphere.io/maven",
  Resolver.sonatypeRepo("public")
)

mergeStrategy in assembly <<= (mergeStrategy in assembly) { (old) =>
  {
    case m if m.toLowerCase.endsWith("manifest.mf") => MergeStrategy.discard
    case m if m.startsWith("META-INF") => MergeStrategy.discard
    case PathList("javax", "servlet", xs @ _*) => MergeStrategy.first
    case PathList("org", "apache", xs @ _*) => MergeStrategy.first
    case PathList("org", "jboss", xs @ _*) => MergeStrategy.first
    case "about.html"  => MergeStrategy.rename
    case "reference.conf" => MergeStrategy.concat
    case _ => MergeStrategy.first
  }
}

// Configure JAR used with the assembly plug-in
jarName in assembly := "acme-get-flight-delays.jar"

// A special option to exclude Scala itself from our assembly JAR, since Spark
// already bundles in Scala.
assemblyOption in assembly := (assemblyOption in assembly).value.copy(includeScala = false)

You can use the AWS SDK for Java directly, or a wrapper like AWScala that lets you do something like this:

import awscala._, s3._

implicit val s3 = S3()
val bucket = s3.bucket("your-bucket") // s3.bucket returns Option[Bucket]
bucket.get.put("sample.txt", new java.io.File("sample.txt"))

Add the following to build.sbt:

libraryDependencies += "com.github.seratch" %% "awscala" % "0.3.+"

The following code will do the job:

import com.amazonaws.regions.{Region, Regions}

implicit val s3 = S3()
// AWScala's S3 client extends the SDK's AmazonS3Client, so setRegion is available
s3.setRegion(Region.getRegion(Regions.US_EAST_1))
val bucket = s3.bucket("acme-datascience-lab") // returns Option[Bucket]
bucket.get.put("sample.txt", new java.io.File("sample.txt"))
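
If the bucket might not exist, a variant of the same call that avoids Option.get (a sketch using the same AWScala API) is:

s3.bucket("acme-datascience-lab") match {
  case Some(bucket) => bucket.put("sample.txt", new java.io.File("sample.txt"))
  case None         => sys.error("bucket acme-datascience-lab does not exist")
}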
