
Spark SQL toDF method fails with java.lang.NoSuchMethodError

Objective

Understand the cause of this problem and its solution. The problem happens when the application is run with spark-submit. Appreciate the help.

spark-submit --class "AuctionDataFrame" --master spark://<hostname>:7077 auction-project_2.11-1.0.jar

The same code does not cause an error when run line by line in spark-shell.

...
scala>     val auctionsDF = auctionsRDD.toDF()
auctionsDF: org.apache.spark.sql.DataFrame = [aucid: string, bid: float, bidtime: float, bidder: string, bidrate: int, openbid: float, price: float, itemtype: string, dtl: int]
scala> auctionsDF.printSchema()
root
 |-- aucid: string (nullable = true)
 |-- bid: float (nullable = false)
 |-- bidtime: float (nullable = false)
 |-- bidder: string (nullable = true)
 |-- bidrate: integer (nullable = false)
 |-- openbid: float (nullable = false)
 |-- price: float (nullable = false)
 |-- itemtype: string (nullable = true)
 |-- dtl: integer (nullable = false)
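For context, spark-shell runs on the Scala version that the installed Spark itself was built with, so no binary mismatch can occur there. This can be confirmed from inside the shell with a standard library call (the output shown is illustrative and matches the banner in the Environment section below):

scala> scala.util.Properties.versionString
res0: String = version 2.11.7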

Problem

Calling the toDF method to convert the RDD into a DataFrame causes the error below.

Exception in thread "main" java.lang.NoSuchMethodError: scala.reflect.api.JavaUniverse.runtimeMirror(Ljava/lang/ClassLoader;)Lscala/reflect/api/JavaMirrors$JavaMirror;
    at AuctionDataFrame$.main(AuctionDataFrame.scala:52)
    at AuctionDataFrame.main(AuctionDataFrame.scala)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:731)
    at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:181)
    at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:206)
    at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:121)
    at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
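The missing method, scala.reflect.api.JavaUniverse.runtimeMirror, lives in scala-reflect (toDF resolves the case class schema through Scala reflection), and its binary signature differs between Scala 2.10 and 2.11, so this error is a typical symptom of mixing the two. A minimal diagnostic sketch, not part of the original program, that can be dropped into main to see which scala-library the driver actually loaded:

// Diagnostic sketch: print the Scala version present at runtime and where it came from.
println(scala.util.Properties.versionString)
// CodeSource can be null for bootstrap-loaded classes, hence the Option wrapper.
println(Option(classOf[Option[_]].getProtectionDomain.getCodeSource).map(_.getLocation))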

Code

import org.apache.spark.{SparkConf, SparkContext}

// Schema of one auction record parsed from the CSV input.
case class Auctions(
  aucid: String,
  bid: Float,
  bidtime: Float,
  bidder: String,
  bidrate: Int,
  openbid: Float,
  price: Float,
  itemtype: String,
  dtl: Int)

object AuctionDataFrame {
  // Column indices into the split CSV line.
  val AUCID = 0
  val BID = 1
  val BIDTIME = 2
  val BIDDER = 3
  val BIDRATE = 4
  val OPENBID = 5
  val PRICE = 6
  val ITEMTYPE = 7
  val DTL = 8

  def main(args: Array[String]) {
    val conf = new SparkConf().setAppName("AuctionDataFrame")
    val sc = new SparkContext(conf)
    val sqlContext = new org.apache.spark.sql.SQLContext(sc)
    import sqlContext.implicits._

    val inputRDD = sc.textFile("/user/wynadmin/auctiondata.csv").map(_.split(","))
    val auctionsRDD = inputRDD.map(a =>
      Auctions(
        a(AUCID),
        a(BID).toFloat,
        a(BIDTIME).toFloat,
        a(BIDDER),
        a(BIDRATE).toInt,
        a(OPENBID).toFloat,
        a(PRICE).toFloat,
        a(ITEMTYPE),
        a(DTL).toInt))
    val auctionsDF = auctionsRDD.toDF()  // <--- line 52 causing the error.
  }
}

build.sbt

name := "Auction Project"

version := "1.0"

scalaVersion := "2.11.8"
//scalaVersion := "2.10.6"

/* 
libraryDependencies ++= Seq(
    "org.apache.spark" %% "spark-core" % "1.6.2",
    "org.apache.spark" %% "spark-sql" % "1.6.2",
    "org.apache.spark" %% "spark-mllib" % "1.6.2"
)
*/

libraryDependencies ++= Seq(
    "org.apache.spark" %% "spark-core" % "1.6.2" % "provided",
    "org.apache.spark" %% "spark-sql" % "1.6.2" % "provided",
    "org.apache.spark" %% "spark-mllib" % "1.6.2" % "provided"
)
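Note that besides the managed dependencies above, sbt treats every jar under lib/ as an unmanaged dependency. The full compile classpath can be inspected from the sbt 0.13 shell with a standard task (hypothetical invocation, not from the original post), which will reveal any stray Spark jars picked up from lib/:

> show compile:dependencyClasspath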

Environment

Spark on Ubuntu 14.04:

      ____              __
     / __/__  ___ _____/ /__
    _\ \/ _ \/ _ `/ __/  '_/
   /___/ .__/\_,_/_/ /_/\_\   version 1.6.2
      /_/

Using Scala version 2.11.7 (Java HotSpot(TM) 64-Bit Server VM, Java 1.8.0_92)

sbt on Windows:

D:\>sbt sbtVersion
[info] Set current project to root (in build file:/D:/)
[info] 0.13.12

Research

Looked into similar issues, which suggest an incompatibility between the Scala version the application was compiled with and the one Spark was compiled with.

Hence changed the Scala version in build.sbt to 2.10, which produced a 2.10 jar, but the error persisted. Using % "provided" or not does not change the error.

scalaVersion := "2.10.6"

Cause

Spark 1.6.2 on the cluster was compiled from source with Scala 2.11. However, spark-1.6.2-bin-without-hadoop.tgz had been downloaded and placed in the lib/ directory.

I believe that because spark-1.6.2-bin-without-hadoop.tgz was compiled with Scala 2.10, and sbt picks up whatever sits under lib/ as unmanaged dependencies, it caused the compatibility issue.
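One way to verify which Scala version a Spark build targets is to read the library.properties file that scala-library embeds, and that the pre-2.0 Spark assembly jars bundle. A minimal sketch, with a hypothetical jar path:

import java.util.jar.JarFile

// Hypothetical path: point this at a jar from the downloaded Spark distribution.
val jar = new JarFile("lib/spark-assembly-1.6.2-hadoop2.6.0.jar")
// scala-library ships a library.properties file with a version.number entry;
// if the assembly bundles scala-library, it should be present at the jar root.
Option(jar.getEntry("library.properties")).foreach { entry =>
  val props = new java.util.Properties()
  props.load(jar.getInputStream(entry))
  println(props.getProperty("version.number"))  // e.g. 2.10.x for a Scala 2.10 build
}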

Fix

Remove spark-1.6.2-bin-without-hadoop.tgz from the lib directory and run "sbt package" with the library dependencies below.

libraryDependencies ++= Seq(
    "org.apache.spark" %% "spark-core" % "1.6.2" % "provided",
    "org.apache.spark" %% "spark-sql" % "1.6.2" % "provided",
    "org.apache.spark" %% "spark-mllib" % "1.6.2" % "provided"
)
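With the mismatched jars gone, repackage and resubmit using the same command as at the top of the post:

sbt package
spark-submit --class "AuctionDataFrame" --master spark://<hostname>:7077 auction-project_2.11-1.0.jar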
