
Spark MLlib example, NoSuchMethodError: org.apache.spark.sql.SQLContext.createDataFrame()

I'm following the documentation example "Example: Estimator, Transformer, and Param".

And I got this error message:

15/09/23 11:46:51 INFO BlockManagerMaster: Registered BlockManager
Exception in thread "main" java.lang.NoSuchMethodError: scala.reflect.api.JavaUniverse.runtimeMirror(Ljava/lang/ClassLoader;)Lscala/reflect/api/JavaUniverse$JavaMirror;
    at SimpleApp$.main(hw.scala:75)

Line 75 is the call to sqlContext.createDataFrame():

import java.util.Random

import org.apache.log4j.Logger
import org.apache.log4j.Level

import scala.io.Source

import org.apache.spark.SparkConf
import org.apache.spark.SparkContext
import org.apache.spark.SparkContext._
import org.apache.spark.rdd._


import org.apache.spark.ml.classification.LogisticRegression
import org.apache.spark.ml.param.ParamMap
import org.apache.spark.mllib.linalg.{Vector, Vectors}
import org.apache.spark.mllib.recommendation.{ALS, Rating, MatrixFactorizationModel}
import org.apache.spark.sql.Row
import org.apache.spark.sql.SQLContext
import org.apache.spark.sql.DataFrame
import org.apache.spark.sql.functions._

object SimpleApp {
     def main(args: Array[String]) {
       val conf = new SparkConf().setAppName("Simple Application").setMaster("local[4]");
       val sc = new SparkContext(conf)
       val sqlContext = new SQLContext(sc)
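       // hw.scala line 75: this createDataFrame call is where the NoSuchMethodError is thrown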
       val training = sqlContext.createDataFrame(Seq(
         (1.0, Vectors.dense(0.0, 1.1, 0.1)),
         (0.0, Vectors.dense(2.0, 1.0, -1.0)),
         (0.0, Vectors.dense(2.0, 1.3, 1.0)),
         (1.0, Vectors.dense(0.0, 1.2, -0.5))
       )).toDF("label", "features")
    }
}

And my sbt build file is as below:

lazy val root = (project in file(".")).
  settings(
    name := "hello",
    version := "1.0",
    scalaVersion := "2.11.4"
  )

libraryDependencies ++= {
    Seq(
        "org.apache.spark" %% "spark-core" % "1.4.1" % "provided",
        "org.apache.spark" %% "spark-sql" % "1.4.1" % "provided",
        "org.apache.spark" % "spark-hive_2.11" % "1.4.1",
        "org.apache.spark"  % "spark-mllib_2.11" % "1.4.1" % "provided",
        "org.apache.spark" %% "spark-streaming" % "1.4.1" % "provided",
        "org.apache.spark" %% "spark-streaming-kinesis-asl" % "1.4.1" % "provided"
    )
}

I tried searching around and found this post, which is very similar to my issue, and I tried changing the Spark versions in my sbt settings (spark-mllib_2.11 to 2.10, and Spark 1.4.1 to 1.5.0), but that brought even more dependency conflicts.

My intuition is that it's some version problem, but I cannot figure it out myself. Could anyone please help? Thanks a lot.

It's working for me now; just for the record, this references @MartinSenne's answer.

What I did is as below:

  1. cleared all compiled files under the "project" folder
  2. Scala version 2.10.4 (previously 2.11.4)
  3. changed spark-sql to: "org.apache.spark" %% "spark-sql" % "1.4.1" % "provided"
  4. changed MLlib to: "org.apache.spark" %% "spark-mllib" % "1.4.1" % "provided" (a consolidated build.sbt sketch follows after this list)
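For reference, a minimal build.sbt reflecting those changes might look like the sketch below. This is my own consolidation, not the exact file from the answer: it keeps only the core, sql, and mllib dependencies mentioned above and assumes %% is used throughout so sbt appends the _2.10 suffix to match scalaVersion; re-add the hive/streaming/kinesis dependencies from the original file if you need them.

lazy val root = (project in file(".")).
  settings(
    name := "hello",
    version := "1.0",
    // match the Scala version the prebuilt Spark 1.4.1 binaries are compiled against
    scalaVersion := "2.10.4"
  )

libraryDependencies ++= Seq(
  // %% appends the _2.10 suffix automatically, keeping artifacts in sync with scalaVersion
  "org.apache.spark" %% "spark-core"  % "1.4.1" % "provided",
  "org.apache.spark" %% "spark-sql"   % "1.4.1" % "provided",
  "org.apache.spark" %% "spark-mllib" % "1.4.1" % "provided"
)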

@note:

  1. I've already started a Spark cluster, and I use "sh spark-submit /path_to_folder/hello/target/scala-2.10/hello_2.10-1.0.jar" to submit the jar to the Spark master (a fuller spark-submit sketch follows after this list). Running it with "sbt run" will fail.
  2. when changing from scala-2.11 to scala-2.10, remember that the jar file path and name also change from "scala-2.11/hello_2.11-1.0.jar" to "scala-2.10/hello_2.10-1.0.jar". When I re-packaged everything, I forgot to update the jar name in the submit command, so I packaged "hello_2.10-1.0.jar" but kept submitting the old "hello_2.11-1.0.jar", which caused me extra problems...
  3. I tried both "val sqlContext = new org.apache.spark.sql.SQLContext(sc)" and "val sqlContext = new org.apache.spark.sql.hive.HiveContext(sc)"; both work with createDataFrame()
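For completeness, the shape of the submit command I'd expect for the rebuilt jar is roughly the following. The --class value comes from the SimpleApp object above; the --master URL is a placeholder assumption for a standalone master and should be replaced with your own (or omitted, as in the command above, if the defaults work for your setup):

spark-submit \
  --class SimpleApp \
  --master spark://your-master-host:7077 \
  /path_to_folder/hello/target/scala-2.10/hello_2.10-1.0.jar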
