
Spark Streaming StreamingContext error

I'm a Java veteran trying to learn Scala + Spark Streaming. I downloaded the Eclipse-based Scala IDE plus the Spark core and Spark Streaming jars (both for Scala 2.10) and tried out the example. On this line:

val ssc = new StreamingContext(conf, Seconds(1));

Eclipse reports:

Description Resource Path Location Type
bad symbolic reference. A signature in StreamingContext.class refers to term conf in package org.apache.hadoop which is not available. It may be completely missing from the current classpath, or the version on the classpath might be incompatible with the version used when compiling StreamingContext.class. Lab.scala /AirStream/src line 10 Scala Problem

Is there something I missed here? SparkContext compiles with no errors, but StreamingContext hits this error every time.
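
(For reference, a minimal, self-contained version of the failing setup would look roughly like the sketch below; the object name, app name, master, and socket source are assumptions, since the question only shows the StreamingContext line.)

import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}

object Lab {
  def main(args: Array[String]): Unit = {
    // Assumed configuration; only the StreamingContext line appears in the question.
    val conf = new SparkConf().setAppName("AirStream").setMaster("local[2]")
    val ssc = new StreamingContext(conf, Seconds(1)) // the line that fails to compile
    // Hypothetical source so the streaming job has something to do.
    val lines = ssc.socketTextStream("localhost", 9999)
    lines.print()
    ssc.start()
    ssc.awaitTermination()
  }
}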

I ran into roughly the same issue. Here is the Scala class I was writing for Scala/Spark practice:

package practice.spark

import org.apache.spark._
import org.apache.spark.sql._

// Stand-in for the application's own configuration type (not shown in the original).
case class Configuration(appName: String)

object SparkService {
  def sparkInit(sparkInstanceConfig: Configuration): SparkService = {
    val sparkConf = new SparkConf().setAppName(sparkInstanceConfig.appName)
    new SparkService(sparkConf)
  }
}

class SparkService(sparkConf: SparkConf) {
  val sc = new SparkContext(sparkConf)
  val sql = new org.apache.spark.sql.SQLContext(sc)
}
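
Used from a main method, it would look something like this (the app name is a placeholder):

object Main {
  def main(args: Array[String]): Unit = {
    // Build the service and run a trivial job to confirm the classpath is sound.
    val service = SparkService.sparkInit(Configuration(appName = "BinAnalysisNew"))
    val sum = service.sc.parallelize(1 to 100).sum()
    println(s"sum = $sum")
    service.sc.stop()
  }
}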

In my Eclipse project, under Properties > Java Build Path > Libraries, I had the JRE 8 library, the Scala 2.11 library, spark-core_2.11, and spark-sql_2.11. I was getting the error:

Description Resource Path Location Type
missing or invalid dependency detected while loading class file 'SparkContext.class'. Could not access term hadoop in package org.apache, because it (or its dependencies) are missing. Check your build definition for missing or conflicting dependencies. (Re-run with -Ylog-classpath to see the problematic classpath.) A full rebuild may help if 'SparkContext.class' was compiled against an incompatible version of org.apache. BinAnalysisNew Unknown Scala Problem

I then added the hadoop-core jar to my Java build path, which cleared up the issue. I used the latest version of that jar.

This issue can also be avoided by using Gradle, sbt, or another build tool that pulls in the transitive dependencies of each jar used in the project, as in the sketch below.
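
For example, with sbt the same dependencies can be declared once and their transitive Hadoop dependencies resolved automatically (the version numbers below are illustrative assumptions; use the ones that match your environment):

// build.sbt — a minimal sketch; version numbers are assumptions
scalaVersion := "2.11.12"

libraryDependencies ++= Seq(
  "org.apache.spark" %% "spark-core"      % "2.4.8",
  "org.apache.spark" %% "spark-sql"       % "2.4.8",
  "org.apache.spark" %% "spark-streaming" % "2.4.8"
)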

Make sure the version of Hadoop on the classpath matches the one the Spark Streaming jar was built against. There may also be some dependencies that Spark Streaming expects the cluster environment to provide; if so, you will need to add them manually to the classpath when running in Eclipse.
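
As a sketch, in sbt that combination might look like this, with the cluster-provided jars marked provided and Hadoop pinned explicitly (both version numbers are assumptions; match them to your Spark build):

libraryDependencies ++= Seq(
  // Provided by the cluster at runtime; still on the compile classpath,
  // so add it to the run configuration yourself when launching from Eclipse.
  "org.apache.spark" %% "spark-streaming" % "2.4.8" % "provided",
  // Pin Hadoop to the version your Spark distribution was built against.
  "org.apache.hadoop" %  "hadoop-client"  % "2.7.3"
)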
