
Spark Streaming StreamingContext error

I'm a Java veteran trying to learn Scala and Spark Streaming. I downloaded the Eclipse-based Scala IDE plus the Spark core and Spark Streaming jars (both for Scala 2.10) and tried out the example, but I'm getting an error on this line:

val ssc = new StreamingContext(conf, Seconds(1));

Description Resource Path Location Type bad symbolic reference. A signature in StreamingContext.class refers to term conf in package org.apache.hadoop which is not available. It may be completely missing from the current classpath, or the version on the classpath might be incompatible with the version used when compiling StreamingContext.class. Lab.scala /AirStream/src line 10 Scala Problem

Is there something I missed here? SparkContext compiles without errors, but StreamingContext gets this error every time.

I ran into approximately the same issue. Here is the Scala class I was writing for Scala/Spark practice:

package practice.spark

import org.apache.spark._
import org.apache.spark.sql._

// Placeholder for my project-specific config type (only appName is needed here).
case class Configuration(appName: String)

object SparkService {
  def sparkInit(sparkInstanceConfig: Configuration): SparkService = {
    val sparkConf = new SparkConf().setAppName(sparkInstanceConfig.appName)
    new SparkService(sparkConf) // the last expression is the result; no `return` needed
  }
}

class SparkService(sparkConf: SparkConf) {
  val sc = new SparkContext(sparkConf)
  val sql = new org.apache.spark.sql.SQLContext(sc)
}
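
For illustration, a hedged usage sketch of the class above (Configuration is the placeholder case class from the snippet, standing in for my project's real config type, and the app name is made up):

val service = SparkService.sparkInit(Configuration("SparkServiceDemo"))
// Quick smoke test: count a small local RDD, then shut down.
val n = service.sc.parallelize(1 to 100).count()
println(s"count = $n")
service.sc.stop()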

In my Eclipse project, under Properties > Java Build Path > Libraries, I had the JRE 8 library, the Scala 2.11 library, spark-core_2.11, and spark-sql_2.11. I was getting the error:

Description Resource Path Location Type missing or invalid dependency detected while loading class file 'SparkContext.class'. Could not access term hadoop in package org.apache, because it (or its dependencies) are missing. Check your build definition for missing or conflicting dependencies. (Re-run with -Ylog-classpath to see the problematic classpath.) A full rebuild may help if 'SparkContext.class' was compiled against an incompatible version of org.apache. BinAnalysisNew Unknown Scala Problem

I then added the hadoop-core jar to my Java build path, and that cleared up the issue. I used the latest version of that jar.

This issue may also be cleared up by using Gradle or some other build tool that picks up the transitive dependencies of each jar used in the project.
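
For example, a minimal build.sbt sketch (the Spark and Scala versions below are illustrative; pick the ones that match your cluster). The %% operator appends the Scala version suffix to each artifact name, and sbt then resolves the transitive org.apache.hadoop dependencies automatically:

// build.sbt
scalaVersion := "2.11.8"

libraryDependencies ++= Seq(
  "org.apache.spark" %% "spark-core"      % "1.6.3",
  "org.apache.spark" %% "spark-sql"       % "1.6.3",
  "org.apache.spark" %% "spark-streaming" % "1.6.3"
)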

Make sure the version of Hadoop on the classpath matches the one that the Spark Streaming jar was built against. There may also be dependencies that Spark Streaming expects the cluster environment to provide; if so, you will need to add them to the classpath manually when running in Eclipse.
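
For reference, a minimal end-to-end Spark Streaming program that should compile and run once the classpath is sorted out (the object name, host, and port are placeholders; run `nc -lk 9999` first to provide a socket source):

import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}

object StreamingSmokeTest {
  def main(args: Array[String]): Unit = {
    // local[2]: one thread for the socket receiver, one for processing
    val conf = new SparkConf().setAppName("StreamingSmokeTest").setMaster("local[2]")
    val ssc = new StreamingContext(conf, Seconds(1))

    val lines = ssc.socketTextStream("localhost", 9999)
    lines.count().print() // prints the number of lines received per 1-second batch

    ssc.start()
    ssc.awaitTermination()
  }
}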
