java.lang.NoSuchMethodError: org.apache.spark.storage.BlockManager

I get the following error when connecting to a Kinesis stream from Spark Streaming:

java.lang.NoSuchMethodError: org.apache.spark.storage.BlockManager.get(Lorg/apache/spark/storage/BlockId;)Lscala/Option;
    at org.apache.spark.streaming.kinesis.KinesisBackedBlockRDD.getBlockFromBlockManager$1(KinesisBackedBlockRDD.scala:104)

My Spark Streaming code is:

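from pyspark import SparkContext
from pyspark.streaming import StreamingContext
from pyspark.streaming.kinesis import KinesisUtils, InitialPositionInStream

def classify(ele):
    # Print every non-empty record
    if ele != "":
        print(ele)

def stream_rdd(rdd):
    # Skip empty batches (common while the receiver is still connecting)
    if not rdd.isEmpty():
        rdd.foreach(classify)

sc = SparkContext(appName="PythonStreamingTest")
ssc = StreamingContext(sc, 10)  # 10-second batch interval
dstream = KinesisUtils.createStream(
    ssc, "PythonStreamingTest", "questions", "https://kinesis.us-west-2.amazonaws.com", "us-west-2", InitialPositionInStream.TRIM_HORIZON, 1)
dstream.foreachRDD(stream_rdd)

ssc.start()
ssc.awaitTermination()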

At first the batches are empty, since it takes a while to connect to the Kinesis stream. Then the job suddenly fails. The full trace is:

17/04/02 17:52:00 ERROR Executor: Exception in task 0.0 in stage 1.0 (TID 1)
java.lang.NoSuchMethodError: org.apache.spark.storage.BlockManager.get(Lorg/apache/spark/storage/BlockId;)Lscala/Option;
    at org.apache.spark.streaming.kinesis.KinesisBackedBlockRDD.getBlockFromBlockManager$1(KinesisBackedBlockRDD.scala:104)
    at org.apache.spark.streaming.kinesis.KinesisBackedBlockRDD.compute(KinesisBackedBlockRDD.scala:117)
    at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:323)
    at org.apache.spark.rdd.RDD.iterator(RDD.scala:287)
    at org.apache.spark.api.python.PythonRDD.compute(PythonRDD.scala:63)
    at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:323)
    at org.apache.spark.rdd.RDD.iterator(RDD.scala:287)
    at org.apache.spark.api.python.PythonRDD.compute(PythonRDD.scala:63)
    at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:323)
    at org.apache.spark.rdd.RDD.iterator(RDD.scala:287)
    at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:87)
    at org.apache.spark.scheduler.Task.run(Task.scala:99)
    at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:282)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)

I submit my job with the following command:

spark-submit --jars spark-streaming-kinesis-asl-assembly_2.11-2.0.0.jar --driver-memory 5g Question_Type_Classification_testing_purpose/classifier_streaming.py

I am running the code on a local machine, so 5g of driver memory should be enough for the executor. The same code works on Spark 1.6. I recently upgraded to Spark 2.1 and the code no longer runs. I updated my Kinesis jar and Py4J as well.

I tested my code by writing a Kinesis consumer, and it gets the stream perfectly fine.

Can anyone tell me what the issue might be? Is the empty stream causing the problem? If so, why am I getting an empty stream with Spark Streaming? Any help is really appreciated.

spark-streaming-kinesis-asl is one of Spark's own libraries and uses Spark internal APIs (e.g., BlockManager.get). The method signature of BlockManager.get was changed in https://github.com/apache/spark/commit/29cfab3f1524c5690be675d24dda0a9a1806d6ff#diff-2b643ea78c1add0381754b1f47eec132L605, so you will see NoSuchMethodError if the Spark version is >= 2.0.1 but the spark-streaming-kinesis-asl version is < 2.0.1. That is the case here: you are running Spark 2.1 with the 2.0.0 assembly jar.

Generally, because Spark does not promise to keep internal APIs stable between releases, you must use the spark-streaming-kinesis-asl version that exactly matches your Spark version.
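As a rough illustration of that rule, the sketch below compares a Spark version string against the version embedded in the connector jar's file name. The helper names connector_version and versions_match are hypothetical, and the parsing assumes the usual <artifact>_<scala>-<version>.jar naming; in a real job the Spark version could be taken from sc.version.

```python
import re

def connector_version(jar_name):
    """Return (scala_version, artifact_version) parsed from a jar file name.

    Assumes the conventional <artifact>_<scala>-<version>.jar layout.
    """
    match = re.search(r"_(\d+\.\d+)-(\d+\.\d+\.\d+)\.jar$", jar_name)
    if match is None:
        raise ValueError("unrecognised jar name: %s" % jar_name)
    return match.group(1), match.group(2)

def versions_match(spark_version, jar_name):
    """True only when the jar was built for exactly this Spark version."""
    _, artifact_version = connector_version(jar_name)
    return spark_version == artifact_version

# The jar from the question, checked against Spark 2.1.0 and 2.0.0.
jar = "spark-streaming-kinesis-asl-assembly_2.11-2.0.0.jar"
print(versions_match("2.1.0", jar))  # False: the mismatch behind the error
print(versions_match("2.0.0", jar))  # True
```

A check like this could run at the top of the driver script so that a mismatch fails fast with a readable message instead of a NoSuchMethodError in an executor.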

For recent Spark releases, the Kinesis ASL assembly jar was removed because of a potential license issue [1], so you may not be able to find it. However, you can pass --packages org.apache.spark:spark-streaming-kinesis-asl_2.11:2.1.0 to spark-submit, which adds spark-streaming-kinesis-asl and its dependencies to the classpath automatically, rather than building the assembly jar yourself.
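To keep the package coordinate in step with the installed Spark version, the command line can be assembled rather than typed by hand. This is only a sketch: kinesis_package and submit_command are hypothetical helpers, the Scala version 2.11 and the 5g driver memory are taken from the question, and the resulting list could be handed to subprocess or simply printed and pasted into a shell.

```python
def kinesis_package(spark_version, scala_version="2.11"):
    """Maven coordinate of the Kinesis connector for a given Spark build."""
    return "org.apache.spark:spark-streaming-kinesis-asl_%s:%s" % (
        scala_version, spark_version)

def submit_command(script, spark_version, scala_version="2.11"):
    """Build the spark-submit argument list using --packages instead of --jars."""
    return [
        "spark-submit",
        "--packages", kinesis_package(spark_version, scala_version),
        "--driver-memory", "5g",
        script,
    ]

cmd = submit_command(
    "Question_Type_Classification_testing_purpose/classifier_streaming.py",
    "2.1.0")
print(" ".join(cmd))
```

For Spark 2.1.0 this yields the same --packages value as above, replacing the --jars option used in the question.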

[1] https://issues.apache.org/jira/browse/SPARK-17418
