
How to add Delta Lake support to Zeppelin's spark interpreter?

I'm trying to add Delta Lake support to Zeppelin.

So far I've tried adding the io.delta:delta-core_2.12:0.7.0 dependency to the spark interpreter, along with a few other related tweaks in the interpreters view, but nothing has worked.

When I add the io.delta:delta-core_2.12:0.7.0 dependency, I get errors within my notebooks such as:

org.apache.zeppelin.interpreter.InterpreterException: java.lang.NoSuchMethodError: scala.Predef$.refArrayOps([Ljava/lang/Object;)Lscala/collection/mutable/ArrayOps;
    at org.apache.zeppelin.interpreter.LazyOpenInterpreter.open(LazyOpenInterpreter.java:76)
    at org.apache.zeppelin.interpreter.remote.RemoteInterpreterServer$InterpretJob.jobRun(RemoteInterpreterServer.java:668)
    at org.apache.zeppelin.interpreter.remote.RemoteInterpreterServer$InterpretJob.jobRun(RemoteInterpreterServer.java:577)
    at org.apache.zeppelin.scheduler.Job.run(Job.java:172)
    at org.apache.zeppelin.scheduler.AbstractScheduler.runJob(AbstractScheduler.java:130)
    at org.apache.zeppelin.scheduler.FIFOScheduler.lambda$runJobInScheduler$0(FIFOScheduler.java:39)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)
Caused by: java.lang.NoSuchMethodError: scala.Predef$.refArrayOps([Ljava/lang/Object;)Lscala/collection/mutable/ArrayOps;
    at org.apache.spark.util.Utils$.stringToSeq(Utils.scala:2664)
    at org.apache.spark.internal.config.ConfigHelpers$.stringToSeq(ConfigBuilder.scala:49)
    at org.apache.spark.internal.config.TypedConfigBuilder$$anonfun$toSequence$1.apply(ConfigBuilder.scala:125)
    at org.apache.spark.internal.config.TypedConfigBuilder$$anonfun$toSequence$1.apply(ConfigBuilder.scala:125)
    at org.apache.spark.internal.config.TypedConfigBuilder.createWithDefault(ConfigBuilder.scala:143)
    at org.apache.spark.internal.config.package$.<init>(package.scala:172)
    at org.apache.spark.internal.config.package$.<clinit>(package.scala)
    at org.apache.spark.SparkConf$.<init>(SparkConf.scala:716)
    at org.apache.spark.SparkConf$.<clinit>(SparkConf.scala)
    at org.apache.spark.SparkConf.set(SparkConf.scala:95)
    at org.apache.spark.SparkConf$$anonfun$loadFromSystemProperties$3.apply(SparkConf.scala:77)
    at org.apache.spark.SparkConf$$anonfun$loadFromSystemProperties$3.apply(SparkConf.scala:76)
    at scala.collection.TraversableLike$WithFilter.$anonfun$foreach$1(TraversableLike.scala:877)
    at scala.collection.immutable.HashMap$HashMap1.foreach(HashMap.scala:234)
    at scala.collection.immutable.HashMap$HashTrieMap.foreach(HashMap.scala:468)
    at scala.collection.immutable.HashMap$HashTrieMap.foreach(HashMap.scala:468)
    at scala.collection.TraversableLike$WithFilter.foreach(TraversableLike.scala:876)
    at org.apache.spark.SparkConf.loadFromSystemProperties(SparkConf.scala:76)
    at org.apache.spark.SparkConf.<init>(SparkConf.scala:71)
    at org.apache.spark.SparkConf.<init>(SparkConf.scala:58)
    at org.apache.zeppelin.spark.SparkInterpreter.open(SparkInterpreter.java:80)
    at org.apache.zeppelin.interpreter.LazyOpenInterpreter.open(LazyOpenInterpreter.java:70)
    ... 8 more

My goal is to read from and write to Delta Lake tables using Scala + Spark.
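
For context, what I'd like to end up with is something along these lines (the table path is just a placeholder):

    import org.apache.spark.sql.SparkSession

    // In a Zeppelin notebook `spark` is already provided by the interpreter;
    // the builder call is only here to keep the sketch self-contained.
    val spark = SparkSession.builder().getOrCreate()

    // Write a small DataFrame out as a Delta table (placeholder path).
    val df = spark.range(0, 5).toDF("id")
    df.write.format("delta").mode("overwrite").save("/tmp/delta/events")

    // Read it back.
    val events = spark.read.format("delta").load("/tmp/delta/events")
    events.show()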

Thanks!

The most probable reason is that you're using Delta Lake with Spark 2.x: the package you're adding is built for Spark 3.0+ and compiled against Scala 2.12, while Spark 2.x distributions are typically built with Scala 2.11, which is why you see the NoSuchMethodError coming from scala.Predef. The latest Delta release that supports Spark 2.4 (2.4.2 at minimum) is 0.6.1 (see this answer).

So you either need to upgrade your Spark version if you want to use this specific package, or use an older version of Delta if you want to keep your current Spark installation.
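
For example, if you go the upgrade route (Spark 3.0 + io.delta:delta-core_2.12:0.7.0), Delta 0.7.0 also expects the Delta SQL extension and catalog to be configured. A minimal sketch in plain Scala is below; in Zeppelin you would normally set these two values as spark interpreter properties rather than in notebook code, and the app name is just a placeholder:

    import org.apache.spark.sql.SparkSession

    // Minimal sketch for Spark 3.0 + Delta 0.7.0.
    val spark = SparkSession.builder()
      .appName("delta-on-zeppelin")  // placeholder name
      .config("spark.sql.extensions", "io.delta.sql.DeltaSparkSessionExtension")
      .config("spark.sql.catalog.spark_catalog",
              "org.apache.spark.sql.delta.catalog.DeltaCatalog")
      .getOrCreate()

    // Quick sanity checks for the version mismatch described above:
    println(spark.version)                        // should start with 3.0 for delta-core_2.12:0.7.0
    println(scala.util.Properties.versionString)  // should report Scala 2.12.x

If you keep Spark 2.4.x instead, the dependency to add in the interpreter settings would be the Scala 2.11 build, io.delta:delta-core_2.11:0.6.1.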
