
Using Scala 2.12 with Spark 2.x

The Spark 2.1 docs mention that:

Spark runs on Java 7+, Python 2.6+/3.4+ and R 3.1+. For the Scala API, Spark 2.1.0 uses Scala 2.11. You will need to use a compatible Scala version (2.11.x).

The Scala 2.12 release notes also mention that:

Although Scala 2.11 and 2.12 are mostly source compatible to facilitate cross-building, they are not binary compatible. This allows us to keep improving the Scala compiler and standard library.

But when I build an uber JAR (using Scala 2.12) and run it on Spark 2.1, everything works just fine.

And I know it's not an official source, but on the 47 Degrees blog they mention that Spark 2.1 does support Scala 2.12.

How can one explain these (conflicting?) pieces of information?

Spark does not support Scala 2.12. You can follow SPARK-14220 (Build and test Spark against Scala 2.12) for up-to-date status.

Update: Spark 2.4 added experimental Scala 2.12 support.

Scala 2.12 is officially supported (and required) as of Spark 3. Summary:

  • Spark 2.0 - 2.3: Required Scala 2.11
  • Spark 2.4: Supported both Scala 2.11 and Scala 2.12, but not really in practice, because almost all runtimes only supported Scala 2.11.
  • Spark 3: Only Scala 2.12 is supported

Using a Spark runtime that's compiled with one Scala version and a JAR file that's compiled with another Scala version is dangerous and causes strange bugs. For example, as noted here, using a Scala 2.11 compiled JAR on a Spark 3 cluster will cause this error: java.lang.NoSuchMethodError: scala.Predef$.refArrayOps([Ljava/lang/Object;)Lscala/collection/mutable/ArrayOps.
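To see why the Scala version leaks into bytecode, note that even a plain map over an Array compiles into an invocation of scala.Predef.refArrayOps, whose return type changed between Scala 2.11 and 2.12. Since the return type is part of the JVM method descriptor, a call site compiled against 2.11 cannot resolve against a 2.12 standard library at runtime. A minimal sketch (the object name is illustrative):

```scala
// Any collection operation on an Array goes through an implicit conversion
// in scala.Predef. Under 2.11 that conversion (refArrayOps) has a different
// return type than under 2.12, so a JAR compiled against one version calls
// a method descriptor that does not exist in the other -> NoSuchMethodError.
object ArrayOpsDemo {
  def main(args: Array[String]): Unit = {
    val names: Array[String] = Array("a", "b", "c")
    // This .map on an Array is compiled via Predef.refArrayOps
    val upper = names.map(_.toUpperCase)
    println(upper.mkString(","))  // A,B,C
  }
}
```

The code itself is ordinary Scala; the incompatibility only appears when the compile-time and runtime standard libraries differ.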

Look at all the poor Spark users running into this very error.

Make sure to look into Scala cross compilation and understand the %% operator in SBT to limit your suffering. Maintaining Scala projects is hard and minimizing your dependencies is recommended.
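As a sketch of what that looks like in practice, a build.sbt along these lines (all version numbers are illustrative) lets SBT append the right _2.11/_2.12 artifact suffix automatically:

```scala
// build.sbt -- cross-building sketch; versions shown are illustrative
scalaVersion := "2.12.15"

// "sbt +compile" / "sbt +publish" builds once per listed Scala version
crossScalaVersions := Seq("2.11.12", "2.12.15")

libraryDependencies ++= Seq(
  // %% appends the Scala binary-version suffix automatically:
  // this resolves to spark-sql_2.12 (or _2.11 when cross-building)
  "org.apache.spark" %% "spark-sql" % "2.4.8" % Provided,

  // a single % uses the artifact name literally -- no suffix is added,
  // so it is only appropriate for plain Java libraries
  "org.slf4j" % "slf4j-api" % "1.7.36"
)
```

Using %% is what keeps your dependency suffixes in sync with scalaVersion, which is exactly the mismatch behind the NoSuchMethodError above.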

To add to the answer, I believe it is a typo: https://spark.apache.org/releases/spark-release-2-0-0.html has no mention of Scala 2.12.

Also, looking at the timing: Scala 2.12 was not released until November 2016, while Spark 2.0.0 was released in July 2016.

References: https://spark.apache.org/news/index.html

https://www.scala-lang.org/news/2.12.0/

Welcome to
      ____              __
     / __/__  ___ _____/ /__
    _\ \/ _ \/ _ `/ __/  '_/
   /___/ .__/\_,_/_/ /_/\_\   version 3.2.0
      /_/

Using Scala version 2.12.15 (OpenJDK 64-Bit Server VM, Java 1.8.0_312)

For the Scala API, Spark 3.2.0 uses Scala 2.12. You will need to use a compatible Scala version (2.12.x)


https://spark.apache.org/docs/latest/#:~:text=For%20the%20Scala%20API%2C%20Spark%203.2.0%20uses%20Scala%202.12.%20You%20will%20need%20to%20use%20a%20compatible%20Scala%20version%20(2.12.x)
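One way to double-check which Scala version your environment actually runs, before deploying a JAR, is to print it from the application itself. A small sketch (VersionCheck is an illustrative name, not part of any API):

```scala
// Prints the Scala library version visible at runtime, e.g. "2.12.15"
// on a Spark 3.2 cluster. Compare this against the _2.1x suffix of the
// artifacts your uber JAR was built from.
object VersionCheck {
  def main(args: Array[String]): Unit = {
    println(scala.util.Properties.versionNumberString)
  }
}
```

If the printed version's binary prefix (e.g. 2.12) does not match your JAR's suffix, you are in the dangerous mixed-version situation described above.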
