
Is there a version compatibility issue between Spark/Hadoop/Scala/Java/Python?

I'm getting an error when running the spark-shell command through cmd, and I haven't had any luck resolving it so far. I have Python/Java/Spark/Hadoop (winutils.exe)/Scala installed with the versions below:

  • Python: 3.7.3
  • Java: 1.8.0_311
  • Spark: 3.2.0
  • Hadoop (winutils.exe): 2.5x
  • Scala sbt: sbt-1.5.5.msi

I followed the steps below (summarized as a cmd sketch after the list) and ran spark-shell ( C:\Program Files\spark-3.2.0-bin-hadoop3.2\bin> ) through cmd:

  1. Create the JAVA_HOME variable: C:\Program Files\Java\jdk1.8.0_311\bin
  2. Add the following part to your path: %JAVA_HOME%\bin
  3. Create the SPARK_HOME variable: C:\spark-3.2.0-bin-hadoop3.2\bin
  4. Add the following part to your path: %SPARK_HOME%\bin
  5. The most important part: the Hadoop path should include the bin folder containing winutils.exe, as follows: C:\Hadoop\bin. Make sure winutils.exe is located inside this path.
  6. Create the HADOOP_HOME variable: C:\Hadoop
  7. Add the following part to your path: %HADOOP_HOME%\bin

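For reference, here is a rough cmd sketch of setting those variables with setx (the paths are taken from my setup above and are assumptions for any other machine; in this sketch JAVA_HOME/SPARK_HOME point at the installation roots, with \bin added only on PATH):

    rem Rough sketch only; adjust the paths to match your actual install locations
    setx JAVA_HOME "C:\Program Files\Java\jdk1.8.0_311"
    setx SPARK_HOME "C:\spark-3.2.0-bin-hadoop3.2"
    setx HADOOP_HOME "C:\Hadoop"
    rem Append the corresponding bin folders to the user PATH, then open a new cmd window
    setx PATH "%PATH%;C:\Program Files\Java\jdk1.8.0_311\bin;C:\spark-3.2.0-bin-hadoop3.2\bin;C:\Hadoop\bin"
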
Am I missing anything? I've posted my question with the error details in another thread ( spark-shell command throwing this error: SparkContext: Error initializing SparkContext ).

You went the difficult way by installing everything by hand. You may need Scala too; be extremely vigilant about the version you are installing. From your example it seems like it's Scala 2.12.

But you are right: Spark is extremely demanding in terms of version matching. Java 8 is good. Java 11 is OK too, but not any more recent version.
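If you want to double-check what your cmd session actually picks up, a quick sketch of the version checks (spark-shell prints the Scala version it was built with in its startup banner):

    java -version
    python --version
    spark-shell --version
    rem The output includes a line like "Using Scala version 2.12.x"; your own Scala install should match that line
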

Alternatively, you can:

  1. Try a very simple app like in https://github.com/jgperrin/net.jgp.books.spark.ch01
  2. Use Docker with a pre-made image; if your goal is to do Python, I would recommend an image with Jupyter and Spark preconfigured together (see the sketch after this list).
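
For example, a minimal sketch assuming the jupyter/pyspark-notebook image (just one commonly used pre-made image, not necessarily the only choice):

    rem Runs Jupyter with PySpark preinstalled; the notebook URL is printed in the console
    docker run -p 8888:8888 jupyter/pyspark-notebook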
