
Spark using different versions of Python

Trying to run Spark using PySpark, I'm getting the following error:

    An error occurred while calling z:org.apache.spark.api.python.PythonRDD.runJob.
: org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 0.0 failed 1 times, most recent failure: Lost task 0.0 in stage 0.0 (TID 0) (pi1 executor driver): org.apache.spark.api.python.PythonException: Traceback (most recent call last):
  File "/opt/spark/python/lib/pyspark.zip/pyspark/worker.py", line 473, in main
    raise Exception(("Python in worker has different version %s than that in " +
Exception: Python in worker has different version 3.9 than that in driver 3.8, PySpark cannot run with different minor versions. Please check environment variables PYSPARK_PYTHON and PYSPARK_DRIVER_PYTHON are correctly set.

spark-env.sh:

export HADOOP_HOME=/opt/hadoop
export HADOOP_CONF_DIR=$HADOOP_HOME/etc/hadoop

export PYSPARK_PYTHON=/usr/bin/python3
export PYSPARK_DRIVER_PYTHON=/usr/bin/python3

.bashrc:

export JAVA_HOME=/usr/lib/jvm/java-8-openjdk-arm64
export HADOOP_HOME=/opt/hadoop
export PATH=$PATH:$HADOOP_HOME/bin:$HADOOP_HOME/sbin
export SPARK_HOME=/opt/spark
export PATH=$PATH:$SPARK_HOME/bin
export HADOOP_CONF_DIR=$HADOOP_HOME/etc/hadoop
export LD_LIBRARY_PATH=$HADOOP_HOME/lib/native:$LD_LIBRARY_PATH
export PYSPARK_PYTHON=/usr/bin/python3
export PYSPARK_DRIVER_PYTHON=/usr/bin/python3
export HADOOP_OPTS="-Djava.library.path=$HADOOP_HOME/lib/native"

I tried specifying "python3.8" in the environment variables, but got the same result.

I don't think I even have python3.9:

find /usr/bin/ -name "*python*"

which returns:

/usr/bin/python3
/usr/bin/python3.8
/usr/bin/python3-config
/usr/bin/aarch64-linux-gnu-python3.8-config
/usr/bin/python3.8-config
/usr/bin/aarch64-linux-gnu-python3-config

My OS is Ubuntu 20.04.2 LTS.

The error means the driver and the workers are running different versions of Python. You need to fix the version by specifying it in PYSPARK_PYTHON and PYSPARK_DRIVER_PYTHON. /usr/bin/python3 is probably an alias (typically a symlink) for a specific version of Python. Replace it with the exact version you want, e.g. /usr/bin/python3.8.
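To see which version each variable actually resolves to, something like the following may help. It is only a rough sketch using the standard library: the helper names `minor_version` and `check_interpreters` are made up for illustration, and falling back to the current interpreter when a variable is unset is an assumption of this sketch, not necessarily what Spark itself does. It also assumes both variables point at a plain Python executable (not, say, a notebook launcher).

```python
import os
import subprocess
import sys


def minor_version(interpreter):
    """Ask the given Python executable for its 'major.minor' version string."""
    result = subprocess.run(
        [interpreter, "-c", "import sys; print('%d.%d' % sys.version_info[:2])"],
        check=True,
        capture_output=True,
        text=True,
    )
    return result.stdout.strip()


def check_interpreters(env=None):
    """Compare the Python versions behind PYSPARK_PYTHON and PYSPARK_DRIVER_PYTHON.

    `env` is any mapping of environment variables (defaults to os.environ).
    Unset variables fall back to sys.executable (an assumption of this sketch).
    """
    env = os.environ if env is None else env
    worker = env.get("PYSPARK_PYTHON", sys.executable)
    driver = env.get("PYSPARK_DRIVER_PYTHON", sys.executable)
    worker_version = minor_version(worker)
    driver_version = minor_version(driver)
    return {
        "worker": (worker, worker_version),
        "driver": (driver, driver_version),
        "match": worker_version == driver_version,
    }
```

Calling `check_interpreters()` from the same shell that launches the job shows the paths and versions in use there; if `"match"` is `False`, the two variables point at interpreters with different minor versions. Note that this only inspects the environment of the process it runs in, so a different environment on the worker side would not show up here.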
