Spark using different versions of Python
When trying to run Spark using pyspark, I get the following error:
An error occurred while calling z:org.apache.spark.api.python.PythonRDD.runJob.
: org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 0.0 failed 1 times, most recent failure: Lost task 0.0 in stage 0.0 (TID 0) (pi1 executor driver): org.apache.spark.api.python.PythonException: Traceback (most recent call last):
File "/opt/spark/python/lib/pyspark.zip/pyspark/worker.py", line 473, in main
raise Exception(("Python in worker has different version %s than that in " +
Exception: Python in worker has different version 3.9 than that in driver 3.8, PySpark cannot run with different minor versions. Please check environment variables PYSPARK_PYTHON and PYSPARK_DRIVER_PYTHON are correctly set.
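The check that produces this error compares only the major.minor version of the driver and worker interpreters. A minimal sketch of that kind of comparison (illustrative only, not PySpark's actual worker.py code):

```python
import sys


def check_worker_version(driver_version: str) -> None:
    """Raise if this interpreter's major.minor version differs from the driver's.

    Illustrative sketch of the kind of check PySpark's worker performs;
    not the real pyspark/worker.py code.
    """
    worker_version = "%d.%d" % sys.version_info[:2]
    if worker_version != driver_version:
        raise RuntimeError(
            "Python in worker has different version %s than that in driver %s"
            % (worker_version, driver_version)
        )


# Matching versions pass silently; a mismatch (e.g. 3.9 worker vs 3.8
# driver) raises, which is what surfaces in the Spark stack trace above.
check_worker_version("%d.%d" % sys.version_info[:2])
```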
spark-env.sh:
export HADOOP_HOME=/opt/hadoop
export HADOOP_CONF_DIR=$HADOOP_HOME/etc/hadoop
export PYSPARK_PYTHON=/usr/bin/python3
export PYSPARK_DRIVER_PYTHON=/usr/bin/python3
.bashrc:
export JAVA_HOME=/usr/lib/jvm/java-8-openjdk-arm64
export HADOOP_HOME=/opt/hadoop
export PATH=$PATH:$HADOOP_HOME/bin:$HADOOP_HOME/sbin
export SPARK_HOME=/opt/spark
export PATH=$PATH:$SPARK_HOME/bin
export HADOOP_CONF_DIR=$HADOOP_HOME/etc/hadoop
export LD_LIBRARY_PATH=$HADOOP_HOME/lib/native:$LD_LIBRARY_PATH
export PYSPARK_PYTHON=/usr/bin/python3
export PYSPARK_DRIVER_PYTHON=/usr/bin/python3
export HADOOP_OPTS="-Djava.library.path=$HADOOP_HOME/lib/native"
I tried specifying "python3.8" in the environment variables, but got the same result.
I don't think I even have Python 3.9:
find /usr/bin/ -name "*python*"
returning:
/usr/bin/python3
/usr/bin/python3.8
/usr/bin/python3-config
/usr/bin/aarch64-linux-gnu-python3.8-config
/usr/bin/python3.8-config
/usr/bin/aarch64-linux-gnu-python3-config
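To see which concrete interpreter the `python3` name actually resolves to, something like this can be run on the driver machine and on each worker node (the paths are taken from the listing above):

```shell
# Resolve the python3 symlink to the real binary it points at
readlink -f /usr/bin/python3

# Print the exact interpreter version on this machine; repeat the
# same two commands on every worker node (e.g. over ssh) and compare
/usr/bin/python3 --version
```

If the driver resolves to 3.8 but a worker such as `pi1` resolves to 3.9, that mismatch is exactly what the Spark error is reporting.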
My OS is Ubuntu 20.04.2 LTS.
The error means you have a different version of Python between your driver and the workers. You need to pin the version by specifying it in PYSPARK_PYTHON and PYSPARK_DRIVER_PYTHON. /usr/bin/python3 is probably an alias to a particular version of Python; replace it with the explicit version you want, e.g. /usr/bin/python3.8.
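Applied to the spark-env.sh shown above, a pinned configuration might look like this (assuming python3.8 exists at the same path on the driver and on every worker node):

```shell
# spark-env.sh -- point at an explicit minor version instead of the
# python3 alias, so driver and workers cannot silently diverge
export PYSPARK_PYTHON=/usr/bin/python3.8
export PYSPARK_DRIVER_PYTHON=/usr/bin/python3.8
```

Note that spark-env.sh must be updated on the worker machines too, not just the driver, and the chosen path has to exist everywhere.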