[英]How can i handle findspark.init() index error?
This is my code : 这是我的代码:
!apt-get install openjdk-8-jdk-headless -qq > /dev/null !wget -q https://www-us.apache.org/dist/spark/spark-2.4.1/spark-2.4.1-bin-hadoop2.7.tgz !tar xf spark-2.4.1-bin-hadoop2.7.tgz !pip install -q findspark
import os
os.environ["JAVA_HOME"] = "/usr/lib/jvm/java-8-openjdk-amd64"
os.environ["SPARK_HOME"] = "/content/spark-2.4.1-bin-hadoop2.7"
import findspark
findspark.init()
I have searched but i couldn't find solution,I am using findspark.init() here and it gives this error : 我已经搜索过,但找不到解决方案,我在这里使用findspark.init(),它给出了此错误:
---------------------------------------------------------------------------
IndexError Traceback (most recent call last)
<ipython-input-43-ea4df208c52b> in <module>()
1 import findspark
----> 2 findspark.init("/content/")
/usr/local/lib/python3.6/dist-packages/findspark.py in init(spark_home, python_path, edit_rc, edit_profile)
133 # add pyspark to sys.path
134 spark_python = os.path.join(spark_home, 'python')
--> 135 py4j = glob(os.path.join(spark_python, 'lib', 'py4j-*.zip'))[0]
136 sys.path[:0] = [spark_python, py4j]
137
IndexError: list index out of range
We need to use actual versions. 我们需要使用实际版本。 So the code will be like :
因此,代码将类似于:
!apt-get install openjdk-8-jdk-headless -qq > /dev/null
!wget -q https://www-us.apache.org/dist/spark/spark-2.4.3/spark-2.4.3-bin-hadoop2.7.tgz
!tar xf spark-2.4.3-bin-hadoop2.7.tgz
!pip install -q findspark
import os
os.environ["JAVA_HOME"] = "/usr/lib/jvm/java-8-openjdk-amd64"
os.environ["SPARK_HOME"] = "/content/spark-2.4.3-bin-hadoop2.7"
import findspark
findspark.init()
from pyspark.sql import SparkSession
spark = SparkSession.builder.master("local[*]").getOrCreate()
声明:本站的技术帖子网页,遵循CC BY-SA 4.0协议,如果您需要转载,请注明本站网址或者原文地址。任何问题请咨询:yoyou2525@163.com.