
Error while running first PySpark program in Jupyter

I am a beginner in PySpark, trying to execute a few lines of code in a Jupyter notebook. I followed the (fairly old) instructions at https://changhsinlee.com/install-pyspark-windows-jupyter/ to configure PySpark after installing Python 3.8.5, Java (JDK 16), and spark-3.1.1-bin-hadoop2.7.

Below are the lines that executed successfully after installation; an exception is thrown at df.show(). I have added all the necessary environment variables. Please help me resolve this.

pip install pyspark

pip install findspark

import findspark

findspark.init()

import pyspark

from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

df = spark.sql('''select 'spark' as hello ''')

df.show()  # exception is thrown here

I have added the error in the comments section.

Note: I am a beginner in Python and have no Java knowledge.

I had to change the Java version to Java 11. It works now. Spark 3.1.1 only supports Java 8 and Java 11, so the JDK 16 installed earlier was the cause of the exception.
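For reference, here is a minimal sketch of the working setup. It assumes Java 11 is installed at C:\Program Files\Java\jdk-11 (a hypothetical path; replace it with your own install directory). Setting JAVA_HOME from Python before calling findspark.init() makes the Spark launcher pick up Java 11 instead of JDK 16, because the environment variable is inherited by the JVM subprocess that PySpark starts:

import os

# Assumption: Java 11 is installed here; adjust to your machine.
os.environ["JAVA_HOME"] = r"C:\Program Files\Java\jdk-11"

import findspark
findspark.init()

from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

df = spark.sql('''select 'spark' as hello ''')
df.show()

With a compatible JDK, df.show() prints a one-row table:

+-----+
|hello|
+-----+
|spark|
+-----+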
