Unable to run PySpark in Jupyter Notebook - Linux

I'm trying to run PySpark in a Jupyter Notebook on a server that is not connected to the internet. I installed PySpark and Java from local conda packages using the following:

conda install pyspark-3.3.0-pyhd8ed1ab_0.tar.bz2
conda install openjdk-8.0.332-h166bdaf_0.tar.bz2

When I run !java -version in my notebook, I get

openjdk version "1.8.0_332"
OpenJDK Runtime Environment (Zulu 8.62.0.19-CA-linux64) (build 1.8.0_332-b09)
OpenJDK 64-Bit Server VM (Zulu 8.62.0.19-CA-linux64) (build 25.332-b09, mixed mode)

When I run !which java , I get

/root/anaconda3/bin/java

My code is as follows.

import os
os.environ['SPARK_HOME'] = "/root/anaconda3/pkgs/pyspark-3.3.0-pyhd8ed1ab_0/site_packages/pyspark"
os.environ['JAVA_HOME'] = "/root/anaconda3"
os.environ['PYSPARK_SUBMIT_ARGS'] = "--master local[2] pyspark-shell"

from pyspark import SparkConf, SparkContext
conf = SparkConf().set('spark.driver.host','127.0.0.1')
sc = SparkContext(master='local', appName='Test', conf=conf)

The error I got was the following (only a snippet, because I'm typing it in manually):

Exception in thread "main" java.lang.ExceptionInInitializerError
    at org.apache.spark.deploy.SparkSubmitArguments.$anonfun$loadEnvironmentArguments$3(SparkSubmitArguments.scala:157)
    ...
Caused by: java.net.UnknownHostException: abc: abc: Name or service not known
...
Caused by: java.net.UnknownHostException: abc: Name or service not known
...

RuntimeError: Java gateway process exited before sending its port number

"abc" is my server's hostname. What am I missing here?

I found out what the problem was.

Based on the error message java.net.UnknownHostException: abc: abc: Name or service not known, I suspected that Java could not resolve my server's hostname, abc. So I added abc to /etc/hosts on the line for the loopback IP 127.0.0.1, and now I can run PySpark.
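
For anyone hitting the same error, here is a minimal sketch of how to check from the notebook whether the machine's own hostname resolves before starting Spark. The helper name hostname_resolves and its resolver parameter are my own, not part of PySpark; the check only uses Python's standard socket module, so it approximates the lookup that failed inside the JVM rather than reproducing it exactly.

```python
import socket


def hostname_resolves(name, resolver=socket.gethostbyname):
    """Return True if `name` resolves to an IP address, False otherwise.

    `resolver` is a hypothetical hook so the lookup can be swapped out;
    by default it uses the system resolver (which consults /etc/hosts).
    """
    try:
        resolver(name)
        return True
    except OSError:
        # socket.gaierror ("Name or service not known") is a subclass of OSError
        return False


# Check the hostname this machine reports for itself ("abc" in my case)
hostname = socket.gethostname()
print(hostname, "resolves:", hostname_resolves(hostname))
```

If this prints False, the hostname is missing from name resolution. In my case the fix was an /etc/hosts entry along the lines of "127.0.0.1   localhost abc" (the exact contents of that line on your system may differ), after which the check passes and the SparkContext starts.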
