
Pyspark - Error related to SparkContext - no attribute _jsc

I'm unsure what the issue is here. I've seen similar questions about this problem, but nothing that solves my issue. Full error:

Traceback (most recent call last):
  File "C:/Users/computer/PycharmProjects/spark_test/spark_test/test.py", line 4, in <module>
    sqlcontext = SQLContext(sc)
  File "C:\Users\computer\AppData\Local\Programs\Python\Python36\lib\site-packages\pyspark\sql\context.py", line 74, in __init__
    self._jsc = self._sc._jsc
AttributeError: type object 'SparkContext' has no attribute '_jsc'

Here is the simple code I am trying to run:

from pyspark import SQLContext
from pyspark.context import SparkContext as sc

sqlcontext = SQLContext(sc)

df = sqlcontext.read.json('random.json')

If you are using the Spark shell, you will notice that a SparkContext is already created for you and exposed as the variable sc.
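
For example, inside the shell started with the pyspark command, a minimal sketch (reading the same random.json file as in your code) could look like this:

# Inside the pyspark shell the SparkContext already exists as `sc`,
# so it can be passed straight to SQLContext without creating one.
from pyspark.sql import SQLContext

sqlcontext = SQLContext(sc)
df = sqlcontext.read.json('random.json')
df.show()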

Otherwise, you can create the SparkContext yourself by importing it, providing the configuration settings and initializing it. In your case you passed the SparkContext class itself to SQLContext rather than an initialized instance, which is why the instance attribute _jsc does not exist:

import pyspark
from pyspark.sql import SQLContext

conf = pyspark.SparkConf()
# conf.set('spark.app.name', app_name)  # Optional configuration

# Initialize (or reuse) a SparkContext with this configuration and pass the
# instance -- not the class -- to SQLContext
sc = pyspark.SparkContext.getOrCreate(conf=conf)
sqlcontext = SQLContext(sc)

df = sqlcontext.read.json('random.json')
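
As a side note, on Spark 2.0 and later the usual entry point is SparkSession, which creates and wraps the SparkContext and SQLContext for you. A minimal sketch reading the same file (the app name spark_test is just a placeholder) could be:

from pyspark.sql import SparkSession

# The builder creates (or reuses) the underlying SparkContext automatically
spark = SparkSession.builder.appName('spark_test').getOrCreate()

df = spark.read.json('random.json')
df.show()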
