Is there a way to rewrite this code so that it runs under PySpark 1.6.2 rather than 2.0.0? The problem is that SparkSession does not exist in Spark 1.6.2.
from pyspark import SparkConf
from pyspark.sql import SparkSession
from pyspark.sql.types import StructType, StructField, StringType

cfg = SparkConf().setAppName('s')
spark = SparkSession.builder.enableHiveSupport().config(conf=cfg).getOrCreate()
df = spark.createDataFrame([], schema=StructType([StructField('id', StringType()),
                                                  StructField('pk', StringType()),
                                                  StructField('le', StringType()),
                                                  StructField('or', StringType())]))
For older versions of Spark (before 2.0), you can use HiveContext instead of SparkSession; see the relevant documentation. A small example of setting up the environment:
from pyspark import SparkConf, SparkContext
from pyspark.sql import HiveContext

conf = SparkConf().setAppName('s')
sc = SparkContext(conf=conf)
sqlContext = HiveContext(sc)
After this you can create the DataFrame in the same way as before, using the sqlContext variable.