
How to create an empty dataframe in Spark 1.6.2 given an example of Spark 2.0.0?

Is there a way to rewrite this code so that it runs on PySpark 1.6.2 rather than 2.0.0? The problem is that SparkSession does not exist in Spark 1.6.2.

from pyspark import SparkConf
from pyspark.sql import SparkSession
from pyspark.sql.types import StructType, StructField, StringType

cfg = SparkConf().setAppName('s')
spark = SparkSession.builder.enableHiveSupport().config(conf=cfg).getOrCreate()
df = spark.createDataFrame([], schema=StructType([StructField('id', StringType()),
                                                  StructField('pk', StringType()),
                                                  StructField('le', StringType()),
                                                  StructField('or', StringType())]))

For older versions of Spark (before 2.0), you can use HiveContext instead of SparkSession; see the relevant documentation. A small example of setting up the environment:

from pyspark import SparkConf, SparkContext
from pyspark.sql import HiveContext

conf = SparkConf().setAppName('s')
sc = SparkContext(conf=conf)
sqlContext = HiveContext(sc)

After this, you can create the dataframe the same way as before, using the sqlContext variable in place of spark.
