
How to create an empty dataframe in Spark 1.6.2 given an example of Spark 2.0.0?

Is there a way to rewrite this code so that it runs on PySpark 1.6.2 rather than 2.0.0? The problem is that SparkSession does not exist in Spark 1.6.2.

from pyspark import SparkConf
from pyspark.sql import SparkSession
from pyspark.sql.types import StructType, StructField, StringType

cfg = SparkConf().setAppName('s')
spark = SparkSession.builder.enableHiveSupport().config(conf=cfg).getOrCreate()
df = spark.createDataFrame([], schema=StructType([StructField('id', StringType()),
                                                  StructField('pk', StringType()),
                                                  StructField('le', StringType()),
                                                  StructField('or', StringType())]))

For older versions of Spark (before 2.0), you can use HiveContext instead of SparkSession; see the relevant documentation. A small example of setting up the environment:

from pyspark import SparkConf, SparkContext
from pyspark.sql import HiveContext

conf = SparkConf().setAppName('s')
sc = SparkContext(conf=conf)
sqlContext = HiveContext(sc)

After this, you can create the dataframe the same way as before, using the sqlContext variable in place of spark.
