
How can I use SparkContext?

I assign the value as sc = pyspark.SparkContext(). It runs but doesn't respond for a long time in a Jupyter notebook: the asterisk sign appears and no error or other output is shown.

I tried sc = SparkContext()

import pyspark
import os
from pyspark import SparkContext, SparkConf
sc = pyspark.SparkContext()  # At this point it doesn't respond
from pyspark.sql import SQLContext
sqlc = SQLContext(sc)

Execution should continue past this line.

For Python,

from pyspark import SparkContext
sc = SparkContext(appName="test")
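
As a quick sanity check that the context works, you can run a small job on it (a minimal sketch continuing from the sc above; the sample data is just an illustration):

rdd = sc.parallelize([1, 2, 3, 4])  # distribute a small local collection
print(rdd.sum())  # should print 10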

But since you're working on PySpark version 2+, you don't need to initialize a SparkContext yourself. You can create a SparkSession and work with it directly.

From Spark 2.0.0 onwards, SparkSession provides a single point of entry for interacting with underlying Spark functionality and allows programming Spark with the DataFrame and Dataset APIs. All the functionality available through sparkContext is also available through SparkSession.

To use the SQL, Hive, and Streaming APIs, there is no need to create separate contexts, as SparkSession includes all of them.
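
For example, the underlying context is still reachable through the session, and SQL runs on the session directly (a minimal sketch; the query is only an illustration):

from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()
spark.sparkContext.parallelize([1, 2, 3])  # the old SparkContext API, via the session
spark.sql("SELECT 1 AS x").show()  # the SQL API, no separate SQLContext needed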

To create a Spark session:

from pyspark.sql import SparkSession

session = SparkSession.builder.getOrCreate()
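
If you want to name the application or force local execution, the builder accepts those options as well (a sketch; the appName and master values are assumptions for a local run, not something specified above):

from pyspark.sql import SparkSession

session = SparkSession.builder \
    .appName("test") \
    .master("local[*]") \
    .getOrCreate()  # assumption: local mode using all cores

df = session.createDataFrame([(1, "a"), (2, "b")], ["id", "value"])
df.show()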

Try the following import:

from pyspark import *

After that, you can use it like this:

sc = SparkContext()
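
Note that only one SparkContext can be active per JVM, so if a context is already running in the notebook, calling SparkContext() again raises an error. Reusing the existing one is safer (a minimal sketch):

from pyspark import SparkContext

sc = SparkContext.getOrCreate()  # reuses the running context if one exists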
