CSV format is not loading in spark-shell

Using Spark 1.6, I tried the following code:

val diamonds = spark.read.format("csv").option("header", "true").option("inferSchema", "true").load("/got_own/com_sep_fil.csv")

which caused the error:

error: not found: value spark

In the Spark 1.6 shell you get sc of type SparkContext, not spark of type SparkSession. If you want that functionality, you will need to instantiate a SQLContext:

import org.apache.spark.sql._
val spark = new SQLContext(sc)

Alternatively, sqlContext is an implicit SQLContext object that the shell already creates for you; it can be used to load the CSV file. Specify com.databricks.spark.csv as the file format:

val df = sqlContext.read.format("csv").option("header", "true").option("inferSchema", "true").load("data.csv")

You need to initialize an instance of SQLContext (Spark version < 2.0) or SparkSession (Spark version >= 2.0) to use the DataFrame reader methods provided by Spark.

To initialize spark instance for spark version < 2.0 use:

import org.apache.spark.sql._
val spark = new SQLContext(sc)

To initialize spark instance for spark version >= 2.0 use:

val spark = new SparkConf().setAppName("SparkSessionExample").setMaster("local")
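
In Spark 2.0 and later the CSV data source is built in, so the snippet from the question works unchanged once the session exists:

val diamonds = spark.read.format("csv").option("header", "true").option("inferSchema", "true").load("/got_own/com_sep_fil.csv")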

To read the CSV using Spark 1.6 and the Databricks spark-csv package:

val df = sqlContext.read.format("com.databricks.spark.csv").option("header", "true").option("inferSchema", "true").load("data.csv")
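
Note that spark-csv is not bundled with Spark 1.6. A common way to make it available, assuming a Scala 2.10 build and package version 1.5.0 (adjust the coordinates to your build), is to launch the shell with the package:

spark-shell --packages com.databricks:spark-csv_2.10:1.5.0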
