
Schema for type org.apache.spark.sql.types.DataType is not supported

I am trying to create an empty DataFrame with a schema:

  val sparkConf = new SparkConf()
    .setAppName("app")
    .setMaster("local")

  val sparkSession = SparkSession
    .builder()
    .config(sparkConf)
    .getOrCreate()

  val sparkContext = sparkSession.sparkContext

  var tmpScheme = StructType(
    StructField("source_id", StringType, true) :: Nil)

  var df = conf.SparkConf.sparkSession.createDataFrame(tmpScheme)

and got: Schema for type org.apache.spark.sql.types.DataType is not supported.

I don't understand why: DataType does not even appear in my imports:

import org.apache.spark.sql.types.{BooleanType, IntegerType, StringType, StructField, StructType}

What could be the problem here?

PS: Spark version:

  "org.apache.spark" %% "spark-sql" % "3.2.2", // spark
  "org.apache.spark" %% "spark-core" % "3.2.2", // spark

If you check the documentation, you will see that the fields argument of StructType is of type Array[StructField], whereas you are passing a List of StructField.

This means you should wrap your StructField in an Array, for example:

val simpleSchema = StructType(Array(
  StructField("source_id", StringType, true)
))
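
For reference, here is a minimal sketch of building an empty DataFrame from such a schema. createDataFrame needs data as well as a schema, so an empty RDD[Row] is passed alongside it; the spark and emptyDf names are just illustrative:

import org.apache.spark.sql.{Row, SparkSession}
import org.apache.spark.sql.types.{StringType, StructField, StructType}

// Illustrative local session; reuse your own SparkSession in real code.
val spark = SparkSession.builder()
  .appName("app")
  .master("local")
  .getOrCreate()

val simpleSchema = StructType(Array(
  StructField("source_id", StringType, true)
))

// createDataFrame expects rows plus a schema; an empty RDD[Row]
// yields an empty DataFrame with the desired columns.
val emptyDf = spark.createDataFrame(spark.sparkContext.emptyRDD[Row], simpleSchema)

// Should print a single nullable source_id column of type string.
emptyDf.printSchema()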

Good luck!
