

Schema for type org.apache.spark.sql.types.DataType is not supported

I am trying to create an empty DataFrame with a schema:

  val sparkConf = new SparkConf()
    .setAppName("app")
    .setMaster("local")

  val sparkSession = SparkSession
    .builder()
    .config(sparkConf)
    .getOrCreate()

  val sparkContext = sparkSession.sparkContext

  var tmpScheme = StructType(
    StructField("source_id", StringType, true) :: Nil)

  var df = sparkSession.createDataFrame(tmpScheme)

and got: Schema for type org.apache.spark.sql.types.DataType is not supported

I don't understand why — there is no DataType anywhere in my imports:

import org.apache.spark.sql.types.{BooleanType, IntegerType, StringType, StructField, StructType}

What could be the problem here?

PS: Spark version:

  "org.apache.spark" %% "spark-sql" % "3.2.2", // spark
  "org.apache.spark" %% "spark-core" % "3.2.2", // spark

If you check the documentation, you can see that the fields argument of StructType is of type Array[StructField], while you are passing a StructField.

This means that you should wrap your StructField in an Array, for example:

val simpleSchema = StructType(Array(
  StructField("source_id", StringType, true)
))
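Since the original goal was an empty DataFrame, note that createDataFrame does not accept a schema by itself — it needs data plus a schema. A common pattern is to pair the schema with an empty RDD[Row]; a minimal sketch, assuming a local Spark 3.2.x session (the app name here is just a placeholder):

```scala
import org.apache.spark.sql.{Row, SparkSession}
import org.apache.spark.sql.types.{StringType, StructField, StructType}

val spark = SparkSession.builder()
  .appName("empty-df-example") // hypothetical app name
  .master("local[1]")
  .getOrCreate()

val schema = StructType(Array(
  StructField("source_id", StringType, true)
))

// An empty RDD[Row] combined with the schema yields a DataFrame
// that has the desired column but zero rows.
val df = spark.createDataFrame(spark.sparkContext.emptyRDD[Row], schema)

df.printSchema()
```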

Good luck!

