

Schema for type org.apache.spark.sql.types.DataType is not supported

I am trying to create an empty DataFrame with a schema:

  val sparkConf = new SparkConf()
    .setAppName("app")
    .setMaster("local")

  val sparkSession = SparkSession
    .builder()
    .config(sparkConf)
    .getOrCreate()

  val sparkContext = sparkSession.sparkContext

  var tmpScheme = StructType(
    StructField("source_id", StringType, true) :: Nil)

  var df = sparkSession.createDataFrame(tmpScheme)

and got: Schema for type org.apache.spark.sql.types.DataType is not supported

I don't understand why — there is no DataType anywhere in my imports:

import org.apache.spark.sql.types.{BooleanType, IntegerType, StringType, StructField, StructType}

What could be the problem here?

PS: Spark version:

  "org.apache.spark" %% "spark-sql" % "3.2.2", // spark
  "org.apache.spark" %% "spark-core" % "3.2.2", // spark

If you check the documentation, you can see that the fields argument of StructType is of type Array[StructField], while you are passing a StructField.

This means that you should wrap your StructField in an Array, for example:

val simpleSchema = StructType(Array(
  StructField("source_id", StringType, true)
))
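Since the original goal was an empty DataFrame, note that createDataFrame does not accept a schema by itself — it needs data plus a schema. A common pattern is to pair the schema with an empty RDD[Row]; a minimal sketch, assuming a local Spark 3.2.x session (the app name here is just a placeholder):

```scala
import org.apache.spark.sql.{Row, SparkSession}
import org.apache.spark.sql.types.{StringType, StructField, StructType}

val spark = SparkSession.builder()
  .appName("empty-df-example") // hypothetical app name
  .master("local[1]")
  .getOrCreate()

val schema = StructType(Array(
  StructField("source_id", StringType, true)
))

// An empty RDD[Row] combined with the schema yields a DataFrame
// that has the desired column but zero rows.
val df = spark.createDataFrame(spark.sparkContext.emptyRDD[Row], schema)

df.printSchema()
```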

Good luck!

