
org.apache.spark.sql.types.DataTypeException: Unsupported dataType: IntegerType

I am new to Spark and Scala and I am stuck on this exception. I am trying to add some extra fields (i.e. StructField) to an existing StructType, retrieved from a DataFrame column's schema using Spark SQL, and I am getting the exception below.

Code snippet:

val dfStruct:StructType=parquetDf.select("columnname").schema
dfStruct.add("newField","IntegerType",true)

Exception in thread "main"

 org.apache.spark.sql.types.DataTypeException: Unsupported dataType: IntegerType. If you have a struct and a field name of it has any special characters, please use backticks (`) to quote that field name, e.g. `x+y`. Please note that backtick itself is not supported in a field name.
    at org.apache.spark.sql.types.DataTypeParser$class.toDataType(DataTypeParser.scala:95)
    at org.apache.spark.sql.types.DataTypeParser$$anon$1.toDataType(DataTypeParser.scala:107)
    at org.apache.spark.sql.types.DataTypeParser$.parse(DataTypeParser.scala:111)

I can see there are some open issues on JIRA related to this exception, but I am not able to understand much from them. I am using Spark version 1.5.1.

https://mail-archives.apache.org/mod_mbox/spark-issues/201508.mbox/%3CJIRA.12852533.1438855066000.143133.1440397426473@Atlassian.JIRA%3E

https://issues.apache.org/jira/browse/SPARK-9685

When you use StructType.add with the following signature:

add(name: String, dataType: String, nullable: Boolean)

the dataType string should correspond to either .simpleString or .typeName. For IntegerType it is either int:

import org.apache.spark.sql.types._

IntegerType.simpleString
// String = int

or integer:

IntegerType.typeName
// String = integer

so what you need is something like this:

val schema = StructType(Nil)

schema.add("foo", "int", true)
// org.apache.spark.sql.types.StructType = 
//   StructType(StructField(foo,IntegerType,true))

or

schema.add("foo", "integer", true)
// org.apache.spark.sql.types.StructType = 
//   StructType(StructField(foo,IntegerType,true))

If you want to pass IntegerType it has to be a DataType, not a String:

schema.add("foo", IntegerType, true)
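One more thing worth noting (a side observation about the standard Spark API, not something the original answer spells out): StructType is immutable, so add returns a new StructType rather than modifying the one it is called on. The result has to be captured, which also matters for the dfStruct.add(...) call in the question, whose return value is discarded. A minimal sketch:

```scala
import org.apache.spark.sql.types.{IntegerType, StructType}

// StructType is immutable: add returns a new StructType and leaves
// the original untouched, so the result must be assigned.
val schema = StructType(Nil)
val withFoo = schema.add("foo", IntegerType, nullable = true)

assert(schema.isEmpty)                          // original schema unchanged
assert(withFoo("foo").dataType == IntegerType)  // new schema has the field
```

So even with a valid dataType argument, dfStruct.add(...) on its own will appear to do nothing; assign the returned StructType to a new val instead.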
