
Spark createDataFrame cannot infer schema - default data types?

Creating a Spark DataFrame in Databricks with createDataFrame results in the error: 'Some of types cannot be determined after inferring'.

I know I can specify the schema, but that doesn't help when I create the DataFrame each time from source data returned by an API and the provider decides to restructure it.

Instead, I'd like to tell Spark to use 'string' for any column whose data type cannot be inferred.

Is this possible?

Thanks
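A minimal sketch of the string fallback asked about here, assuming the API response is a list of Python dicts. This error typically appears when a column holds only None values in the rows Spark inspects, so the sketch builds the schema by hand instead of relying on a Spark setting. The names `infer_with_string_fallback` and `to_dataframe` are hypothetical helpers, not part of PySpark, and the type mapping is deliberately small. Columns that are all None, hold mixed types, or hold a type outside the mapping are declared as string and their values are converted with `str()`.

```python
# Python type -> Spark SQL type name. Intentionally minimal (assumption);
# extend it for dates, decimals, nested structures, and so on.
_SPARK_TYPES = {bool: "boolean", int: "bigint", float: "double", str: "string"}


def infer_with_string_fallback(records):
    """Return (schema_string, rows) for a list of dicts.

    A column gets a concrete type only when every non-None value has the
    same Python type and that type is in _SPARK_TYPES. Everything else,
    including columns that are entirely None, falls back to string.
    """
    # Collect column names in first-seen order across all records.
    columns = []
    for rec in records:
        for key in rec:
            if key not in columns:
                columns.append(key)

    fields = []
    for col in columns:
        seen = {type(rec.get(col)) for rec in records if rec.get(col) is not None}
        spark_type = _SPARK_TYPES.get(next(iter(seen))) if len(seen) == 1 else None
        fields.append((col, spark_type or "string"))

    # Build tuples in column order; stringify values in string columns so
    # they pass Spark's type verification.
    rows = []
    for rec in records:
        row = []
        for col, spark_type in fields:
            value = rec.get(col)
            if spark_type == "string" and value is not None and not isinstance(value, str):
                value = str(value)
            row.append(value)
        rows.append(tuple(row))

    # Assumes simple column names (no spaces or special characters).
    schema_string = ", ".join(f"{col}: {t}" for col, t in fields)
    return schema_string, rows


def to_dataframe(spark, records):
    """Create a DataFrame from `records` using an existing SparkSession."""
    schema_string, rows = infer_with_string_fallback(records)
    return spark.createDataFrame(rows, schema=schema_string)
```

In a Databricks notebook this would be called as `to_dataframe(spark, records)`. Because the schema is rebuilt from each payload, a restructured API response changes the resulting columns rather than raising the inference error, at the cost of column types possibly differing between runs.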

This can be handled with schema evolution in the Delta format. Quick ref: https://databricks.com/blog/2019/09/24/diving-into-delta-lake-schema-enforcement-evolution.html
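As a rough illustration of schema evolution on write, the sketch below appends a DataFrame to a Delta table with the `mergeSchema` option enabled, so that columns present in the DataFrame but missing from the table are added to the table schema. The function name `append_with_schema_evolution` and the `path` argument are hypothetical; the DataFrame is assumed to have been created already and the Delta Lake library is assumed to be available, as it is on Databricks. Schema evolution applies at write time, so the DataFrame still needs a schema that Spark can determine before it is written.

```python
def append_with_schema_evolution(df, path):
    """Append `df` to the Delta table at `path`, allowing new columns.

    `df` is an existing Spark DataFrame; `path` is a placeholder table
    location supplied by the caller.
    """
    (
        df.write
        .format("delta")
        .mode("append")
        .option("mergeSchema", "true")  # add new columns instead of failing
        .save(path)
    )
```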

