
Compare spark schema from Dataframe to type T

I am trying to add some runtime type checks when writing a Spark DataFrame. Basically, I want to make sure that the DataFrame schema is compatible with a type T; compatible doesn't mean it has to be exactly the same. Here is my code:

def save[T: Encoder](dataframe: DataFrame, url: String): Unit = {
  val encoder = implicitly[Encoder[T]]
  assert(dataframe.schema == encoder.schema, s"Unable to save: schemas don't match")

  dataframe.write.parquet(url)
}

Currently I am checking that the schemas are equal. How could I instead check that the DataFrame's schema is compatible with the type T?

By compatible I mean that executing dataframe.as[T] would work (but I don't want to actually execute that, because it is quite expensive).

Create an empty DataFrame with the same schema and call .as[T] on it. If that succeeds, the schema should be compatible!
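The suggestion above can be sketched as follows. This is a minimal sketch, not the original poster's code: the implicit `SparkSession` parameter and the placement inside `save` are my assumptions. It relies on the fact that `Dataset.as[T]` resolves the deserializer against the schema eagerly, so an incompatible schema raises an `AnalysisException` without materialising any rows.

```scala
import org.apache.spark.sql.{DataFrame, Encoder, Row, SparkSession}

// Sketch: check schema compatibility with T by calling .as[T] on an
// empty DataFrame that carries the same schema. The `spark` parameter
// is an assumption added for this example.
def save[T: Encoder](dataframe: DataFrame, url: String)(implicit spark: SparkSession): Unit = {
  // Empty DataFrame with the same schema: no rows, so this is cheap.
  val empty = spark.createDataFrame(
    spark.sparkContext.emptyRDD[Row],
    dataframe.schema
  )

  // Throws org.apache.spark.sql.AnalysisException if the schema is
  // not compatible with T (e.g. a required field is missing).
  empty.as[T]

  dataframe.write.parquet(url)
}
```

Note that this accepts schemas that are compatible but not identical, e.g. a DataFrame with extra columns beyond the fields of T, which the original equality check would have rejected.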

