
Apache Spark Scala with Play-json Validation

java.lang.UnsupportedOperationException: Schema for type [trait object] is not supported

import org.apache.spark.sql.SparkSession
import play.api.libs.json._

trait Container {
  def aa: String
  def bb: Int
}

case class First(aa: String, bb: Int) extends Container
case class Second(aa: String, bb: Int) extends Container

implicit val aaContainerFormat: Format[First] = Json.format[First]

implicit val bbContainerFormat: Format[Second] = Json.format[Second]

implicit def nodeContainerReads: Reads[Container] =
  try {
    Json.format[First].map(x => x: Container) or
    Json.format[Second].map(x => x: Container)
  } catch {
    case e: Exception => Reads {
      case _ => JsError(JsonValidationError("Cannot De-serialize value."))
    }
  }

implicit def nodeContainerWrites: Writes[Container] = new Writes[Container] {
  override def writes(node: Container): JsValue = node match {
    case a: First => Json.toJson(a)
    case b: Second => Json.toJson(b)
    case _ => Json.obj("error" -> "wrong Json")
  }
}

// Example Usage....
val spark: SparkSession = SparkSession.builder.appName("Unit Test").getOrCreate()
val js: Container = First("unit", 1)

spark.createDataFrame(Seq(js))

I expect the output to be a Dataset of Container objects, but the actual output is java.lang.UnsupportedOperationException: Schema for type Container is not supported.

Spark does not use typeclasses from Play JSON to convert Scala types into Spark SQL types. Instead you need to look at Spark Encoders, which form the basis of the conversion from Scala types to Spark types. If you have the SparkSession in scope, you can use import sparkSession.implicits._ so that encoders are created automatically for your case classes. Spark does not support sum types out of the box, so you would need to implement your own Encoder to model them in Spark in an ad-hoc fashion. Please read here for more information if you want to encode sum types in Spark.
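For illustration, here is a minimal sketch of what that looks like, reusing the Container, First and Second definitions from the question. The master("local[*]") setting and the choice of a Kryo-based encoder for the trait are assumptions added only to make the snippet self-contained and runnable; they are not part of the original post.

import org.apache.spark.sql.{Encoder, Encoders, SparkSession}

object EncoderSketch {
  // Definitions from the question.
  trait Container { def aa: String; def bb: Int }
  case class First(aa: String, bb: Int) extends Container
  case class Second(aa: String, bb: Int) extends Container

  def main(args: Array[String]): Unit = {
    // master("local[*]") is only here so the sketch runs standalone.
    val spark = SparkSession.builder.appName("Unit Test").master("local[*]").getOrCreate()
    import spark.implicits._

    // Case classes: spark.implicits._ derives their Encoders automatically,
    // so a Dataset[First] works out of the box.
    val firsts = Seq(First("unit", 1), First("test", 2)).toDS()
    firsts.show()

    // The trait (a sum type) has no built-in Encoder. One ad-hoc option is a
    // Kryo-based encoder; each row is then stored as a single opaque binary column.
    implicit val containerEncoder: Encoder[Container] = Encoders.kryo[Container]
    val containers = Seq[Container](First("unit", 1), Second("case", 2)).toDS()
    containers.show()

    spark.stop()
  }
}

Note that with the Kryo encoder you lose the columnar schema: aa and bb collapse into one binary field, so you can no longer select or filter on them directly in Spark SQL.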
