How can I create a Spark schema from a trait? Consider the trait:
trait A{
val name:String
val size:String
}
Calling:
Encoders.product[A].schema
gives:
Error:type arguments do not conform to method product's type parameter bounds [T <: Product]
Also, the number of fields (more than 200) will exceed the limit on case class parameters.
Case classes do support more than 22 fields (since Scala 2.11); try defining the case class at the top level, outside any other class or object. If you need to create a DataFrame schema with a large number of fields, building the StructType directly should work:
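import org.apache.spark.sql.Row
import org.apache.spark.sql.types.{StringType, StructField, StructType}
// Assumes an existing SparkSession named spark.
val schema: StructType = StructType(
  Array(
    StructField("name", StringType),
    StructField("size", StringType)
  )
)
// Each Row must supply values in the same order as the schema fields.
val data = Seq(Row("Ramanan", "29"))
spark.createDataFrame(spark.sparkContext.parallelize(data), schema).show()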
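When the schema has hundreds of fields, writing each StructField by hand becomes tedious. A minimal sketch of building the same kind of schema programmatically, assuming every column is a nullable string and that the column names are available as a sequence (fieldNames and the col_N names below are hypothetical placeholders, not part of the original question):

```scala
import org.apache.spark.sql.types.{StringType, StructField, StructType}

// Hypothetical column names; in practice these might come from a config file or a header row.
val fieldNames: Seq[String] = Seq("name", "size") ++ (1 to 250).map(i => s"col_$i")

// Build the schema from the names. No case class is involved,
// so no constructor-parameter limit applies.
val wideSchema: StructType =
  StructType(fieldNames.map(n => StructField(n, StringType, nullable = true)))

println(wideSchema.fields.length) // 252
```

The resulting wideSchema can be passed to spark.createDataFrame in the same way as the hand-written schema.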
I cannot explain in full detail why this does not work, but I can propose a slightly different solution that we frequently use in our Scala Spark projects.
The signature of Encoders.product looks like:
product[T <: scala.Product](implicit evidence$5 : scala.reflect.runtime.universe.TypeTag[T])
which means it expects a class that extends the Product trait, plus an implicit TypeTag.
Instead of a trait, you could create a case class, since case classes extend Product (and Serializable) automatically.
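The type bound can be illustrated without Spark. The sketch below is only an illustration: ARecord and arity are hypothetical names introduced here, and arity stands in for any method with a [T <: Product] bound such as Encoders.product. A case class that extends the trait satisfies the bound, while the bare trait does not.

```scala
trait A {
  val name: String
  val size: String
}

// Hypothetical case class implementing the trait; case classes extend Product automatically.
final case class ARecord(name: String, size: String) extends A

// Stand-in for a method with the same bound as Encoders.product.
def arity[T <: Product](value: T): Int = value.productArity

val rec = ARecord("Ramanan", "29")
println(arity(rec)) // 2

// The following would not compile, because an anonymous A is not a Product:
// arity(new A { val name = "x"; val size = "y" })
```

Keeping the trait and adding a case class that implements it lets existing code continue to depend on A while Spark works with the concrete ARecord type.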
In order to get a schema you could do:
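import org.apache.spark.sql.Encoders
import scala.reflect.runtime.universe.TypeTag
// Case class parameters are public vals by default, so no explicit val is needed.
case class A(name: String, size: String)
// Works for any case class for which a TypeTag is available.
def createSchema[T <: Product]()(implicit tag: TypeTag[T]) = Encoders.product[T].schema
val schema = createSchema[A]()
schema.printTreeString()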
/*
root
|-- name: string (nullable = true)
|-- size: string (nullable = true)
*/
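If the trait itself has to remain the source of truth, one possible approach is to read its abstract members with Scala runtime reflection and build the StructType by hand. This is only a sketch under stated assumptions, not something the answers above propose: schemaFromTrait and toSparkType are hypothetical helpers, only a handful of primitive types are mapped, every field is treated as nullable, and any abstract parameterless member of the trait is treated as a column.

```scala
import scala.reflect.runtime.universe._
import org.apache.spark.sql.types.{BooleanType, DataType, DoubleType, IntegerType, LongType, StringType, StructField, StructType}

trait A {
  val name: String
  val size: String
}

// Hypothetical helper: maps a small set of Scala types to Spark SQL types.
def toSparkType(tpe: Type): DataType =
  if (tpe =:= typeOf[String]) StringType
  else if (tpe =:= typeOf[Int]) IntegerType
  else if (tpe =:= typeOf[Long]) LongType
  else if (tpe =:= typeOf[Double]) DoubleType
  else if (tpe =:= typeOf[Boolean]) BooleanType
  else throw new IllegalArgumentException(s"Unsupported field type: $tpe")

// Hypothetical helper: collects the abstract, parameterless members declared
// in T (in declaration order) and turns each into a nullable StructField.
def schemaFromTrait[T: TypeTag]: StructType = {
  val fields = typeOf[T].decls.sorted.collect {
    case m: MethodSymbol if m.isAbstract && m.paramLists.isEmpty =>
      StructField(m.name.decodedName.toString.trim, toSparkType(m.returnType), nullable = true)
  }
  StructType(fields)
}

schemaFromTrait[A].printTreeString()
```

Because no case class is generated, this avoids any constructor-parameter limit, but it only yields a schema; rows still have to be built as Row values or mapped from another type.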
As I said at the beginning, I can't explain all the details; I can only provide a working solution and hope it fits your needs.