I use Spark 2.1.1.
I started off with the following:
import org.apache.spark.sql.types._
val mySchema = StructType(Seq(
  StructField("id", IntegerType, true),
  StructField("code", StringType, false),
  StructField("value", DecimalType, false)))

val myDS = Seq((1, "000010", 1.0), (2, "000020", 2.0)).as[mySchema]
Here I saw that mySchema is a value, not a type, and after looking at Encoders.scala I could see I needed to pass a subtype of Product here via

def product[T <: Product : TypeTag]: Encoder[T] = ExpressionEncoder()

Having learned from "What are Scala context and view bounds?" that the colon in a context bound is just syntactic sugar for an implicit parameter, I can see that there should be an implicit TypeTag[T] available. What I don't understand, looking at SQLImplicits.scala, is how TypeTag[T] becomes implicit:
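(As a quick sanity check, independent of Spark: the desugaring can be verified with any type class from the standard library, e.g. Ordering. The method names below are my own, for illustration only.)

```scala
// A context bound...
def maxOfSugared[T : Ordering](a: T, b: T): T =
  implicitly[Ordering[T]].max(a, b)

// ...desugars to an explicit implicit parameter list:
def maxOfDesugared[T](a: T, b: T)(implicit ord: Ordering[T]): T =
  ord.max(a, b)

println(maxOfSugared(3, 5))        // 5
println(maxOfDesugared("a", "b"))  // b
```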
/**
* @since 1.6.1
* @deprecated use [[newSequenceEncoder]]
*/
def newProductSeqEncoder[A <: Product : TypeTag]: Encoder[Seq[A]] = ExpressionEncoder()
Even though it's deprecated, when I look at
/** @since 2.2.0 */
implicit def newSequenceEncoder[T <: Seq[_] : TypeTag]: Encoder[T] = ExpressionEncoder()
I still wonder: where is a TypeTag[T] implicitly declared?
TypeTag is a type class: the compiler materializes an instance of it for any statically known type you summon it for. This is independent of Spark and of SQLImplicits; for example, you can try this:
def getMyTypeTag[T : TypeTag]: TypeTag[T] = implicitly[TypeTag[T]]
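Calling it with a concrete type makes the compiler materialize the instance at the call site (my own example, continuing the snippet above):

```scala
import scala.reflect.runtime.universe.TypeTag

def getMyTypeTag[T : TypeTag]: TypeTag[T] = implicitly[TypeTag[T]]

// The compiler materializes a TypeTag[List[Int]] automatically here
val tag = getMyTypeTag[List[Int]]
println(tag.tpe)  // List[Int]
```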
On the other hand, a Spark SQL Encoder can be built by Spark as soon as you import the implicits defined in SQLImplicits. If you take a look at LowPrioritySQLImplicits, you can see that the TypeTag is needed to create the Encoder for a Product (i.e. a case class), which is why the TypeTag has to be in the implicit scope:
trait LowPrioritySQLImplicits {
/** @since 1.6.0 */
implicit def newProductEncoder[T <: Product : TypeTag]: Encoder[T] = Encoders.product[T]
}
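The context bound there is just sugar for an explicit implicit parameter. Here is a minimal sketch of the same mechanism, with a stubbed Encoder so it runs without Spark (the names are illustrative, not Spark's actual implementation):

```scala
import scala.reflect.runtime.universe.TypeTag

// Stub standing in for Spark's Encoder, only to show the desugaring
trait Encoder[T] { def describe: String }

// Equivalent to `def newProductEncoder[T <: Product : TypeTag]: Encoder[T]`:
def newProductEncoder[T <: Product](implicit tag: TypeTag[T]): Encoder[T] =
  new Encoder[T] { def describe: String = tag.tpe.typeSymbol.name.toString }

case class Record(id: Int, code: String)

println(newProductEncoder[Record].describe)  // Record
```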
The TypeTag can be summoned only if the code from which you are trying to summon the Encoder is not generic, or if a TypeTag is already in the implicit scope. For example:
def loadEncoder(): Encoder[MyType] = {
  import spark.implicits._
  implicitly[Encoder[MyType]] // the concrete type is known here, so it works
}
on the other hand, this does not compile:

def loadEncoder[T](): Encoder[T] = {
  import spark.implicits._
  implicitly[Encoder[T]] // the type info is not available here, so it won't work
}
loadEncoder[MyType]()
and this does:

def loadEncoder[T : TypeTag](): Encoder[T] = {
  import spark.implicits._
  implicitly[Encoder[T]] // the type info is not here, but the TypeTag is, so it works
}
loadEncoder[MyType]()
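The same rule can be seen without Spark at all: an implicit that the compiler must materialize from a type needs that type to be statically known at the call site, or a TypeTag forwarded through the generic method (my own minimal example):

```scala
import scala.reflect.runtime.universe.TypeTag

def describe[T](implicit tag: TypeTag[T]): String = tag.tpe.toString

// Compiles: the context bound forwards the caller's TypeTag
def describeGeneric[T : TypeTag]: String = describe[T]

// Would NOT compile: no TypeTag[T] is available inside the method
// def describeBroken[T]: String = describe[T]

println(describeGeneric[List[Int]])  // List[Int]
```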
OK, I thought it was a Spark thing, but there is an import statement at the top of the file:

import scala.reflect.runtime.universe.TypeTag

Looking at the API page http://www.scala-lang.org/api/2.11.6/scala-reflect/index.html#scala.reflect.api.TypeTags I can see it's handled there.