Get Type of RDD in Scala/Spark

I am not sure whether "type" is the right word here, but let's say I have an RDD of the following type:

RDD[(Long, Array[(Long, Double)])]

Given such an RDD, how can I find its type (as written above) at runtime?

I basically want to compare two RDDs at runtime to see whether they store the same kind of data (the values themselves may differ). Is there another way to do this? Moreover, I want to get a cached RDD as an instance of that RDD type using the following code:

sc.getPersistentRDDs(0).asInstanceOf[RDD[(Long, Array[(Long, Double)])]]

where RDD[(Long, Array[(Long, Double)])] has been determined dynamically at runtime from another RDD of the same type. So is there a way to get this type from an RDD at runtime?

You can use Scala's TypeTags:

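import scala.reflect.runtime.universe._

// Compares the type arguments captured at compile time for x and y,
// e.g. the element type T of an RDD[T].
def checkEqualParameters[T1, T2](x: T1, y: T2)(implicit type1: TypeTag[T1], type2: TypeTag[T2]): Boolean = {
  // =:= tests type equivalence, which is more robust than == on Type values
  type1.tpe.typeArgs.corresponds(type2.tpe.typeArgs)(_ =:= _)
}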

And then compare:

checkEqualParameters(rdd1, rdd2)
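To see the TypeTag comparison in isolation, here is a minimal sketch that runs without Spark. It assumes Scala 2 with scala-reflect on the classpath, and it uses Seq as a stand-in for RDD; the object name TypeArgsDemo and the helper sameTypeArgs are hypothetical names chosen for illustration, not part of any library.

```scala
import scala.reflect.runtime.universe._

object TypeArgsDemo {
  // Hypothetical helper: true when the statically known type arguments
  // of x and y are equivalent (compared pairwise with =:=).
  def sameTypeArgs[A, B](x: A, y: B)(implicit ta: TypeTag[A], tb: TypeTag[B]): Boolean =
    ta.tpe.typeArgs.corresponds(tb.tpe.typeArgs)(_ =:= _)

  def main(args: Array[String]): Unit = {
    // Seq stands in for RDD here (an assumption made so the sketch needs no Spark).
    val a: Seq[(Long, Array[(Long, Double)])] = Seq((1L, Array((2L, 3.0))))
    val b: Seq[(Long, Array[(Long, Double)])] = Seq.empty
    val c: Seq[(Long, String)] = Seq((1L, "x"))

    println(sameTypeArgs(a, b)) // true: same element type, different values
    println(sameTypeArgs(a, c)) // false: element types differ
  }
}
```

Note that a TypeTag is materialized from the static type at the call site. sc.getPersistentRDDs returns a Map[Int, RDD[_]], so an RDD taken from it is typed RDD[_] and its tag does not carry the original element type; the comparison is only informative for values whose element type is still known to the compiler.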
