Joining two (paired) RDDs in Scala, Spark

I am trying to join two paired RDDs, following the answer given here:

Joining two RDD[String] -Spark Scala

I am getting this error:

error: value leftOuterJoin is not a member of org.apache.spark.rdd.RDD[

The code snippet is below.

val pairRDDTransactions = parsedTransaction.map {
  case (field3, field4, field5, field6, field7,
        field1, field2, udfChar1, udfChar2, udfChar3) =>
    ((field1, field2), field3, field4, field5,
      field6, field7, udfChar1, udfChar2, udfChar3)
}

val pairRDDAccounts = parsedAccounts.map {
  case (field8, field1, field2, field9, field10) =>
    ((field1, field2), field8, field9, field10)
}

val transactionAddrJoin = pairRDDTransactions.leftOuterJoin(pairRDDAccounts).map {
  case ((field1, field2), (field3, field4, field5, field6,
        field7, udfChar1, udfChar2, udfChar3, field8, field9, field10)) =>
    (field1, field2, field3, field4, field5, field6,
      field7, udfChar1, udfChar2, udfChar3, field8, field9, field10)
}

In this case, field1 and field2 are my keys, on which I want to perform the join.

Joins are defined for RDD[(K, V)] (an RDD of Tuple2 objects). In your case, however, the RDDs contain arbitrary flat tuples (Tuple4[_, _, _, _] and Tuple9[_, _, _, _, _, _, _, _, _]), because the key and every other field sit side by side in one tuple. This just cannot work: there is no single value element for the join to pair with the key.
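To make the shape difference concrete, here is a small plain-Scala sketch (no Spark needed). The object name TupleShapes and the sample strings are made up for illustration; only the tuple layouts mirror the question.

```scala
object TupleShapes {
  // Flat shape, as produced by the accounts map in the question:
  // the key tuple followed by three loose fields.
  val flat = (("f1", "f2"), "f8", "f9", "f10")

  // Pair shape: the key tuple followed by ONE value, which is itself a tuple.
  val pair = (("f1", "f2"), ("f8", "f9", "f10"))

  def main(args: Array[String]): Unit = {
    println(flat.productArity) // 4: a Tuple4, not a key/value pair
    println(pair.productArity) // 2: a Tuple2, i.e. (K, V)
  }
}
```

Only an RDD whose elements have the second shape picks up the pair-RDD operations such as leftOuterJoin.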

You should use

... => 
  ((field1, field2), 
     (field3, field4, field5, field6, field7, udfChar1, udfChar2, udfChar3))

and

... =>
  ((field1, field2), (field8, field9, field10))

respectively.
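One more thing to watch for once both RDDs have the (K, V) shape: leftOuterJoin returns elements of the form (K, (V, Option[W])), so the pattern in the final map has to match a nested pair with an Option on the right side, not one flat tuple. The following is a minimal sketch of that, assuming a local Spark setup; the names LeftOuterJoinSketch and joinSketch, the shortened value tuples, and the sample data are hypothetical and not taken from the question.

```scala
import org.apache.spark.{SparkConf, SparkContext}
// On Spark versions before 1.3 you may also need:
// import org.apache.spark.SparkContext._

object LeftOuterJoinSketch {
  type Key = (String, String)
  type Row = (String, String, String, String, Option[(String, String, String)])

  def joinSketch(sc: SparkContext): Array[Row] = {
    // Hypothetical sample data: key (field1, field2) plus a single value tuple.
    val transactions = sc.parallelize(Seq[(Key, (String, String))](
      (("k1", "k2"), ("t3", "t4")),
      (("x1", "x2"), ("t5", "t6")) // no matching account
    ))
    val accounts = sc.parallelize(Seq[(Key, (String, String, String))](
      (("k1", "k2"), ("a8", "a9", "a10"))
    ))

    // The result type is RDD[(Key, ((String, String), Option[(String, String, String)]))].
    transactions.leftOuterJoin(accounts).map {
      case ((field1, field2), ((field3, field4), maybeAccount)) =>
        // maybeAccount is None when the key has no row in accounts.
        (field1, field2, field3, field4, maybeAccount)
    }.collect()
  }

  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("join-sketch").setMaster("local[*]")
    val sc = new SparkContext(conf)
    try joinSketch(sc).foreach(println)
    finally sc.stop()
  }
}
```

Keeping the account side as an Option lets unmatched transactions survive the join; if you need a flat row, you can unpack it with getOrElse and default values of your choosing.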
