
How to join two special RDDs?

One is

rdd1 : JavaPairRDD<Tuple2<String,String>,Integer> 

another is

rdd2 : JavaPairRDD<String,Integer>

I want to join rdd1 and rdd2 where Tuple2._1 of the key in rdd1 equals the key in rdd2. For example, (("a","b"),1) and ("a",2) should produce (("a","b"),1,2). When I map rdd1 to:

rdd3 : JavaPairRDD<String, Tuple2<String, Integer>>

and try to use rdd3.join(rdd2), it fails with the error “can only concatenate tuple (not "str") to tuple”. Is there a way to join rdd1 and rdd2 and get the result I want?

Map rdd1 to:

JavaPairRDD<String, Tuple2<Tuple2<String,String>,Integer>>

with something like:

x -> new Tuple2<>(x._1()._1(), x)

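Then use a standard join with rdd2, which gives a JavaPairRDD<String, Tuple2<Tuple2<Tuple2<String,String>,Integer>,Integer>>, and map once more to flatten each record into the desired result.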
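
The three steps (re-key, join, flatten) can be illustrated without a Spark cluster. The following is only a sketch using plain Java collections as stand-ins for the two RDDs; the class name KeyedJoinSketch and the method joinOnFirst are hypothetical, and in Spark the same steps would presumably be written with mapToPair, join and map on the pair RDDs.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration: plain collections stand in for rdd1 and rdd2.
class KeyedJoinSketch {

    // left  ~ rdd1: ((first, second), value), with the pair held as a 2-element list
    // right ~ rdd2: (key, value)
    static List<List<Object>> joinOnFirst(
            List<Map.Entry<List<String>, Integer>> left,
            List<Map.Entry<String, Integer>> right) {

        // Step 1: re-key every left record by the first element of its pair,
        // keeping the whole original record as the value.
        Map<String, List<Map.Entry<List<String>, Integer>>> leftByKey = new HashMap<>();
        for (Map.Entry<List<String>, Integer> l : left) {
            leftByKey.computeIfAbsent(l.getKey().get(0), k -> new ArrayList<>()).add(l);
        }

        // Step 2: inner join on the shared String key.
        List<List<Object>> out = new ArrayList<>();
        for (Map.Entry<String, Integer> r : right) {
            for (Map.Entry<List<String>, Integer> l
                    : leftByKey.getOrDefault(r.getKey(), List.of())) {
                // Step 3: flatten into (first, second, leftValue, rightValue).
                out.add(List.<Object>of(
                        l.getKey().get(0), l.getKey().get(1), l.getValue(), r.getValue()));
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<Map.Entry<List<String>, Integer>> left = List.of(
                Map.entry(List.of("a", "b"), 1),
                Map.entry(List.of("x", "y"), 9));
        List<Map.Entry<String, Integer>> right = List.of(Map.entry("a", 2));

        // Only the record whose first pair element is "a" has a match.
        System.out.println(joinOnFirst(left, right)); // prints [[a, b, 1, 2]]
    }
}
```

Keeping the whole original record as the value in step 1 is what lets the final map recover both halves of the pair key after the join.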
