Spark: How to convert RDD[(Long, Iterable[String])] to RDD[(Long, String)]?

How can I convert this RDD[(Long, Iterable[String])] to...

(852403,Set(PT0000094043, PT0000097083, PT0000036162))
(357331,Set(PT0000068829, PT0000094042, PT0000066859))

...an RDD[(Long, String)] like this?

(852403, PT0000094043)
(852403, PT0000097083)
(852403, PT0000036162)
(357331, PT0000068829)
(357331, PT0000094042)
(357331, PT0000066859)

Try flatMapValues, which flattens each value collection and pairs every element with its original key:

rdd.flatMapValues(identity)

or flatMap, building the (key, value) pairs explicitly:

rdd.flatMap{ case (k, vs) => vs.map(v => (k, v)) }
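
For anyone who wants to try both variants end to end, here is a minimal sketch using the sample data from the question. The object name, the helper method names, the local master setting, and the app name are all assumptions made for illustration, not part of the original answer.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.rdd.RDD

// Hypothetical wrapper object, only here to make the snippet runnable.
object FlattenValuesExample {

  // Variant 1: flatMapValues leaves keys untouched and emits one pair per value.
  def withFlatMapValues(rdd: RDD[(Long, Iterable[String])]): RDD[(Long, String)] =
    rdd.flatMapValues(identity)

  // Variant 2: flatMap builds each (key, value) pair by hand.
  def withFlatMap(rdd: RDD[(Long, Iterable[String])]): RDD[(Long, String)] =
    rdd.flatMap { case (k, vs) => vs.map(v => (k, v)) }

  def main(args: Array[String]): Unit = {
    // Assumption: a local master, convenient for a quick experiment.
    val conf = new SparkConf().setMaster("local[*]").setAppName("flatten-values")
    val sc = new SparkContext(conf)
    try {
      // Sample rows copied from the question.
      val grouped: RDD[(Long, Iterable[String])] = sc.parallelize(Seq(
        (852403L, Set("PT0000094043", "PT0000097083", "PT0000036162"): Iterable[String]),
        (357331L, Set("PT0000068829", "PT0000094042", "PT0000066859"): Iterable[String])
      ))

      withFlatMapValues(grouped).collect().foreach(println)
      withFlatMap(grouped).collect().foreach(println)
    } finally {
      sc.stop()
    }
  }
}
```

Both variants should yield the same set of pairs. One practical difference: flatMapValues keeps the RDD's existing partitioner because the keys cannot change, whereas flatMap drops it, so flatMapValues is usually the better fit when the result feeds into another key-based operation.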
