Spark: How to convert RDD[(Long, Iterable[String])] to RDD[(Long, String)]?

How can I convert this RDD[(Long, Iterable[String])]:

(852403,Set(PT0000094043, PT0000097083, PT0000036162))
(357331,Set(PT0000068829, PT0000094042, PT0000066859))

into an RDD[(Long, String)] like this?

(852403, PT0000094043)
(852403, PT0000097083)
(852403, PT0000036162)
(357331, PT0000068829)
(357331, PT0000094042)
(357331, PT0000066859)

Try flatMapValues, which expands each value collection into one pair per element while keeping the key:

rdd.flatMapValues(identity)

or flatMap, pairing the key with each value explicitly:

rdd.flatMap{ case (k, vs) => vs.map(v => (k, v)) }
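
For anyone who wants to try both variants end to end, here is a minimal sketch. It assumes Spark is on the classpath and runs in local mode; the object name, app name, and master setting are placeholders chosen for illustration and are not part of the original question. The sample data is copied from the question.

```scala
import org.apache.spark.rdd.RDD
import org.apache.spark.sql.SparkSession

// Hypothetical wrapper object, used only to make the snippet runnable.
object FlattenGroupedRdd {
  def main(args: Array[String]): Unit = {
    // Local session for illustration only; master and app name are assumptions.
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("flatten-grouped-rdd")
      .getOrCreate()
    val sc = spark.sparkContext

    // Sample data copied from the question.
    val rdd: RDD[(Long, Iterable[String])] = sc.parallelize(
      Seq[(Long, Iterable[String])](
        (852403L, Set("PT0000094043", "PT0000097083", "PT0000036162")),
        (357331L, Set("PT0000068829", "PT0000094042", "PT0000066859"))
      )
    )

    // Variant 1: emit one (key, value) pair per element of each value collection.
    val viaFlatMapValues: RDD[(Long, String)] = rdd.flatMapValues(identity)

    // Variant 2: the same result, building each pair by hand.
    val viaFlatMap: RDD[(Long, String)] =
      rdd.flatMap { case (k, vs) => vs.map(v => (k, v)) }

    // Sort before comparing, since RDD element order is not guaranteed.
    val a = viaFlatMapValues.collect().sorted
    val b = viaFlatMap.collect().sorted
    assert(a.length == 6)
    assert(a.sameElements(b))

    a.foreach(println)
    spark.stop()
  }
}
```

Both variants should yield the same six pairs. One difference worth knowing: flatMapValues retains the original RDD's partitioning, whereas flatMap does not, so flatMapValues is usually the better choice if the result will be joined or grouped by key again.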
