Say I have two Spark RDDs with the following values
x = [(1, 3), (2, 4)]
and
y = [(3, 5), (4, 7)]
and I want to have
z = [(1, 3), (2, 4), (3, 5), (4, 7)]
How can I achieve this? I know you can use outerJoin followed by map, but is there a more direct way?
rdd.union(otherRDD)
gives you the union of the two RDDs, which is the result asked for in the question:
x.union(y)
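You can just use the + operator. In the context of lists, this is a concatenation operation. Note that the session below uses plain Python lists rather than RDDs; PySpark RDDs also define +, where it behaves like union.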
>>> x = [(1, 3), (2, 4)]
>>> y = [(3, 5), (4, 7)]
>>> z = x + y
>>> z
[(1, 3), (2, 4), (3, 5), (4, 7)]