I want to achieve something like this:
x.joinWith(y, x("id") === y("fid"), "left_outer")
.joinWith(z, x("id") === z("fid"))
.map { case (x, y, z) => combineXYZ(x, y, z) }
When you use joinWith, what you get is a new Dataset of Tuple2: (x, y). So the column names are _1 and _2.
So when you do your second join, you need to reference a column of the tuple, not a column of one of the source datasets. Like this:
x.joinWith(y, x("id") === y("fid"), "left_outer").joinWith(z, $"_1.id" === z("fid"))
Now what you get is a Tuple2 whose first element is itself a tuple: ((x, y), z). So your map has to match that nested shape:
.map { case ((x, y), z) => combineXYZ(x, y, z) }
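Putting the pieces together, here is a minimal sketch of the whole chain. The case classes X, Y, Z and XYZ, their fields, and the body of combineXYZ are placeholders I made up for illustration, since the question does not show them. Note that after a left_outer joinWith the y side should come back as null for rows without a match, so the sketch wraps it in Option.

```scala
import org.apache.spark.sql.{Dataset, SparkSession}

// Hypothetical record types standing in for the question's x, y and z.
case class X(id: Long, name: String)
case class Y(fid: Long, note: String)
case class Z(fid: Long, tag: String)
case class XYZ(id: Long, name: String, note: Option[String], tag: String)

object JoinWithChain {
  // Hypothetical combiner; y may be null after the left outer join.
  def combineXYZ(x: X, y: Y, z: Z): XYZ =
    XYZ(x.id, x.name, Option(y).map(_.note), z.tag)

  def joinAll(spark: SparkSession,
              x: Dataset[X], y: Dataset[Y], z: Dataset[Z]): Dataset[XYZ] = {
    import spark.implicits._ // encoders and the $"..." column syntax

    // First join: the result columns are _1 (an X) and _2 (a Y or null).
    val xy: Dataset[(X, Y)] = x.joinWith(y, x("id") === y("fid"), "left_outer")

    // Second join: x's id is now the nested field _1.id.
    val xyz: Dataset[((X, Y), Z)] = xy.joinWith(z, $"_1.id" === z("fid"))

    // The pattern mirrors the nested tuple ((x, y), z).
    xyz.map { case ((a, b), c) => combineXYZ(a, b, c) }
  }

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[*]").appName("joinWith-chain").getOrCreate()
    import spark.implicits._

    val x = Seq(X(1, "a"), X(2, "b")).toDS()
    val y = Seq(Y(1, "y1")).toDS()
    val z = Seq(Z(1, "z1"), Z(2, "z2")).toDS()

    joinAll(spark, x, y, z).show()
    spark.stop()
  }
}
```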
This should work. Note that if you don't want to use $"_1.id", which is totally understandable, you can do a map after your first join to create a new object other than a Tuple2, so that the columns get meaningful names.
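A rough sketch of that alternative, again with made-up types: XY is a hypothetical case class that replaces the anonymous (x, y) tuple, so the second join can refer to x.id instead of _1.id. I used Option[Y] for the right side because the left outer join can leave it empty; that is my choice, not something the question requires.

```scala
import org.apache.spark.sql.{Dataset, SparkSession}

// Hypothetical record types standing in for the question's x, y and z.
case class X(id: Long, name: String)
case class Y(fid: Long, note: String)
case class Z(fid: Long, tag: String)
// Hypothetical named pair replacing the anonymous (X, Y) tuple.
case class XY(x: X, y: Option[Y])

object NamedJoin {
  def joinAll(spark: SparkSession,
              x: Dataset[X], y: Dataset[Y], z: Dataset[Z]): Dataset[(XY, Z)] = {
    import spark.implicits._ // encoder for XY

    // Map right after the first join so the columns are named x and y.
    val xy: Dataset[XY] = x
      .joinWith(y, x("id") === y("fid"), "left_outer")
      .map { case (a, b) => XY(a, Option(b)) }

    // The join key is now the nested field x.id rather than _1.id.
    xy.joinWith(z, xy("x.id") === z("fid"))
  }
}
```

The result is still a Tuple2, this time (XY, Z), so a final map would match on `case (xy, z)` and read `xy.x` and `xy.y`.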