I have a DataFrame whose schema consists of an 'id' column and an array column. For each row, I want to pair the id with every element of its array and store the resulting pairs in a new DataFrame. For example:
This is the original dataframe:
+---+----------+
| id|candidates|
+---+----------+
| 1| [2, 3]|
| 2| [3]|
+---+----------+
And this is how it should look after the computation:
+---+---+
|id1|id2|
+---+---+
| 1| 2|
| 1| 3|
| 2| 3|
+---+---+
Maybe someone has an idea for this problem?
OK, thanks @cheseaux, I found the answer! There is simply the explode_outer function:
candidatesDF.withColumn("candidates", explode_outer($"candidates")).show
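For anyone wondering how explode_outer differs from plain explode: explode drops rows whose array is null or empty, while explode_outer keeps them and emits a null in the exploded column. Below is a minimal sketch of that difference. The object name, the helper names, and the extra row with id 3 and an empty array are my own additions for illustration and are not part of the question's data.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.{col, explode, explode_outer}

// Hypothetical example object, not from the original post.
object ExplodeOuterExample {
  // One output row per array element; rows with null/empty arrays are dropped.
  def withExplode(df: DataFrame): DataFrame =
    df.select(col("id").as("id1"), explode(col("candidates")).as("id2"))

  // Same, but rows with null/empty arrays are kept with id2 = null.
  def withExplodeOuter(df: DataFrame): DataFrame =
    df.select(col("id").as("id1"), explode_outer(col("candidates")).as("id2"))

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("explode-outer-example")
      .getOrCreate()
    import spark.implicits._

    // The question's data plus an assumed extra row (id 3) with an empty array.
    val candidatesDF = Seq(
      (1, Seq(2, 3)),
      (2, Seq(3)),
      (3, Seq.empty[Int])
    ).toDF("id", "candidates")

    withExplode(candidatesDF).show()      // id 3 does not appear
    withExplodeOuter(candidatesDF).show() // id 3 appears with a null id2

    spark.stop()
  }
}
```

If every array is guaranteed to be non-empty, as in the question's example, both functions should give the same rows.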
Just explode the array column:
candidatesDF.withColumn("id2", explode('candidates))
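Note that withColumn keeps the existing id and candidates columns next to the new id2 column. To get exactly the two-column id1/id2 layout from the question, one option is to do the rename and the explode in a single select. Here is a self-contained sketch; the object name PairsExample and the helper toPairs are hypothetical names chosen for this example, and the local SparkSession setup is only there so it can run on its own.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.{col, explode}

// Hypothetical example object, not from the original post.
object PairsExample {
  // Turn (id, candidates) rows into one (id1, id2) row per array element.
  def toPairs(df: DataFrame): DataFrame =
    df.select(col("id").as("id1"), explode(col("candidates")).as("id2"))

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("pairs-example")
      .getOrCreate()
    import spark.implicits._

    // Same data as the original DataFrame in the question.
    val candidatesDF = Seq((1, Seq(2, 3)), (2, Seq(3))).toDF("id", "candidates")

    // Should show the pairs (1, 2), (1, 3) and (2, 3).
    toPairs(candidatesDF).show()

    spark.stop()
  }
}
```

In spark-shell, where spark and its implicits are already in scope, the body of toPairs alone is enough.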