I have a DataFrame like this:
userId someString varA varB
1 "example1" 0,2,5 1,2,9
2 "example2" 1,20,5 9,null,6
I want to convert the data in varA and varB into arrays of String:
userId someString varA varB
1 "example1" [0,2,5] [1,2,9]
2 "example2" [1,20,5] [9,null,6]
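It's fairly simple: use the split function from org.apache.spark.sql.functions. It splits a string column around a regex pattern (here a plain comma) and returns an array of strings.
import org.apache.spark.sql.functions.split
import spark.implicits._  // enables the $"col" syntax; already in scope in spark-shell
df.withColumn("varA", split($"varA", ",")).withColumn("varB", split($"varB", ",")).show()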
Output
+------+----------+----------+------------+
|userId|someString| varA| varB|
+------+----------+----------+------------+
| 1| example1| [0, 2, 5]| [1, 2, 9]|
| 2| example2|[1, 20, 5]|[9, null, 6]|
+------+----------+----------+------------+
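For anyone who wants to reproduce this outside spark-shell, here is a minimal self-contained sketch. The object name SplitColumnsExample, the helper toArrays, and the local SparkSession settings are assumptions made for illustration; only the column names and sample rows come from the question.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.{col, split}

// Hypothetical wrapper object; the name is illustrative only.
object SplitColumnsExample {

  // Replace the comma-separated string columns varA and varB
  // with array<string> columns of the same name.
  def toArrays(df: DataFrame): DataFrame =
    df.withColumn("varA", split(col("varA"), ","))
      .withColumn("varB", split(col("varB"), ","))

  def main(args: Array[String]): Unit = {
    // Local session for experimentation; adjust master/appName as needed.
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("split-example")
      .getOrCreate()
    import spark.implicits._

    // Sample rows taken from the question.
    val df = Seq(
      (1, "example1", "0,2,5", "1,2,9"),
      (2, "example2", "1,20,5", "9,null,6")
    ).toDF("userId", "someString", "varA", "varB")

    val result = toArrays(df)
    result.printSchema() // varA and varB should now be array<string>
    result.show(false)

    spark.stop()
  }
}
```

Note that split works purely on text, so the "null" in "9,null,6" ends up in the array as the literal string "null", not as a SQL NULL. If real nulls are needed, that element would have to be converted in a separate step.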