Is there an efficient way to transpose columns into rows for a big DataFrame in Spark Scala?
val inputDF = Seq(
  ("100", "A", "10", "B", null),
  ("101", "A", "20", "B", "30")
).toDF("ID", "Type1", "Value1", "Type2", "Value2")
I want to transpose it into a DataFrame like the one below.
val outDF = Seq(
  ("100", "A", "10"),
  ("100", "B", null),
  ("101", "A", "20"),
  ("101", "B", "30")
).toDF("ID", "TypeID", "Value")
The DataFrame is big, containing around 1 GB of data. I am using Spark 2.4.x. Any suggestions on doing this efficiently? Thanks a lot!
You can do a union of the two (Type, Value) column pairs. Note that `unionAll` is deprecated since Spark 2.0; use `union` instead (it resolves columns by position, which is what we want here):

val outputDF = inputDF.select("ID", "Type1", "Value1")
  .union(inputDF.select("ID", "Type2", "Value2"))
  .toDF("ID", "TypeID", "Value") // rename the columns
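Alternatively, the built-in SQL `stack` function unpivots in a single pass over the data, avoiding the double scan of `inputDF` that the union performs. A minimal sketch, assuming the same column names as in the question:

```scala
// stack(n, k1, v1, k2, v2, ...) emits n rows per input row;
// here each input row becomes two (TypeID, Value) rows.
val outputDF = inputDF.selectExpr(
  "ID",
  "stack(2, Type1, Value1, Type2, Value2) as (TypeID, Value)"
)
```

Both approaches are narrow transformations (no shuffle), so either should handle 1 GB comfortably; `stack` simply reads the input once instead of twice.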