
Spark Scala DataFrame: How to convert columns into rows?

Is there an efficient way to transpose columns into rows for a big DataFrame in Spark Scala?

val inputDF = Seq(("100", "A", "10", "B", null),
                  ("101", "A", "20", "B", "30")
              ).toDF("ID", "Type1", "Value1", "Type2", "Value2")

I want to transpose it into a Dataframe as below.

val outDF = Seq(("100", "A", "10"),
                ("100", "B", null),
                ("101", "A", "20"),
                ("101", "B", "30")
             ).toDF("ID", "TypeID", "Value")

The DataFrame is big; it holds around 1 GB of data. I am using Spark 2.4.x. Any suggestions on doing this efficiently? Thanks a lot!

You can do a union:

val outputDF = inputDF.select("ID", "Type1", "Value1")
                      .union(inputDF.select("ID", "Type2", "Value2"))  // unionAll is deprecated since Spark 2.0
                      .toDF("ID", "Type", "Value")  // rename columns
