I need to convert the rows of a single column into multiple columns of one row. This is what I did:
import org.apache.spark.sql.functions.{col, collect_list}
import spark.implicits._

val list = List("a", "b", "c", "d")
val df = list.toDF("id")
df.show()

// One column expression per element of the original list
val transpose = list.zipWithIndex.map {
  case (_, index) => col("data").getItem(index).as(s"col_${index}")
}
df.select(collect_list($"id").as("data")).select(transpose: _*).show()
Output:
+-----+-----+-----+-----+
|col_0|col_1|col_2|col_3|
+-----+-----+-----+-----+
|    a|    b|    c|    d|
+-----+-----+-----+-----+
This works, but there is a problem with the transpose list: it relies on the original data (list). If I apply any filter to df, the result still shows 4 columns, because the original list has 4 elements. How can I shorten the column list so it matches the filtered data?
df.filter($"id" =!= "a").select(collect_list($"id").as("data")).select(transpose: _*).show()
With the filter applied, show() prints:
+-----+-----+-----+-----+
|col_0|col_1|col_2|col_3|
+-----+-----+-----+-----+
|    b|    c|    d| null|
+-----+-----+-----+-----+
This is wrong: it should show 3 columns, not 4.
You could do it with pivot:
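import org.apache.spark.sql.functions.first
import spark.implicits._

val df = List("a", "b", "c", "d").toDF("id")
val dfFiltered = df.filter($"id" =!= "a")
dfFiltered
  .groupBy().pivot($"id").agg(first($"id"))  // one column per distinct id
  // rename to col_0, col_1, ... (the count only matches if the ids are distinct)
  .toDF((0 until dfFiltered.count().toInt).map(i => s"col_$i"): _*)
  .show()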
+-----+-----+-----+
|col_0|col_1|col_2|
+-----+-----+-----+
|    b|    c|    d|
+-----+-----+-----+
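A variation on the pivot approach above, as a rough sketch only: instead of running count() to build the new names, the names can be taken from the pivoted result itself, so the rename still lines up when the column contains duplicate values (pivot yields one column per distinct value). PivotRenameSketch and pivotToIndexedColumns are made-up names for illustration, and the local SparkSession in main is an assumption about how you run it.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.first

object PivotRenameSketch {
  // Hypothetical helper: pivots `column` into one row and renames the
  // resulting columns to col_0, col_1, ... based on how many columns
  // the pivot actually produced.
  def pivotToIndexedColumns(df: DataFrame, column: String): DataFrame = {
    val pivoted = df.groupBy().pivot(column).agg(first(column))
    val names = pivoted.columns.indices.map(i => s"col_$i")
    pivoted.toDF(names: _*)
  }

  def main(args: Array[String]): Unit = {
    // Assumption: a local session, just for trying the sketch out
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("pivot-rename-sketch")
      .getOrCreate()
    import spark.implicits._

    val df = List("a", "b", "c", "d").toDF("id")
    pivotToIndexedColumns(df.filter($"id" =!= "a"), "id").show()

    spark.stop()
  }
}
```

Keep in mind that pivot orders the new columns by the pivoted values, so the result follows the sort order of the ids rather than the original row order.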
I used a small trick: trim the column list to the row count of the filtered df. Let me know if it helps.
import org.apache.spark.sql.functions._

object TransposeV2 {
  def main(args: Array[String]): Unit = {
    val spark = Constant.getSparkSess  // my own helper that returns the SparkSession
    import spark.implicits._

    val list = List("a", "b", "c", "d")
    val df = list.toDF("id")
    df.show()

    // One column expression per element of the original list
    val transpose = list.zipWithIndex.map {
      case (_, index) => col("data").getItem(index).as(s"col_${index}")
    }
    df.select(collect_list($"id").as("data")).select(transpose: _*).show()

    // After filtering, keep only as many column expressions as there are rows left
    val dfInterim = df.filter($"id" =!= "a")
    val finalElements: Int = dfInterim.count().toInt
    dfInterim.select(collect_list($"id").as("data")).select(transpose.take(finalElements): _*).show()
  }
}
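The same trimming idea can also be written without the original list at all, by generating the column expressions from the row count of whatever DataFrame is passed in. This is only a sketch under assumptions: TransposeSketch and transposeColumn are names invented here, and the local SparkSession stands in for however the session is normally created.

```scala
import org.apache.spark.sql.{Column, DataFrame, SparkSession}
import org.apache.spark.sql.functions.{col, collect_list}

object TransposeSketch {
  // Hypothetical helper: collects `column` into a single array and spreads
  // it over col_0 .. col_{n-1}, where n is the row count of `df` itself.
  def transposeColumn(df: DataFrame, column: String): DataFrame = {
    val n = df.count().toInt
    val cols: Seq[Column] =
      (0 until n).map(i => col("data").getItem(i).as(s"col_$i"))
    df.select(collect_list(col(column)).as("data")).select(cols: _*)
  }

  def main(args: Array[String]): Unit = {
    // Assumption: a local session, just for trying the sketch out
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("transpose-sketch")
      .getOrCreate()
    import spark.implicits._

    val df = List("a", "b", "c", "d").toDF("id")
    transposeColumn(df, "id").show()                       // four columns
    transposeColumn(df.filter($"id" =!= "a"), "id").show() // three columns

    spark.stop()
  }
}
```

One caveat: collect_list does not guarantee the order of the collected values once the data has been shuffled, so on a larger, partitioned DataFrame the values may not land in the columns you expect unless you impose an ordering yourself.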