
Unpivot in spark-sql/Scala when column names are numbers

I have tried the built-in stack function described in this post, Unpivot in spark-sql/pyspark, for Scala. It works fine for the columns whose code contains a letter, but not for the columns whose code is just a number.

I have a DataFrame df that looks like this:

I applied the approach from the linked answer:

val result = df.select($"Id", expr("stack(3, '00C', 00C, '0R5', 0R5, '234', 234)"))

And this is the result:

What I want is for the value of the 234 row to be 0, as it should be.

Because 234 is a number, and in SQL selecting a number literal simply returns that number as the value, you need to tell the parser that 234 is a column name, not a number. To do that, wrap the number in backticks (`), i.e. `234`.

Check the code below.

scala> val df = Seq(("xyz",0,1,0)).toDF("Id","00C","0R5","234")
df: org.apache.spark.sql.DataFrame = [Id: string, 00C: int ... 2 more fields]

scala> df.select($"Id", expr("stack(3, '00C', 00C, '0R5', 0R5, '234',`234`)")).show(false)
+---+----+----+
|Id |col0|col1|
+---+----+----+
|xyz|00C |0   |
|xyz|0R5 |1   |
|xyz|234 |0   |
+---+----+----+
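
If there are many columns, a minimal sketch of a more generic version (assuming the same df as above; the output names code and value are arbitrary choices here) is to build the stack expression programmatically and backtick every column name, so numeric names such as 234 are always parsed as identifiers rather than literals:

scala> // Columns to unpivot: everything except Id (an assumption for this sketch).
scala> val valueCols = df.columns.filterNot(_ == "Id")

scala> // Backtick each column name so numeric names (e.g. 234) are treated as identifiers.
scala> val stackExpr =
         s"stack(${valueCols.length}, " +
         valueCols.map(c => s"'$c', `$c`").mkString(", ") +
         ") as (code, value)"

scala> df.selectExpr("Id", stackExpr).show(false)

This produces the same three rows as above, but it keeps working unchanged if more columns are added, since every name is quoted when the expression string is built.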
