I am trying to pass a sequence of columns to orderBy, as shown below, but it fails to compile. What is the right way to do this while still using a variable?
import org.apache.spark.sql._
....
val seq = Seq[Column](new Column("colX"), new Column("colY"), new Column("colZ"))
someDataFrame.orderBy(seq)
I know that one can also write orderBy("colX", "colY", "colZ"), but here I want to use a variable because the sort order changes depending on different conditions.
I get an error like this.
error: overloaded method value orderBy with alternatives:
(sortExprs: org.apache.spark.sql.Column*)org.apache.spark.sql.Dataset[org.apache.spark.sql.Row] <and>
(sortCol: String,sortCols: String*)org.apache.spark.sql.Dataset[org.apache.spark.sql.Row]
cannot be applied to (Seq[org.apache.spark.sql.Column])
Try this. orderBy takes a variable number of Column arguments (Column*), not a single Seq[Column], so you need to expand your sequence (a Seq, List, or Array) into individual arguments with : _*
someDataFrame.orderBy(seq:_*)
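To see why the `: _*` ascription is needed, here is a minimal sketch in plain Scala with no Spark dependency. The `joinAll` function is a hypothetical stand-in for a varargs method such as `orderBy(sortExprs: Column*)`; it is not part of any library.

```scala
// Hypothetical stand-in for a varargs method like orderBy(sortExprs: Column*).
def joinAll(parts: String*): String = parts.mkString(", ")

val cols = Seq("colX", "colY", "colZ")

// joinAll(cols) does not compile: a Seq[String] is one argument, not String*.
// The `: _*` ascription tells the compiler to pass each element separately.
val joined = joinAll(cols: _*)

println(joined) // prints "colX, colY, colZ"
```

The same rule applies to `Seq[Column]` and `orderBy`: the compiler rejects the bare sequence because neither overload accepts a `Seq`, and accepts it once it is expanded.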
Quick test here:
INPUT
df.show
+---+---+
| _1| _2|
+---+---+
| c| 0|
| b| 1|
| a| 0|
+---+---+
val s = Seq(new Column("_1"), new Column("_2"))
df.orderBy(s:_*).show
+---+---+
| _1| _2|
+---+---+
| a| 0|
| b| 1|
| c| 0|
+---+---+
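Since the question says the sort order changes with different conditions, the following is a rough sketch of choosing the columns at runtime and then expanding them. The helper names `sortColumns` and `sorted` and the `byNumberFirst` flag are made up for illustration; only `col`, `Column.desc`, and `orderBy` come from Spark.

```scala
import org.apache.spark.sql.{Column, DataFrame}
import org.apache.spark.sql.functions.col

// Hypothetical helper: pick the sort columns based on a runtime condition.
def sortColumns(byNumberFirst: Boolean): Seq[Column] =
  if (byNumberFirst) Seq(col("_2").desc, col("_1"))
  else Seq(col("_1"), col("_2"))

// Expand the chosen Seq into the Column* parameter of orderBy.
def sorted(df: DataFrame, byNumberFirst: Boolean): DataFrame =
  df.orderBy(sortColumns(byNumberFirst): _*)
```

With the df from the quick test, sorted(df, byNumberFirst = false) should give the same ordering as above. If the column names are held as plain strings instead, df.orderBy(names.head, names.tail: _*) matches the (String, String*) overload, assuming names is non-empty.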