I need to specify a sequence of columns. If I pass two strings, it works fine:
val cols = array("predicted1", "predicted2")
but if I pass a sequence or an array, I get an error:
val cols = array(Seq("predicted1", "predicted2"))
Could you please help me? Many thanks!
You have at least two options here:
Using a Seq[String]:
val columns: Seq[String] = Seq("predicted1", "predicted2")
array(columns.head, columns.tail: _*)
Using a Seq[ColumnName]:
val columns: Seq[ColumnName] = Seq($"predicted1", $"predicted2")
array(columns: _*)
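Note that the head/tail form fails on an empty sequence, because columns.head throws. A minimal sketch of a helper that avoids this, assuming Spark SQL is on the classpath: it maps each name through functions.col and expands the result into the Column* overload. The names ArrayFromNames and arrayOf are hypothetical, chosen only for illustration.

```scala
import org.apache.spark.sql.Column
import org.apache.spark.sql.functions.{array, col}

object ArrayFromNames {
  // Hypothetical helper: turn each column name into a Column with `col`,
  // then expand the Seq[Column] into array(cols: Column*) with `: _*`.
  // Unlike columns.head / columns.tail, this does not throw on an empty Seq.
  def arrayOf(names: Seq[String]): Column =
    array(names.map(col): _*)
}
```

With a DataFrame df (assumed to contain these columns), this could be used as df.select(ArrayFromNames.arrayOf(Seq("predicted1", "predicted2"))).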
The function signature is def array(colName: String, colNames: String*): Column,
which means that it takes one string followed by zero or more strings. If you want to use a sequence, do it like this:
array("predicted1", Seq("predicted2"):_*)
From what I can see in the code, there are a couple of overloaded versions of this function, but none of them takes a Seq
directly. So expanding the sequence into varargs as described should be the way to go.
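To see why the varargs expansion is needed, here is a small sketch in plain Scala that does not use Spark at all. arrayLike is a hypothetical stand-in that only mimics the parameter shape of array(colName: String, colNames: String*) and joins the names into a string.

```scala
object VarargsSketch {
  // Hypothetical stand-in with the same parameter shape as Spark's
  // array(colName: String, colNames: String*); it just joins the names.
  def arrayLike(colName: String, colNames: String*): String =
    (colName +: colNames).mkString("array(", ", ", ")")

  def main(args: Array[String]): Unit = {
    val names = Seq("predicted1", "predicted2")

    // arrayLike(names) would not compile: a Seq[String] is not a String.

    // The first name fills colName; the rest is expanded into colNames with `: _*`.
    println(arrayLike(names.head, names.tail: _*)) // array(predicted1, predicted2)

    // The varargs part may also be empty.
    println(arrayLike("predicted1")) // array(predicted1)
  }
}
```

The `: _*` ascription tells the compiler to pass the sequence as the repeated parameter itself rather than as a single argument.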
You can use Spark's array overload def array(cols: Column*): Column even when the cols value is defined without the $ column-name notation, i.e. when you specifically want a Seq[ColumnName] but need to build it from strings. Here is how to do that:
import org.apache.spark.sql.ColumnName
import sqlContext.implicits._
import org.apache.spark.sql.functions._
val some_states: Seq[String] = Seq("state_AK","state_AL","state_AR","state_AZ")
val some_state_cols: Seq[ColumnName] = some_states.map(s => symbolToColumn(scala.Symbol(s)))
val some_array = array(some_state_cols: _*)
...using Spark's symbolToColumn method, or with the ColumnName(s) constructor directly:
val some_state_cols: Seq[ColumnName] = some_states.map(s => new ColumnName(s))