I have a way to select a subset of columns from a DataFrame. This works:
val subset_cols = {joinCols :+ col}
val df1_subset = df1.select(subset_cols.head, subset_cols.tail: _*)
This doesn't work. The code compiles, but I get a runtime error:
val subset_cols = {joinCols :+ col}
val df1_subset = df1.select(subset_cols.deep.mkString(","))
Error:
Exception in thread "main" org.apache.spark.sql.AnalysisException:
cannot resolve '`first_name,last_name,rank_dr`' given input columns:
[model, first_name, service_date, rank_dr, id, purchase_date,
dealer_id, purchase_price, age, loyalty_score, vin_num, last_name, color];;
'Project ['first_name,last_name,rank_dr]
I'm trying to pass subset_cols to the .select method, but it seems I'm missing some kind of formatting.
What you're doing is:

df1.select("first_name,last_name,rank_dr")

so Spark tries to find a single column named "first_name,last_name,rank_dr", which does not exist.
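To see why, here is a minimal plain-Scala sketch (the column names are made up for illustration) showing that mkString collapses the whole sequence into one string, which select then treats as a single column name:

```scala
// Hypothetical column names, mirroring the error message above.
val joinCols = Array("first_name", "last_name")
val col = "rank_dr"
val subset_cols = joinCols :+ col

// mkString produces ONE string, not three separate arguments --
// Spark then looks for a single column with this literal name.
val collapsed = subset_cols.mkString(",")
println(collapsed) // first_name,last_name,rank_dr
```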
Instead, try:
val df1_subset = df1.selectExpr(subset_cols: _*)
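As a sketch of both working call shapes (the DataFrame contents here are invented for illustration), either pass the names as varargs to selectExpr, or use the head/tail form from the question:

```scala
import org.apache.spark.sql.SparkSession

// Minimal local session for illustration.
val spark = SparkSession.builder().master("local[*]").appName("subset").getOrCreate()
import spark.implicits._

val df1 = Seq(("Ada", "Lovelace", 1)).toDF("first_name", "last_name", "rank_dr")
val subset_cols = Array("first_name", "last_name") :+ "rank_dr"

// Option 1: selectExpr takes each name as a separate expression string.
val a = df1.selectExpr(subset_cols: _*)

// Option 2: the (String, String*) overload of select, as in the question.
val b = df1.select(subset_cols.head, subset_cols.tail: _*)

a.show()
b.show()
spark.stop()
```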