How to use a string as an expression/argument in Scala/Spark?

I am trying to add many more columns to a DataFrame, each derived from one of its existing columns. However, Spark DataFrames are immutable, which makes it awkward to do this iteratively. So I wrote a for loop that builds a string containing the entire chain of calls I want to apply to the DataFrame (see the sample code below).

val train_df = sqlContext.sql("select * from someTable")

/* The for loop produces a string like the Str variable below */
var Str = ".withColumn(\"newCol1\",$\"col1\").withColumn(\"newCol2\",$\"col2\").withColumn(\"newCol3\",$\"col3\")"

/* Below is what I am trying to do */
val train_df_new = train_df.Str

So, how can I store the expression/argument in a string and reuse it in Scala/Spark to add all of those new columns to a new DataFrame at once?

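A string cannot be applied as a chain of method calls (train_df.Str looks up a member named Str on the DataFrame), so use foldLeft instead. Here, a Map from the old column names to the new ones is used: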

val m = Map(("col1", "newCol1"), ("col2", "newCol2"), ("col3", "newCol3"))
val train_df_new = m.keys.foldLeft(train_df)((df, c) => df.withColumnRenamed(c, m(c)))

Any other DataFrame transformation that returns a new DataFrame, such as withColumn, can be used in the fold in place of withColumnRenamed.
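
Since the question asks for new columns rather than renamed ones, here is a minimal sketch of the same fold using withColumn. The object name AddColumnsSketch, the helpers addCopies and addCopiesWithSelect, the local SparkSession, and the sample rows are all assumptions made for illustration; they are not part of the original question or answer.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.col

object AddColumnsSketch {
  // Hypothetical helper: for each (source -> target) pair, add a column
  // named `target` holding the values of `source`. Each step returns a
  // new DataFrame, which foldLeft threads through as the accumulator.
  def addCopies(df: DataFrame, mapping: Seq[(String, String)]): DataFrame =
    mapping.foldLeft(df) { case (acc, (src, dst)) =>
      acc.withColumn(dst, col(src))
    }

  // Alternative sketch: build all the columns first and add them in a
  // single select, which keeps the query plan small for long mappings.
  def addCopiesWithSelect(df: DataFrame, mapping: Seq[(String, String)]): DataFrame = {
    val existing = df.columns.map(col)
    val added = mapping.map { case (src, dst) => col(src).as(dst) }
    df.select(existing ++ added: _*)
  }

  def main(args: Array[String]): Unit = {
    // Assumed local session; in spark-shell a session already exists.
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("fold-sketch")
      .getOrCreate()
    import spark.implicits._

    // Stand-in for train_df from the question.
    val trainDf = Seq((1, 2, 3), (4, 5, 6)).toDF("col1", "col2", "col3")
    val mapping = Seq("col1" -> "newCol1", "col2" -> "newCol2", "col3" -> "newCol3")

    val trainDfNew = addCopies(trainDf, mapping)
    trainDfNew.printSchema()

    spark.stop()
  }
}
```

The for loop that built the string in the question can instead build the mapping sequence, which is then passed to the fold. The Spark documentation for withColumn notes that calling it many times in a loop can produce large plans, so the single-select variant may be preferable when the number of columns is large.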

