
Global condition on multiple withColumn + when instruction on Spark dataframe

Consider this df

+----+------+
|cond|chaine|
+----+------+
|   0|   TF1|
|   1|   TF1|
|   1|   TNT|
+----+------+

I would like to apply these withColumn calls, but only to rows where cond == 1:

df.withColumn("New", when($"chaine" === "TF1", "YES!"))
  .withColumn("New2", when($"chaine" === "TF1", "YES2!"))
  .withColumn("New3", when($"chaine" === "TF1", "YES3!"))
  .withColumn("New4", when($"chaine" === "TF1", "YES4!"))

I can't use .filter because I still want the rows with cond =!= 1 in the output.

I can do it by adding my condition inside every when in the code:

df.withColumn("New", when($"chaine" === "TF1" && $"cond" === 1, "YES!"))
  .withColumn("New2", when($"chaine" === "TF1" && $"cond" === 1, "YES2!"))
  .withColumn("New3", when($"chaine" === "TF1" && $"cond" === 1, "YES3!"))
  .withColumn("New4", when($"chaine" === "TF1" && $"cond" === 1, "YES4!"))

But I have a lot of new columns, and I would like a better solution (something like a global condition?).

Thank you.

Some simple syntactic ideas:

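import org.apache.spark.sql.Column
import org.apache.spark.sql.functions.{col, when}

// Combine any condition with the guard cond === n
def whenCondIs(n: Int)(condition: Column, value: Any): Column =
  when(condition && col("cond") === n, value)

// Shorthand for the guard cond === 1
def whenOne(condition: Column, value: Any): Column =
  whenCondIs(1)(condition, value)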

and then:

df.withColumn("New", whenOne($"chaine" === "TF1", "YES!"))
  .withColumn("New2", whenOne($"chaine" === "TF1", "YES2!"))
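
For reference, here is a minimal end-to-end sketch of the helper idea above, run against the sample data from the question. The object name WhenOneDemo, the local SparkSession settings, and the app name are assumptions made only to keep the example self-contained. They are not part of the original answer.

```scala
import org.apache.spark.sql.{Column, SparkSession}
import org.apache.spark.sql.functions.{col, when}

// Hypothetical wrapper object, used only so the sketch compiles on its own.
object WhenOneDemo {
  // AND the caller's condition with the guard cond === n.
  def whenCondIs(n: Int)(condition: Column, value: Any): Column =
    when(condition && col("cond") === n, value)

  // Shorthand for the guard cond === 1.
  def whenOne(condition: Column, value: Any): Column =
    whenCondIs(1)(condition, value)

  def main(args: Array[String]): Unit = {
    // Assumed local session; in a real job the session usually already exists.
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("when-one-demo")
      .getOrCreate()
    import spark.implicits._

    // Sample data from the question.
    val df = Seq((0, "TF1"), (1, "TF1"), (1, "TNT")).toDF("cond", "chaine")

    val result = df
      .withColumn("New", whenOne($"chaine" === "TF1", "YES!"))
      .withColumn("New2", whenOne($"chaine" === "TF1", "YES2!"))

    result.show()
    spark.stop()
  }
}
```

Because when is used without otherwise, every row that fails either the guard or the condition gets null in the new columns, so only the row with cond = 1 and chaine = TF1 is filled in.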

You can keep the mapping between conditions and the new columns in a list, and use foldLeft to add them to your DataFrame. Something like this:

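import org.apache.spark.sql.functions.{col, expr, lit, when}

// (new column name, SQL condition, value to set when the condition holds)
val newCols = Seq(
  ("New", "chaine = 'TF1'", "YES!"),
  ("New2", "chaine = 'TF1'", "YES2!"),
  ("New3", "chaine = 'TF1'", "YES3!"),
  ("New4", "chaine = 'TF1'", "YES4!")
)

// Add each column in turn; the global guard cond === 1 is written only once
val df1 = newCols.foldLeft(df) { case (acc, (name, condition, value)) =>
  acc.withColumn(name, when(expr(condition) && col("cond") === 1, lit(value)))
}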
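
A possible variation on the list-driven approach, offered only as a sketch: keep the conditions as typed Column values instead of SQL strings, and add all the new columns in a single select. The Spark API documentation for withColumn notes that calling it many times in a loop can produce large query plans and recommends a single select when adding many columns. The names GlobalConditionSketch and withGuardedColumns are hypothetical, as is the local SparkSession setup.

```scala
import org.apache.spark.sql.{Column, DataFrame, SparkSession}
import org.apache.spark.sql.functions.{col, lit, when}

// Hypothetical wrapper object, used only so the sketch compiles on its own.
object GlobalConditionSketch {
  // Hypothetical helper: for each (name, condition, value) spec, build
  // when(guard && condition, value) and append all of them in one select.
  def withGuardedColumns(
      df: DataFrame,
      guard: Column,
      specs: Seq[(String, Column, Any)]): DataFrame = {
    val newCols: Seq[Column] = specs.map { case (name, condition, value) =>
      when(guard && condition, lit(value)).as(name)
    }
    val allCols: Seq[Column] = col("*") +: newCols
    df.select(allCols: _*)
  }

  def main(args: Array[String]): Unit = {
    // Assumed local session; in a real job the session usually already exists.
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("global-condition-sketch")
      .getOrCreate()
    import spark.implicits._

    // Sample data from the question.
    val df = Seq((0, "TF1"), (1, "TF1"), (1, "TNT")).toDF("cond", "chaine")

    val isTf1 = $"chaine" === "TF1"
    val out = withGuardedColumns(
      df,
      guard = $"cond" === 1, // the global condition, stated once
      specs = Seq(
        ("New", isTf1, "YES!"),
        ("New2", isTf1, "YES2!"),
        ("New3", isTf1, "YES3!"),
        ("New4", isTf1, "YES4!")
      )
    )

    out.show()
    spark.stop()
  }
}
```

As with the foldLeft version, rows that do not satisfy the guard are kept in the output with null in every new column.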
