Spark Scala case when with multiple conditions
I'm trying to do a case on a DF I have, but I'm getting an error. I want to implement this with the built-in Spark functions withColumn, when, and otherwise:
CASE WHEN vehicle="BMW"
AND MODEL IN ("2020","2019","2018","2017")
AND value> 100000 THEN 1
ELSE 0 END AS NEW_COLUMN
Currently I have this:
DF.withColumn(NEW_COLUMN, when(col(vehicle) === "BMW"
and col(model) isin(listOfYears:_*)
and col(value) > 100000, 1).otherwise(0))
But I'm getting an error due to data type mismatch (boolean and string)... I understand my condition returns booleans and strings, which is causing the error. What's the correct syntax for executing a case like that one? Also, I was using && instead of and, but the third && was giving me a "cannot resolve symbol &&".
Thanks for the help!
I think && is correct - with the built-in Spark functions, all of the expressions are of type Column, and checking the API it looks like && is correct and should work fine. Could it be as simple as an order-of-operations issue, where you need parentheses around each of the boolean conditions? The function / "operator" isin would have a lower precedence than &&, which might trip things up.
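A minimal sketch of that fix, assuming a DataFrame `df` with string columns `vehicle` and `model` and a numeric `value` column (names taken from the question's SQL; the `listOfYears` value is assumed to exist as in the question):

```scala
import org.apache.spark.sql.functions.{col, when}

val listOfYears = Seq("2020", "2019", "2018", "2017")

// Parenthesize each boolean condition so isin and > bind before &&.
// `and` on Column would also work, but && mirrors the SQL more closely.
val result = df.withColumn(
  "NEW_COLUMN",
  when(
    (col("vehicle") === "BMW") &&
    (col("model").isin(listOfYears: _*)) &&
    (col("value") > 100000),
    1
  ).otherwise(0)
)
```

This requires a running SparkSession; without the parentheses, Scala's precedence rules can cause the method-style `isin` call to be parsed against the wrong operand, which matches the "cannot resolve symbol &&" symptom described above.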