
How do I set value to 0 of column with multiple statements in Pyspark and Palantir Foundry

I'm trying to make a statement that basically says: if EDW_ABC.edw_xdpt_act_arrv_lb is NULL then use EDW_ABC.edw_putt_act_arrv_lb, and if both are null then set the value to 0. How do I do that? I'm trying the below and I know it's not correct.

EDW_ABC = EDW_ABC.withColumn('act_arrv_abc_lbs', F.when(
(EDW_ABC.edw_xdpt_act_arrv_lb.isNull() == True) & (EDW_ABC.edw_putt_act_arrv_lb.isNull() == True). F.lit(0)\
                                         .otherwise(EDW_ABC.edw_xdpt_act_arrv_lb.isNull()), EDW_ABC.edw_putt_act_arrv_lb)

You don't need to specify a condition in the otherwise, so:

EDW_ABC = EDW_ABC.withColumn(
    'act_arrv_abc_lbs',
    F.when(
        EDW_ABC.edw_xdpt_act_arrv_lb.isNull() & EDW_ABC.edw_putt_act_arrv_lb.isNull(), F.lit(0)
    ).otherwise(
        EDW_ABC.edw_putt_act_arrv_lb
    )
)

when and otherwise operate like if and else, so if the condition in the when isn't satisfied, the otherwise automatically covers the opposite case.
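A minimal runnable sketch of that if/else behavior (the toy DataFrame, the column name x, and the local Spark session here are made up for illustration, not part of the original question):

from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.getOrCreate()

# Hypothetical single-column DataFrame with one null row
df = spark.createDataFrame([(1,), (None,)], ["x"])

# when(cond, a).otherwise(b) behaves like: a if cond else b
df.withColumn("flag", F.when(F.col("x").isNull(), F.lit(0)).otherwise(F.col("x"))).show()
# flag is 1 for the row where x = 1, and 0 for the row where x is null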

You will want to use the pyspark.sql.functions.coalesce() function, which returns the value of the first column in the list that is not NULL.

EDW_ABC = EDW_ABC.withColumn('act_arrv_abc_lbs', F.coalesce(F.col("edw_xdpt_act_arrv_lb"), F.col("edw_putt_act_arrv_lb"), F.lit(0)))
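Applied to a few made-up rows (this toy DataFrame and the local Spark session are assumptions for illustration), the fallback order is xdpt first, then putt, then 0:

from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.getOrCreate()

# Toy rows covering the three cases: xdpt present, only putt present, both null
EDW_ABC = spark.createDataFrame(
    [(10.0, 5.0), (None, 5.0), (None, None)],
    ["edw_xdpt_act_arrv_lb", "edw_putt_act_arrv_lb"],
)

EDW_ABC = EDW_ABC.withColumn(
    'act_arrv_abc_lbs',
    F.coalesce(F.col("edw_xdpt_act_arrv_lb"), F.col("edw_putt_act_arrv_lb"), F.lit(0)),
)

EDW_ABC.show()
# act_arrv_abc_lbs is 10.0, 5.0 and 0.0 for the three rows respectively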


