Filter expected value from list in df column

I have a data frame with the following column:

raw_col
['a','b','c']
['b']
['a','b']
['c']

I want to return a column with a single value per row, based on a conditional statement. I wrote the following function:

def filter_func(elements):
  if "a" in elements:
    return "a"
  else:
    return "Other"

When I run the function on the column with df.withColumn("col", filter_func("raw_col")), I get the following error: col should be Column

What's wrong here? What should I do?

You can use the array_contains function:

import pyspark.sql.functions as f

df = df.withColumn("col", f.when(f.array_contains("raw_col", f.lit("a")), f.lit("a")).otherwise(f.lit("Other")))

But if you have more complex logic and really need to use filter_func, you need to turn it into a UDF:

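import pyspark.sql.functions as f

# With no arguments, f.udf() registers the function with a string return type.
@f.udf()
def filter_func(elements):
    # elements is the Python list held in raw_col for one row
    if "a" in elements:
        return "a"
    else:
        return "Other"

# The decorated function now returns a Column, which withColumn accepts.
df = df.withColumn("col", filter_func("raw_col"))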
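
As for why the original call failed: without the decorator, filter_func is an ordinary Python function, so filter_func("raw_col") runs immediately on the string "raw_col" instead of on the column's values, and withColumn receives a plain str rather than a Column. The sketch below reproduces that in plain Python, with no Spark session needed. The names direct_result, rows and per_row are made up for illustration, and the list comprehension only imitates the row-by-row calls a UDF makes; it is not how Spark actually executes it.

```python
def filter_func(elements):
    # Same logic as in the question, without the UDF decorator.
    if "a" in elements:
        return "a"
    else:
        return "Other"

# Called directly, the function receives the column *name* as a plain string.
# "raw_col" happens to contain the letter "a", so the substring test passes.
direct_result = filter_func("raw_col")
print(type(direct_result).__name__, direct_result)  # str a

# A UDF instead calls the function once per row, passing that row's list.
# (rows mirrors the sample data from the question.)
rows = [["a", "b", "c"], ["b"], ["a", "b"], ["c"]]
per_row = [filter_func(r) for r in rows]
print(per_row)  # ['a', 'Other', 'a', 'Other']
```

So the direct call hands withColumn the string "a", which triggers the col should be Column error, while the per-row results are what the new column is expected to contain.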
