Match elements of a list against a column that contains lists; return the entire row if any single element is found

Question

I have a column that holds lists. If any single element of a row's list matches an element of my own list, I want to return the entire row. For example, given this data frame:

index             x
0                [apple, orange, strawberry]
1                [blueberry, pear, watermelon]
2                [apple, banana, strawberry]
3                [apple]
4                [strawberry]

And we have our list,
a = [apple, strawberry]
# I am trying to return index 0,2,3 and 4. But currently I am only able to return index 3 and 4
new_DF = df[df['x'].isin(a)]

# This function is getting the user input for list 'a'. 
# This is for reference of what I am actually trying to do. 

def filter_Industries():
    num_of_industries = int(input('How many industries would you like to filter by?\n'))
    list_industries = []  
    i = 0
    for i in range(num_of_industries):
        industry = input("Enter the industry:\n")
        i += 1
        list_industries.append(industry)

    return list_industries

a = filter_Industries()
# This is where I am trying to match the elements from the user's list to the data set.
new_DF = df[df['x'].isin(a)]

Answer 1

You can use the apply(function) method on the column. In this case we check, for every row, whether its list has at least one element in common with the list "a". Let's create the function:

a = ["apple", "strawberry"]
a_set = set(a)
def hasCommon(x):
    return len(set(x) & a_set) > 0

If the row shares at least one element with "a", the function returns True. Let's create some dummy data:

import pandas as pd
data = {
  "calories": [["apple", "orange", "strawberry"], ["blueberry", "pear", "watermelon"], ["strawberry", "pear", "watermelon"]],
  "duration": [50, 40,120]
}

#load data into a DataFrame object:
df = pd.DataFrame(data)

print(df)

Then we can use it like this:

df[df["calories"].apply(hasCommon)]
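For reference, here is the same approach applied to the data frame from the question. This is a minimal sketch: it assumes the column x really holds Python lists of strings, and has_common is just an illustrative name for a variant of the function above that uses set.isdisjoint instead of building an intersection.

```python
import pandas as pd

# Data frame from the question (assumption: each cell is a list of strings)
df = pd.DataFrame({
    "x": [
        ["apple", "orange", "strawberry"],
        ["blueberry", "pear", "watermelon"],
        ["apple", "banana", "strawberry"],
        ["apple"],
        ["strawberry"],
    ]
})

a = ["apple", "strawberry"]
a_set = set(a)  # build the set once, not once per row

def has_common(values):
    """Return True if the row's list shares at least one element with a_set."""
    return not a_set.isdisjoint(values)

# Boolean mask from apply, then ordinary boolean indexing keeps whole rows
new_df = df[df["x"].apply(has_common)]
print(new_df.index.tolist())  # [0, 2, 3, 4]
```

isdisjoint stops at the first shared element, so it avoids materialising the full intersection for long lists.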

Answer 2

When you use isin(a) on the values at index 0, 1 and 2, the function tries to compare a whole list (e.g. [apple, orange, strawberry]) with the list a. It worked for the values at index 3 and 4 because there it compares a single element with the whole list.

I suggest intersecting the list a with each row's list after converting both to sets, with this code:

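matching = []  # positions of rows sharing at least one element with a
for i in range(len(df)):
    if set(a) & set(df['x'].iloc[i]):  # a non-empty intersection is truthy
        matching.append(i)
new_DF = df.iloc[matching]  # keep the entire matching rows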

Only the rows whose intersection with a is not an empty set end up in new_DF.
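The same "any element in common" test can also be written without an explicit Python loop. The sketch below is an alternative, not part of the answer above: it assumes pandas 0.25 or later (for Series.explode), a unique row index, and the question's data with list-valued cells.

```python
import pandas as pd

# Data frame from the question (assumption: each cell is a list of strings)
df = pd.DataFrame({
    "x": [
        ["apple", "orange", "strawberry"],
        ["blueberry", "pear", "watermelon"],
        ["apple", "banana", "strawberry"],
        ["apple"],
        ["strawberry"],
    ]
})
a = ["apple", "strawberry"]

# explode() gives one row per list element and repeats the original index,
# isin() then tests scalars, and groupby(level=0).any() collapses the result
# back to one boolean per original row.
mask = df["x"].explode().isin(a).groupby(level=0).any()
new_DF = df[mask]
print(new_DF.index.tolist())  # [0, 2, 3, 4]
```

Because the mask is grouped by index label, this relies on the index being unique; call reset_index(drop=True) first if it is not.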
