Find elements of a list in a dataframe

Question

I have a dataframe "df1":

adj           response

beautiful    ["She's a beautiful girl/woman, and also a good teacher."]
good         ["She's a beautiful girl/woman, and also a good teacher."]
hideous      ["This city is hideous, let's move to the countryside."]

And here's the object list:

object=["girl","teacher","city","countryside","woman"]

Code:

df1['response_split']=df1['response'].str.split(",")

After I split it, the dataframe looks like this:

adj           response_split

beautiful    ["She's a beautiful girl/woman", " and also a good teacher."]
good         ["She's a beautiful girl/woman", " and also a good teacher."]
hideous      ["This city is hideous", " let's move to the countryside."]

I want to add another column "response_object": if the adj is found in a piece of the response, find its object from the list object. Expected result:

adj           response_split                                               response_object

beautiful    ["She's a beautiful girl/woman", " and also a good teacher."]        girl
beautiful    ["She's a beautiful girl/woman", " and also a good teacher."]        woman
good         ["She's a beautiful girl/woman", " and also a good teacher."]        teacher
hideous      ["This city is hideous", " let's move to the countryside."]          city

Code:

for i in df1['response_split']:
    if df1['adj'] in i:
        if any(x in i and x in object):
            match = list(filter(lambda x: x in i, object))
            df1['response_object']=match

It raises a ValueError.

Answer 1

First, object is the name of a Python builtin, so it is better not to use it as a variable name; here it is renamed to L:

L=["girl","teacher","city","countryside","woman"]
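
As a small illustration of why the rename helps, the sketch below shadows object inside a function (the function name shadowing_demo is made up for this example) and shows that code expecting the builtin type then fails:

```python
def shadowing_demo():
    # Inside this function the local name `object` hides the builtin type.
    object = ["girl", "teacher", "city", "countryside", "woman"]
    try:
        # isinstance expects a type as its second argument, but gets a list.
        isinstance("girl", object)
    except TypeError as exc:
        return type(exc).__name__
    return "no error"

result = shadowing_demo()
print(result)
```

Nothing in the question depends on the builtin, so the original name would not break that particular script; the rename just avoids surprises later.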

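Then zip the split column with adj, loop over the resulting tuples, loop over each piece of the split response and over the values in L, and keep a match when both the object and the adjective occur in the same piece (each tested with in, combined with and):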

df1['response_split']=df1['response'].str.split(",")
L1 = [(a, b, o) for a, b in zip(df1['adj'], df1['response_split']) 
                for r in b 
                for o in L 
                if (o in r) and (a in r)]

This can be rewritten as explicit loops:

df1['response_split']=df1['response'].str.split(",")

L1 = []
for a, b in zip(df1['adj'], df1['response_split']):
    for r in b:
        for o in L:
            if (o in r) and (a in r):
                L1.append((a, b, o))
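
The loops above pair each adjective with its own row through zip, which is most likely what avoids the ValueError from the question: there, the whole df1['adj'] column was placed on the left of in. A minimal sketch of the difference, using a stand-in Series and one split response (both written out here as assumptions rather than taken from a real df1):

```python
import pandas as pd

adj = pd.Series(["beautiful", "good", "hideous"])  # stands in for df1['adj']
pieces = ["She's a beautiful girl/woman", " and also a good teacher."]

# Whole column on the left of `in`: the list compares a piece with the
# Series, gets a boolean Series back, and pandas refuses to reduce that
# to a single True/False.
try:
    adj in pieces
    outcome = "no error"
except ValueError:
    outcome = "ValueError"

# One scalar adjective, as zip hands out per row, is a plain substring test.
scalar_ok = any("beautiful" in piece for piece in pieces)

print(outcome, scalar_ok)
```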

Finally, pass the list to the DataFrame constructor:

df2 = pd.DataFrame(L1, columns=['adj','response_split','response_object'])
print (df2)
         adj                                     response_split  \
0  beautiful  [She's a beautiful girl/woman,  and also a goo...   
1  beautiful  [She's a beautiful girl/woman,  and also a goo...   
2       good  [She's a beautiful girl/woman,  and also a goo...   
3    hideous  [This city is hideous,  let's move to the coun...   

  response_object  
0            girl  
1           woman  
2         teacher  
3            city
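
For convenience, the steps above can be put together into one runnable sketch. It assumes that the response column holds plain strings (the question displays them in brackets, so this is a guess), and the name rows is used here instead of L1:

```python
import pandas as pd

# Assumption: 'response' holds plain strings, not lists.
df1 = pd.DataFrame({
    "adj": ["beautiful", "good", "hideous"],
    "response": [
        "She's a beautiful girl/woman, and also a good teacher.",
        "She's a beautiful girl/woman, and also a good teacher.",
        "This city is hideous, let's move to the countryside.",
    ],
})
L = ["girl", "teacher", "city", "countryside", "woman"]

# Split each response into comma-separated pieces.
df1["response_split"] = df1["response"].str.split(",")

# Keep (adj, pieces, object) whenever one piece contains both the
# adjective and the object as substrings.
rows = [
    (a, b, o)
    for a, b in zip(df1["adj"], df1["response_split"])
    for r in b
    for o in L
    if (o in r) and (a in r)
]

df2 = pd.DataFrame(rows, columns=["adj", "response_split", "response_object"])
print(df2[["adj", "response_object"]])
```

Note that in tests substrings, not whole words, so an entry such as "city" would also match inside a longer word like "capacity".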

Solution 1
3 Accepted 2019-07-11 05:29:59
