
How can I check in Pandas if an item from a list is in another list?

I have two different pandas DataFrames:

df_1 with columns id (int), name (string), description (string)

and df_2 with columns id (int), name (string), description (string).

The names in df_1 and df_2 are similar but not identical, and I would like to link the two DataFrames using the id from df_1.

I have created a new column in both DataFrames called splitted_name, which holds the list of words from the name column.

Now I would like to check whether at least one element of df_1.splitted_name is in df_2.splitted_name. How can I do this in Pandas?

Sample data:

df_1

    name                       name_split
1   Alone in the jungle       ['alone','in','the','jungle']
2   Born by the sea           ['born','by','the','sea']

df_2

    name                             name_split
1   Goodbye my love                  ['goodbye','my','love']
2   Alone in the jungle remastered   ['alone','in','the','jungle','remastered']

You should first combine them into a single DataFrame and then try this. I made my own example with these datasets:

import pandas as pd

df1 = pd.DataFrame(data=[['John Black'], ['Sara Smith'], ['Jane Jane']], columns=['name'])
df2 = pd.DataFrame(data=[['John Smith'], ['Sara Midname Smith'], ['Emma Sunshine']], columns=['name'])
# Split each name into a list of words
df1['splitted_name'] = df1.name.str.split(' ')
df2['splitted_name'] = df2.name.str.split(' ')

Create a DataFrame with all possible combinations:

# Pair every row of df1 with every row of df2
rows = []
for i in df1.values:
    for j in df2.values:
        rows.append(i.tolist() + j.tolist())
df = pd.DataFrame(rows)
df.columns = ['name1', 'splitted_name1', 'name2', 'splitted_name2']
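
As a possible alternative to the nested loop, the same all-pairs frame can be sketched with a cross merge. This assumes pandas 1.2 or later, where merge accepts how='cross'. The name pairs is only an illustrative choice, and the suffixes are picked so the resulting columns match the ones used in this answer.

```python
import pandas as pd

# Same toy data as in the example above.
df1 = pd.DataFrame({'name': ['John Black', 'Sara Smith', 'Jane Jane']})
df2 = pd.DataFrame({'name': ['John Smith', 'Sara Midname Smith', 'Emma Sunshine']})
df1['splitted_name'] = df1['name'].str.split(' ')
df2['splitted_name'] = df2['name'].str.split(' ')

# how='cross' pairs every row of df1 with every row of df2.
# The suffixes rename the clashing columns to name1, splitted_name1,
# name2 and splitted_name2.
pairs = df1.merge(df2, how='cross', suffixes=('1', '2'))
print(pairs[['name1', 'name2']])
```

The row order should be the same as in the loop version: each df1 row is paired with every df2 row in turn.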

Finally, compare the split names:

# True when any word of splitted_name2 also appears in splitted_name1
result = df.apply(lambda x: (pd.Index(pd.unique(x.splitted_name1)).get_indexer(x.splitted_name2) >= 0).any(), axis=1)
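
The same "at least one shared word" test can also be written with plain Python sets, which avoids building a pandas Index for every row and does not depend on pd.unique accepting a plain list. This is only a sketch: shares_word, pairs and overlap are hypothetical names, and the small frame below stands in for the combined frame built earlier.

```python
import pandas as pd

def shares_word(words_a, words_b):
    """Return True if the two word lists have at least one word in common."""
    return not set(words_a).isdisjoint(words_b)

# Hypothetical miniature of the combined frame, one pair of word lists per row.
pairs = pd.DataFrame({
    'splitted_name1': [['John', 'Black'], ['Sara', 'Smith'], ['Jane', 'Jane']],
    'splitted_name2': [['John', 'Smith'], ['Sara', 'Midname', 'Smith'], ['Emma', 'Sunshine']],
})

# Evaluate the check row by row and keep the original index.
overlap = pd.Series(
    [shares_word(a, b) for a, b in zip(pairs['splitted_name1'], pairs['splitted_name2'])],
    index=pairs.index,
    name='result',
)
print(overlap.tolist())
```

Note that the comparison is case-sensitive, just like the get_indexer version.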

Output:

0     True
1    False
2    False
3     True
4     True
5    False
6    False
7    False
8    False
Name: result, dtype: bool

You can also use it as a new column in the DataFrame:

df['result'] = result

Then filter the rows you need:

df = df[df.result]
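
To tie this back to the question, here is a rough sketch of the same idea applied to the sample rows from the question, keeping the ids so that df_2 rows can be linked to df_1 ids. The id values, the column suffixes and the n_shared column are assumptions made for illustration (the question does not show the ids), and it again assumes pandas 1.2+ for how='cross'.

```python
import pandas as pd

# Sample rows from the question; the id values are made up for illustration.
df_1 = pd.DataFrame({'id': [1, 2],
                     'name': ['Alone in the jungle', 'Born by the sea']})
df_2 = pd.DataFrame({'id': [1, 2],
                     'name': ['Goodbye my love', 'Alone in the jungle remastered']})

# Lower-case and split each name into words.
for frame in (df_1, df_2):
    frame['name_split'] = frame['name'].str.lower().str.split()

# All df_1 x df_2 pairs; clashing columns become id_1, name_1, ..., id_2, ...
pairs = df_1.merge(df_2, how='cross', suffixes=('_1', '_2'))

# Count the distinct words each pair has in common.
pairs['n_shared'] = [
    len(set(a) & set(b))
    for a, b in zip(pairs['name_split_1'], pairs['name_split_2'])
]

# Keep only the pairs that share at least one word.
matches = pairs.loc[pairs['n_shared'] > 0,
                    ['id_1', 'id_2', 'name_1', 'name_2', 'n_shared']]
print(matches)
```

With this data, "Born by the sea" also matches "Alone in the jungle remastered" because both contain "the", so in practice you may want to require a higher n_shared or ignore very common words.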

