String matching between 2 dataframes

Question

I am learning Python, and any help with this is much appreciated. My scenario: there are two dataframes, A and B, each containing a Name column (a list of names) and a Flag column.

ExDF = pd.DataFrame({'Name' : ['Smith','John, Alex','Peter Lin','Carl Marx','Abhraham Moray','Calvin Klein'], 'Flag':['False','False','False','False','False','False']})

SnDF = pd.DataFrame({'Name' : ['Adam K ','John Smith','Peter Lin','Carl Josh','Abhraham Moray','Tim Klein'], 'Flag':['False','False','False','False','False','False']})

The initial value of Flag is False.

Point 1: I need to flip the names in both dataframes, i.e. Adam Smith becomes Smith Adam, and save the flipped names in a new column in both dataframes. This part is done.

Point 2: Both the original and the flipped names of dataframe A should then be checked against the original and flipped names of dataframe B. If a match is found, the Flag column in both dataframes should be updated to True.

I wrote the code, but it compares the two dataframes row by row, like A[0] to B[0], A[1] to B[1]; I need to check the A[0] record against all the records of dataframe B.

Please help me with this!

The code I tried is below:

import numpy as np
import pandas as pd
from sklearn.feature_extraction.text import CountVectorizer

ExDF_swap = ExDF["Swap"] = ExDF["Name"].apply(lambda x: " ".join(reversed(x.split()))) 
SnDF_swap = SnDF["Swap"] = SnDF["Name"].apply(lambda x: " ".join(reversed(x.split()))) 
ExDF_swap =  pd.DataFrame(ExDF_swap)
SnDF_swap =  pd.DataFrame(SnDF_swap)

vect = CountVectorizer()
X = vect.fit_transform(ExDF_swap.Name)
Y = vect.transform(SnDF_swap.Name)

res = np.ravel(np.any((X.dot(Y.T) > 1).todense(), axis=1))
pd.DataFrame(X.toarray(), columns=vect.get_feature_names())
pd.DataFrame(Y.toarray(), columns=vect.get_feature_names())

ExDF["Flag"] = np.ravel(np.any((X.dot(Y.T) > 1).todense(), axis=1))
SnDF["Flag"] = np.ravel(np.any((X.dot(Y.T) > 1).todense(), axis=1))

Answer 1

You could try pandas' isin():

import pandas as pd

ExDF = pd.DataFrame({'Name' : ['Smith','John, Alex','Peter Lin','Carl Marx','Abhraham Moray','Calvin Klein'], 'Flag':['False','False','False','False','False','False']})
SnDF = pd.DataFrame({'Name' : ['Adam K ','John Smith','Peter Lin','Carl Josh','Abhraham Moray','Tim Klein'], 'Flag':['False','False','False','False','False','False']})

print(ExDF)
print(SnDF)

ExDF["Swap"] = ExDF["Name"].apply(lambda x: " ".join(reversed(x.split())))
SnDF["Swap"] = SnDF["Name"].apply(lambda x: " ".join(reversed(x.split())))

print(ExDF)
print(SnDF)

ExDF['Flag'] = ExDF.Name.isin(SnDF.Name)
SnDF['Flag'] = SnDF.Name.isin(ExDF.Name)

print(ExDF)
print(SnDF)
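
The isin() calls above compare only the Name columns. Since the question also asks for the flipped names to be checked, the following is a minimal sketch of one way to extend the same idea, not a definitive solution. The helper names swap_words and flag_matches are hypothetical, the Flag column is created directly as booleans instead of starting from the string 'False', and surrounding whitespace (as in 'Adam K ') is assumed to be irrelevant for matching.

```python
import pandas as pd

ExDF = pd.DataFrame({'Name': ['Smith', 'John, Alex', 'Peter Lin', 'Carl Marx',
                              'Abhraham Moray', 'Calvin Klein']})
SnDF = pd.DataFrame({'Name': ['Adam K ', 'John Smith', 'Peter Lin', 'Carl Josh',
                              'Abhraham Moray', 'Tim Klein']})


def swap_words(name):
    # Hypothetical helper: reverse the word order of a name.
    # split() with no arguments also drops leading/trailing whitespace.
    return " ".join(reversed(name.split()))


def flag_matches(left, right):
    # Hypothetical helper: True where a row's original or swapped name in
    # `left` equals any original or swapped name anywhere in `right`.
    candidates = pd.concat([right['Name'].str.strip(), right['Swap']])
    return left['Name'].str.strip().isin(candidates) | left['Swap'].isin(candidates)


for df in (ExDF, SnDF):
    df['Swap'] = df['Name'].apply(swap_words)

# Each row is compared against every row of the other frame, not just
# the row at the same position.
ExDF['Flag'] = flag_matches(ExDF, SnDF)
SnDF['Flag'] = flag_matches(SnDF, ExDF)

print(ExDF)
print(SnDF)
```

With the sample data, only 'Peter Lin' and 'Abhraham Moray' are flagged in both frames, because the comparison is still an exact string match. Names such as 'Smith' versus 'John Smith' or 'John, Alex' (with a comma) would need additional normalization or partial matching, which this sketch does not attempt.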

Answer 1
0 2018-07-20 11:12:39
