Getting the non-common elements from two columns in two DataFrames
import pandas as pd

data1 = {'date': ['1998', '1999', '1999', '2000', '1999'],
         'node1': [1, 1, 2, 3, 3],
         'node2': [3, 4, 3, 4, 8],
         'weight': [1, 1, 1, 1, 1]}
df1 = pd.DataFrame(data1, columns=['date', 'node1', 'node2', 'weight'])

data2 = {'date': ['2002', '2001', '2003', '2002', '2002', '2001'],
         'node1': [1, 1, 1, 2, 2, 3],
         'node2': [2, 3, 4, 3, 5, 4],
         'weight': [1, 1, 1, 1, 1, 1]}
df2 = pd.DataFrame(data2, columns=['date', 'node1', 'node2', 'weight'])
I would like to search both node columns (node1 and node2) in each dataframe and then output the rows that contain a non-common element, that is, a value that appears in only one of the two dataframes.
Output for this data would be:
dataframe1: 3 8 1999
dataframe2: 2 5 2002
Output explanation: searching across the two columns in the two dataframes, we find that 5 and 8 are the non-common elements, so the rows containing them are printed.
Edit: data corrected.
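The explanation above can be checked with a short sketch. It assumes that "non-common" means a node value present in one dataframe but absent from the other, and it uses plain Python sets rather than any particular pandas idiom. The names nodes1, nodes2 and non_common are only illustrative.

```python
import pandas as pd

df1 = pd.DataFrame({'date': ['1998', '1999', '1999', '2000', '1999'],
                    'node1': [1, 1, 2, 3, 3],
                    'node2': [3, 4, 3, 4, 8],
                    'weight': [1, 1, 1, 1, 1]})
df2 = pd.DataFrame({'date': ['2002', '2001', '2003', '2002', '2002', '2001'],
                    'node1': [1, 1, 1, 2, 2, 3],
                    'node2': [2, 3, 4, 3, 5, 4],
                    'weight': [1, 1, 1, 1, 1, 1]})

# Collect every node value that occurs in either node column of each frame.
nodes1 = set(df1['node1']) | set(df1['node2'])
nodes2 = set(df2['node1']) | set(df2['node2'])

# Symmetric difference: values present in exactly one of the two frames.
non_common = sorted(int(v) for v in nodes1 ^ nodes2)
print(non_common)  # [5, 8]
```

Here 8 occurs only in df1 and 5 occurs only in df2, which matches the expected rows listed above.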
This probably isn't the best answer, but it works:
dfs = pd.concat([df1, df2])
nodes = pd.concat([dfs['node1'], dfs['node2']])
counts = nodes.value_counts()
# Keep the node values that occur exactly once across both dataframes.
unique = []
for index, value in zip(counts.index, counts.tolist()):
    if value == 1:
        unique.append(index)
# Select the rows whose node1 or node2 holds one of those values.
unique_df1 = df1[df1['node1'].isin(unique) | df1['node2'].isin(unique)]
unique_df2 = df2[df2['node1'].isin(unique) | df2['node2'].isin(unique)]
print(unique_df1)
print(unique_df2)
Output:
   date  node1  node2  weight
4  1999      3      8       1
   date  node1  node2  weight
4  2002      2      5       1
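One caveat: the count-based approach above selects values that occur exactly once across both dataframes, so a value that appears several times in one dataframe and never in the other would be missed. A possible alternative, sketched below, compares the sets of node values directly. The helper name rows_with_exclusive_nodes and the small frames a and b are hypothetical and exist only for this illustration.

```python
import pandas as pd


def rows_with_exclusive_nodes(df, other, cols=('node1', 'node2')):
    """Return the rows of df holding a value in cols that never appears in other[cols]."""
    cols = list(cols)
    own = set(df[cols].to_numpy().ravel())
    theirs = set(other[cols].to_numpy().ravel())
    exclusive = own - theirs  # values found only in df
    # A row qualifies if any of its node columns holds an exclusive value.
    mask = df[cols].isin(list(exclusive)).any(axis=1)
    return df[mask]


# Illustrative data: 9 appears twice in a and never in b,
# so counting occurrences == 1 would not find it.
a = pd.DataFrame({'node1': [1, 9, 2], 'node2': [2, 3, 9]})
b = pd.DataFrame({'node1': [1, 2], 'node2': [3, 7]})

only_a = rows_with_exclusive_nodes(a, b)  # rows of a containing 9
only_b = rows_with_exclusive_nodes(b, a)  # row of b containing 7
print(only_a)
print(only_b)
```

Called as rows_with_exclusive_nodes(df1, df2) and rows_with_exclusive_nodes(df2, df1) on the question's data, this should select the same two rows as the output above, since 8 and 5 are each exclusive to one dataframe.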