Problem comparing 2 DataFrames, returns wrong result
There are two DataFrames, df1 and df2.
df1 contains:
account_id account_name
0 37469426 Name1
1 71508517 Name2
2 85304427 Name3
3 115964688 Name4
4 119853529 Name4
df2 contains:
account_id account_name
0 37469426 Name1
1 71508517 Name2
2 85304427 Name3
3 115964688 Name4
4 119853529 Name4
5 1111 Test
I want to compare them so that df3 contains the rows from df1 that are not in df2.
In this case it should return nothing.
The datatypes are the same and the columns are the same; only the number of rows differs.
I've tried concat and merge, but the result is wrong.
merged = pd.merge(df1, df2, on=['account_id', 'account_name'], how='right')
#returns:
account_id account_name
0 37469426 Name1
1 71508517 Name2
2 85304427 Name3
3 115964688 Name4
4 119853529 Name5
merged = pd.merge(df1, df2, on=['account_id', 'account_name'], how='left')
#returns:
0 37469426 Name1
1 71508517 Name2
2 85304427 Name3
3 115964688 Name4
4 119853529 Name4
5 1111 Test
#inner / outer return everything
0 37469426 Name1
1 71508517 Name2
2 85304427 Name3
3 115964688 Name4
4 119853529 Name4
5 1111 Test
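For reference, none of the plain join types filters rows by which side they came from, so one common way to express "rows of df1 that are not in df2" is a left anti-join. The following is a minimal sketch, assuming the two frames look like the samples above (the frames are rebuilt here by hand, and the names `only_in_df1` and `only_in_df2` are illustrative); it relies on the `indicator=True` option of `merge`, which adds a `_merge` column marking each row as `left_only`, `right_only`, or `both`.

```python
import pandas as pd

# Sample frames rebuilt from the tables in the question (assumed to match).
df1 = pd.DataFrame({
    "account_id": [37469426, 71508517, 85304427, 115964688, 119853529],
    "account_name": ["Name1", "Name2", "Name3", "Name4", "Name4"],
})
df2 = pd.DataFrame({
    "account_id": [37469426, 71508517, 85304427, 115964688, 119853529, 1111],
    "account_name": ["Name1", "Name2", "Name3", "Name4", "Name4", "Test"],
})

keys = ["account_id", "account_name"]

# Left join with an indicator column, then keep rows found only on the left.
merged = df1.merge(df2, on=keys, how="left", indicator=True)
only_in_df1 = merged.loc[merged["_merge"] == "left_only", keys]

# The same idea in the other direction: rows of df2 that are not in df1.
merged_rev = df2.merge(df1, on=keys, how="left", indicator=True)
only_in_df2 = merged_rev.loc[merged_rev["_merge"] == "left_only", keys]

print(only_in_df1.empty)  # no row of df1 is missing from df2
print(only_in_df2)        # the single extra row of df2
```

With the sample data, `only_in_df1` should be empty, which is the result the question expects for df3. Note that if df2 held duplicate key rows, the left join would repeat the matching df1 rows, so deduplicating df2 first may be advisable.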
compare_ga_accounts = pd.concat([df1, df2])
compare_ga_accounts.drop_duplicates(keep=False, inplace=True)
#returns:
account_id account_name
0 1111 Test
I have no idea why this happens.
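A likely explanation for the concat result: `drop_duplicates(keep=False)` removes every row that occurs more than once in the concatenated frame, so what survives is the symmetric difference (rows present in exactly one of the two frames), not "df1 minus df2". The sketch below reproduces that with hand-built copies of the sample frames and shows one possible variant, concatenating df2 twice, which leaves only the rows unique to df1. That variant assumes df1 itself contains no duplicate rows; the names `sym_diff` and `df1_only` are illustrative.

```python
import pandas as pd

# Sample frames rebuilt from the tables in the question (assumed to match).
df1 = pd.DataFrame({
    "account_id": [37469426, 71508517, 85304427, 115964688, 119853529],
    "account_name": ["Name1", "Name2", "Name3", "Name4", "Name4"],
})
df2 = pd.DataFrame({
    "account_id": [37469426, 71508517, 85304427, 115964688, 119853529, 1111],
    "account_name": ["Name1", "Name2", "Name3", "Name4", "Name4", "Test"],
})

# keep=False drops ALL copies of any duplicated row, so only rows that
# appear in exactly one frame remain: the symmetric difference.
sym_diff = pd.concat([df1, df2]).drop_duplicates(keep=False)

# Concatenating df2 twice guarantees every df2 row is duplicated and thus
# dropped; only rows unique to df1 can survive.
# Assumption: df1 has no internal duplicate rows (they would be dropped too).
df1_only = pd.concat([df1, df2, df2]).drop_duplicates(keep=False)

print(sym_diff)        # the 1111 / Test row, which exists only in df2
print(df1_only.empty)  # nothing in df1 is missing from df2
```

The doubled-concat trick is compact but copies df2 twice, so for large frames a merge-based anti-join may be the more economical choice.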