How to get a df with the common values of 2 columns from different dfs?
I have 2 dfs with different lengths:
df1:
       ESTACION  DZ
0        ALAMOR   1
1      EL TIGRE   1
2     SAN PEDRO   1
3     TABACONAS   1
4         BATAN   2
5         CACAO   2
6       CHOTANO   2
7        CIRATO   2
8      LLAUCANO   2
9      NARANJOS   2
10    MAGUNCHAL   2
11      PUCHACA   2
12  MAYGASBAMBA   2
df2:
           Estacion  Co  Dre
56           ALAMOR   C    1
89        LAGARTERA   C    1
90     PUENTE PIURA   C    1
211  PUENTE SULLANA   C    1
249      PALTASHACO   C    1
250    TAMBO GRANDE   C    1
342     VENTANILLAS   C    2
421           CACAO   C    2
466     DESAGUADERO   C    2
508  QUEBRADA HONDA   C    2
I want to save in another df (df3) the common values between df1['ESTACION'] and df2['Estacion'], so I tried this code:
duplicates = pd.concat([df1,df2])[pd.concat([df1,df2])
.duplicated(subset=['ESTACION','Estacion'], keep=False)]
But I'm not getting the common values. I hope you can give me an answer or some advice. Thanks!!
Edited to make the answer more specific to your situation.
You can use merge, which by default does an inner join. And if you insist on having a dataframe with strictly the common values of a single column, try this:
df3=pd.merge(df1, df2, left_on=['ESTACION'], right_on=['Estacion'])
df3.drop(columns=df3.columns.difference(['ESTACION']), inplace=True)
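As a rough, self-contained sketch of the merge approach above: the two frames below are shortened stand-ins for the question's data (only a few rows are reproduced), and `merged` is just an illustrative variable name. Selecting the column directly is one possible alternative to dropping the others in place.

```python
import pandas as pd

# Shortened stand-ins for the question's df1 and df2 (assumed sample rows).
df1 = pd.DataFrame({"ESTACION": ["ALAMOR", "EL TIGRE", "BATAN", "CACAO"],
                    "DZ": [1, 1, 2, 2]})
df2 = pd.DataFrame({"Estacion": ["ALAMOR", "LAGARTERA", "CACAO", "DESAGUADERO"],
                    "Co": ["C", "C", "C", "C"],
                    "Dre": [1, 1, 2, 2]},
                   index=[56, 89, 421, 466])

# Inner join (the default) on the two differently named key columns.
merged = pd.merge(df1, df2, left_on="ESTACION", right_on="Estacion")

# Keep only the station name column instead of dropping the rest in place.
df3 = merged[["ESTACION"]]
print(df3)
```

Note that a merge repeats a station once per matching pair of rows, so if a name appears several times in either frame, a `drop_duplicates()` on the result may be needed.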
If you want to get the common values regardless of where they appear and how many times, you can simply do this:
common_values = list(set(np.unique(df1['ESTACION'].values)).intersection(set(np.unique(df2['Estacion'].values))))
Of course, you need to have run import numpy as np beforehand.
This will give you a list of all the values which are found in both columns. Then you can assign them to a new DataFrame's column like so:
df3['common_values'] = common_values
or do whatever else you want with those values.
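A minimal sketch of the set-based idea, assuming small stand-in frames rather than the full data from the question. It uses plain Python sets instead of NumPy (a swapped-in variant, not the exact line above) and sorts the result, since set ordering is not guaranteed. The repeated "CACAO" row is an assumption added to show that duplicates collapse.

```python
import pandas as pd

# Stand-in data; "CACAO" is repeated on purpose to show duplicates collapsing.
df1 = pd.DataFrame({"ESTACION": ["ALAMOR", "CACAO", "CACAO", "BATAN"]})
df2 = pd.DataFrame({"Estacion": ["CACAO", "ALAMOR", "LAGARTERA"]})

# Values present in both columns, each listed once.
common_values = sorted(set(df1["ESTACION"]) & set(df2["Estacion"]))

# Build a fresh DataFrame so its length always matches the list.
df3 = pd.DataFrame({"common_values": common_values})
print(df3)
```

Building df3 from scratch avoids a length mismatch error, which can occur when assigning the list to a column of an existing DataFrame with a different number of rows.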
I think you want to merge the dataframes, like this:
df3 = pd.merge(df1,df2, left_on=['ESTACION'],right_on=['Estacion'])
Note that, as mentioned in the comments, the column ESTACION is not the same as Estacion, which is why left_on and right_on are given separately.
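The name mismatch also seems to explain why the concat/duplicated attempt in the question returns nothing: after concatenation the two names stay separate columns, so a df1 row looks like ('ALAMOR', NaN) and a df2 row like (NaN, 'ALAMOR'), and no pair repeats. A sketch with shortened stand-in data (the names `both`, `attempt` and `dupes` are illustrative), renaming the column first:

```python
import pandas as pd

# Shortened stand-ins for the question's frames (assumed sample rows).
df1 = pd.DataFrame({"ESTACION": ["ALAMOR", "EL TIGRE", "CACAO"], "DZ": [1, 1, 2]})
df2 = pd.DataFrame({"Estacion": ["ALAMOR", "LAGARTERA", "CACAO"],
                    "Co": ["C", "C", "C"], "Dre": [1, 1, 2]})

# The question's approach: the two name columns never line up, so nothing matches.
both = pd.concat([df1, df2])
attempt = both[both.duplicated(subset=["ESTACION", "Estacion"], keep=False)]
print(attempt.empty)

# Renaming df2's column first puts the names in a single shared column.
both = pd.concat([df1, df2.rename(columns={"Estacion": "ESTACION"})])
dupes = both[both.duplicated(subset=["ESTACION"], keep=False)]
df3 = dupes[["ESTACION"]].drop_duplicates().reset_index(drop=True)
print(df3)
```

One caveat of this variant: a name repeated inside a single frame would also be flagged as a duplicate, so the merge shown above is usually the safer choice.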