I have the three following dataframes:
df_A = pd.DataFrame( {'id_A': [1, 1, 1, 1, 2, 2, 3, 3],
'Animal_A': ['cat','dog','fish','bird','cat','fish','bird','cat' ]})
df_B = pd.DataFrame( {'id_B': [1, 2, 2, 3, 4, 4, 5],
'Animal_B': ['dog','cat','fish','dog','fish','cat','cat' ]})
df_P = pd.DataFrame( {'id_A': [1, 1, 2, 3],
'id_B': [2, 3, 4, 5]})
df_A
id_A Animal_A
0 1 cat
1 1 dog
2 1 fish
3 1 bird
4 2 cat
5 2 fish
6 3 bird
7 3 cat
df_B
id_B Animal_B
0 1 dog
1 2 cat
2 2 fish
3 3 dog
4 4 fish
5 4 cat
6 5 cat
df_P
id_A id_B
0 1 2
1 1 3
2 2 4
3 3 5
And I would like to get an additional column to df_P that tells the number of Animals shared between id_A and id_B. What I'm doing is:
df_P["n_common"] = np.nan
for i in df_P.index.tolist():
id_A = df_P["id_A"][i]
id_B = df_P["id_B"][i]
df_P.iloc[i,df_P.columns.get_loc('n_common')] = len(set(df_A['Animal_A'][df_A['id_A']==id_A]).intersection(df_B['Animal_B'][df_B['id_B']==id_B]))
The result being:
df_P
id_A id_B n_common
0 1 2 2.0
1 1 3 1.0
2 2 4 2.0
3 3 5 1.0
Is there a faster, more pythonic, way to do this? Is there a way to avoid the for loop?
Not sure if it is faster or more pythonic, but it avoids the for loop :)
import pandas as pd
df_A = pd.DataFrame( {'id_A': [1, 1, 1, 1, 2, 2, 3, 3],
'Animal_A': ['cat','dog','fish','bird','cat','fish','bird','cat' ]})
df_B = pd.DataFrame( {'id_B': [1, 2, 2, 3, 4, 4, 5],
'Animal_B': ['dog','cat','fish','dog','fish','cat','cat' ]})
df_P = pd.DataFrame( {'id_A': [1, 1, 2, 3],
'id_B': [2, 3, 4, 5]})
df = pd.merge(df_A, df_P, on='id_A')
df = pd.merge(df_B, df, on='id_B')
df = df[df['Animal_A'] == df['Animal_B']].groupby(['id_A', 'id_B'])['Animal_A'].count().reset_index()
df.rename({'Animal_A': 'n_common'},inplace=True,axis=1)
您可以尝试以下方法:
df_A.merge(df_B, left_on = ['Animal_A'], right_on = ['Animal_B'] ).groupby(['id_A' ,'id_B']).count().reset_index().merge(df_P).drop('Animal_B', axis = 1).rename(columns = {'Animal_A': 'count'})
The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.