Extract data from one dataframe and add it to a second dataframe
I am a Python beginner. I have two data frames, df1 and df2. Some fruits appear in both (apple, grape), so I want to take the USERS info from df2 and add it to df1, but only for those duplicates. At the end I should have a new df1 in which the apple and grape rows carry the new data (I know I will have to create a new column 'USERS' in df1). Any help is appreciated.
import pandas as pd
df1 = pd.DataFrame({'FRUIT':['banana','apple', 'grape'], 'COLOR':['yellow', 'red', 'green'], 'CAL':[100, 80, 100]})
df2 = pd.DataFrame({'FRUIT':['kiwi','melon', 'apple', 'grape', 'pineapple'], 'COLOR':['green', 'orange', 'red',\
'blue','yellow'], 'CAL':[60, 70, 80, 50, 80], 'USERS':[4, 7, 12, 20, 3]})
df = pd.concat([df1,df2], keys=['df1','df2'], sort=False)
col_val_to_add = df[df.duplicated(['FRUIT'])]
for i in df:
    for j in col_val_to_add:
        if df.loc[['FRUIT',i]]==df.loc[['FRUIT',j]]:
            df.loc[['USERS',j]] = col_val_to_add.loc[['USERS',i]]
print(df)
What you are hoping to do is called a join (specifically a LEFT join). In pandas you can do this with merge:
new_df = df1.merge(df2[['FRUIT', 'USERS']], on="FRUIT", how="left")
Hope that helps.
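To make the effect of the left join concrete, here is a minimal, self-contained sketch that rebuilds the two frames from the question and runs the same merge. The name left_df is only an illustrative choice, not something from the original post. A left join keeps every row of df1, so banana, which has no match in df2, ends up with NaN in USERS.

```python
import pandas as pd

# Frames rebuilt from the question.
df1 = pd.DataFrame({'FRUIT': ['banana', 'apple', 'grape'],
                    'COLOR': ['yellow', 'red', 'green'],
                    'CAL': [100, 80, 100]})
df2 = pd.DataFrame({'FRUIT': ['kiwi', 'melon', 'apple', 'grape', 'pineapple'],
                    'COLOR': ['green', 'orange', 'red', 'blue', 'yellow'],
                    'CAL': [60, 70, 80, 50, 80],
                    'USERS': [4, 7, 12, 20, 3]})

# Only FRUIT (the join key) and USERS are taken from df2, so the
# COLOR and CAL values of df1 are left untouched.
left_df = df1.merge(df2[['FRUIT', 'USERS']], on='FRUIT', how='left')

# All three rows of df1 survive; banana has no USERS value (NaN),
# apple gets 12 and grape gets 20.
print(left_df)
```

Because of the missing value, pandas stores USERS as floats in this result. If the NaN is unwanted, something like left_df['USERS'].fillna(0) is one possible follow-up, depending on what a missing user count should mean.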
EDIT: Sorry, I just re-read your question. If you want only the FRUIT values that are in both dfs, switch the left join to an inner join like this:
new_df = df1.merge(df2[['FRUIT', 'USERS']], on="FRUIT", how="inner")
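As a rough check of what the inner join returns, the sketch below runs it on the frames from the question. The name inner_df is an illustrative assumption. Only fruits present in both frames remain, so the result has just the apple and grape rows.

```python
import pandas as pd

# Frames rebuilt from the question.
df1 = pd.DataFrame({'FRUIT': ['banana', 'apple', 'grape'],
                    'COLOR': ['yellow', 'red', 'green'],
                    'CAL': [100, 80, 100]})
df2 = pd.DataFrame({'FRUIT': ['kiwi', 'melon', 'apple', 'grape', 'pineapple'],
                    'COLOR': ['green', 'orange', 'red', 'blue', 'yellow'],
                    'CAL': [60, 70, 80, 50, 80],
                    'USERS': [4, 7, 12, 20, 3]})

# An inner join drops rows of df1 that have no matching FRUIT in df2.
inner_df = df1.merge(df2[['FRUIT', 'USERS']], on='FRUIT', how='inner')

# Two rows remain: apple (USERS 12) and grape (USERS 20).
print(inner_df)
```

Note that the match is made on FRUIT alone. Grape is green with 100 CAL in df1 but blue with 50 CAL in df2, and it still matches; the result keeps df1's COLOR and CAL. If a row should count as a duplicate only when all three columns agree, passing on=['FRUIT', 'COLOR', 'CAL'] (and selecting those columns plus USERS from df2) would be one way to express that.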