Replace specific column values with another dataframe column value using Pandas
Replace values in a dataframe with values from another dataframe when common value is found in specific column
I am trying to replace hours in df with hours from replacements for the project IDs that exist in both dataframes:
import pandas as pd

df = pd.DataFrame({
    'project_ids': [1, 2, 3, 4, 5],
    'hours': [111, 222, 333, 444, 555],
    'else': ['a', 'b', 'c', 'd', 'e']
})

replacements = pd.DataFrame({
    'project_ids': [2, 5, 3],
    'hours': [666, 999, 1000],
})

for project in replacements['project_ids']:
    df.loc[df['project_ids'] == project, 'hours'] = replacements.loc[replacements['project_ids'] == project, 'hours']

print(df)
However, only project ID 3 gets the correct assignment (1000), while projects 2 and 5 both get NaN:

   project_ids   hours  else
0            1   111.0     a
1            2     NaN     b
2            3  1000.0     c
3            4   444.0     d
4            5     NaN     e
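As far as I can tell, the NaN values come from index alignment: the right-hand side of the assignment is a Series that keeps its row label from replacements, and pandas aligns it with the labels of the selected rows in df before assigning. Only project 3 happens to sit at label 2 in both frames. The sketch below shows the mismatch and one way to keep the loop by assigning plain scalars; the names rhs and fixed are mine, not from the question.

```python
import pandas as pd

df = pd.DataFrame({
    'project_ids': [1, 2, 3, 4, 5],
    'hours': [111, 222, 333, 444, 555],
    'else': ['a', 'b', 'c', 'd', 'e'],
})
replacements = pd.DataFrame({
    'project_ids': [2, 5, 3],
    'hours': [666, 999, 1000],
})

# The right-hand side is a one-element Series that keeps its label from `replacements`.
rhs = replacements.loc[replacements['project_ids'] == 2, 'hours']
print(rhs.index.tolist())                          # row label of project 2 in replacements
print(df.index[df['project_ids'] == 2].tolist())   # row label of project 2 in df

# Assigning a scalar instead of a Series avoids the alignment step.
fixed = df.copy()
for project, hours in zip(replacements['project_ids'], replacements['hours']):
    fixed.loc[fixed['project_ids'] == project, 'hours'] = hours

print(fixed)
```

The loop still scans df once per replacement, so the vectorized answers are likely the better choice for larger frames.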
Use Series.map with another Series created from replacements by DataFrame.set_index:
s = replacements.set_index('project_ids')['hours']
df['hours'] = df['project_ids'].map(s).fillna(df['hours'])
print(df)
   project_ids   hours  else
0            1   111.0     a
1            2   666.0     b
2            3  1000.0     c
3            4   444.0     d
4            5   999.0     e
Another way, using DataFrame.update():
m = df.set_index('project_ids')
m.update(replacements.set_index('project_ids')['hours'])
print(m.reset_index())
   project_ids   hours  else
0            1   111.0     a
1            2   666.0     b
2            3  1000.0     c
3            4   444.0     d
4            5   999.0     e
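Two properties of update are worth knowing here: it modifies the frame in place and returns None, and it only copies non-NA values for index labels that already exist in the target. The sketch below illustrates this with a made-up replacements frame (project 7 and the NaN for project 5 are hypothetical, not from the question).

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'project_ids': [1, 2, 3, 4, 5],
    'hours': [111, 222, 333, 444, 555],
    'else': ['a', 'b', 'c', 'd', 'e'],
})

# Hypothetical input: project 7 does not exist in df, project 5 has no value.
replacements = pd.DataFrame({
    'project_ids': [2, 5, 7],
    'hours': [666, np.nan, 123],
})

m = df.set_index('project_ids')

# update() works in place; its return value is None.
result = m.update(replacements.set_index('project_ids')[['hours']])
print(result)

# Only project 2 changes: the NaN is skipped and project 7 is ignored.
print(m.reset_index())
```

So rows are never added by update; if new projects should be appended as well, a merge or concat would be needed instead.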
Another solution is to use pandas.merge and then fillna:
df_new = pd.merge(df, replacements, on='project_ids', how='left', suffixes=('_1', ''))
# Assign the result back instead of calling fillna(..., inplace=True) on a column selection
df_new['hours'] = df_new['hours'].fillna(df_new['hours_1'])
df_new = df_new.drop(columns='hours_1')
print(df_new)
   project_ids  else   hours
0            1     a   111.0
1            2     b   666.0
2            3     c  1000.0
3            4     d   444.0
4            5     e   999.0