
Pandas - Add values from series to dataframe column based on index of series matching some value in dataframe

Data

import pandas as pd

pb = {"mark_up_id":{"0":"123","1":"456","2":"789","3":"111","4":"222"},"mark_up":{"0":1.2987,"1":1.5625,"2":1.3698,"3":1.3333,"4":1.4589}}
data = {"id":{"0":"K69","1":"K70","2":"K71","3":"K72","4":"K73","5":"K74","6":"K75","7":"K79","8":"K86","9":"K100"},"cost":{"0":29.74,"1":9.42,"2":9.42,"3":9.42,"4":9.48,"5":9.48,"6":24.36,"7":5.16,"8":9.8,"9":3.28},"mark_up_id":{"0":"123","1":"456","2":"789","3":"111","4":"222","5":"333","6":"444","7":"555","8":"666","9":"777"}}
pb = pd.DataFrame(data=pb).set_index('mark_up_id')
df = pd.DataFrame(data=data)

Expected output

test = df.join(pb, on='mark_up_id', how='left')
test['cost'].update(test['cost'] + test['mark_up'])
test.drop('mark_up',axis=1,inplace=True)

Or..

df['cost'].update(df['mark_up_id'].map(pb['mark_up']) + df['cost'])

Is there a built-in function that does the above, or is this the best way to go about this type of operation?

I would use the second solution you proposed, or better yet this one:

df['cost']=(df['mark_up_id'].map(pb['mark_up']) + df['cost']).fillna(df['cost'])
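To make the two steps in that one-liner visible, here is a minimal sketch using the question's data. The lists replace the nested dicts only for brevity (so the row index is 0..9 rather than the strings "0".."9"), and the names looked_up and new_cost are mine, not from the question.

```python
import pandas as pd

# Same values as in the question, written as plain lists for brevity.
pb = pd.DataFrame({
    "mark_up_id": ["123", "456", "789", "111", "222"],
    "mark_up": [1.2987, 1.5625, 1.3698, 1.3333, 1.4589],
}).set_index("mark_up_id")

df = pd.DataFrame({
    "id": ["K69", "K70", "K71", "K72", "K73", "K74", "K75", "K79", "K86", "K100"],
    "cost": [29.74, 9.42, 9.42, 9.42, 9.48, 9.48, 24.36, 5.16, 9.8, 3.28],
    "mark_up_id": ["123", "456", "789", "111", "222", "333", "444", "555", "666", "777"],
})

# Step 1: map() looks each mark_up_id up in pb's index.
# Ids that are not in pb (333, 444, ...) come back as NaN.
looked_up = df["mark_up_id"].map(pb["mark_up"])

# Step 2: NaN + cost is NaN, so fillna() puts the original cost back
# for the rows that had no matching mark-up.
new_cost = (looked_up + df["cost"]).fillna(df["cost"])

print(pd.DataFrame({"looked_up": looked_up, "new_cost": new_cost}))
```

Rows with a match get cost + mark_up, and rows without one keep their original cost, which is the same result update produces by skipping NaN values.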

I think update can be awkward to use because it does not return anything; it modifies the Series in place.

Series.fillna is arguably more flexible.
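A small sketch of the difference, on toy Series that are not from the question (s, other, result and filled are illustrative names):

```python
import pandas as pd

s = pd.Series([1.0, 2.0, 3.0])
other = pd.Series([10.0, float("nan"), 30.0])

# update() changes s in place, skipping the NaN in `other`,
# and gives back None, so it cannot be chained or assigned.
result = s.update(other)
print(result)        # None
print(s.tolist())    # s itself has been modified

# fillna() returns a new Series, so the result can be assigned,
# passed on, or used in a method chain.
original = pd.Series([1.0, 2.0, 3.0])
filled = other.fillna(original)
print(filled.tolist())
```

Both routes end with the same values here. As far as I know, calling update on a column selected as df['cost'] is also treated as chained assignment when pandas Copy-on-Write is enabled, which is another reason to prefer plain assignment.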

We can also use DataFrame.assign, which returns a new DataFrame, so we can keep working on the result of the assignment.

df.assign(cost=(df['mark_up_id'].map(pb['mark_up']) + df['cost']).fillna(df['cost']))
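As a sketch of what "continue working" can look like, here is a short chain on a three-row subset of the question's data. The lambda form and the trailing drop are my own illustration, not part of the original answer. Note that the keyword has to match the existing column name (cost) to overwrite it; any other name adds a new column instead.

```python
import pandas as pd

# Subset of the question's data: two ids with a mark-up, one without.
pb = pd.Series({"123": 1.2987, "456": 1.5625}, name="mark_up")
df = pd.DataFrame({
    "id": ["K69", "K70", "K74"],
    "cost": [29.74, 9.42, 9.48],
    "mark_up_id": ["123", "456", "333"],
})

out = (
    df
    # assign() returns a new DataFrame; the lambda receives that frame,
    # so the expression does not depend on the outer variable name.
    .assign(cost=lambda d: (d["mark_up_id"].map(pb) + d["cost"]).fillna(d["cost"]))
    # Because a DataFrame comes back, further steps can follow directly.
    .drop(columns="mark_up_id")
)

print(out)
print(df["cost"].tolist())  # the original frame is left untouched
```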

Time comparison with the join method

%%timeit
df['cost']=(df['mark_up_id'].map(pb['mark_up']) + df['cost']).fillna(df['cost'])
#945 µs ± 46 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)

%%timeit
test = df.join(pb, on='mark_up_id', how='left')
test['cost'].update(test['cost'] + test['mark_up'])
test.drop('mark_up',axis=1,inplace=True)
#3.59 ms ± 137 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)

Slower..


%%timeit
df['cost'].update(df['mark_up_id'].map(pb['mark_up']) + df['cost'])
#985 µs ± 32.8 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
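The %%timeit cells above only run in IPython or Jupyter. A rough way to repeat the comparison in a plain script is the standard-library timeit module, sketched below. The helper names via_map and via_join are hypothetical, and the join variant uses fillna instead of update so that both helpers return a Series; absolute numbers will differ by machine and pandas version.

```python
import timeit

import pandas as pd

pb = pd.DataFrame({
    "mark_up_id": ["123", "456", "789", "111", "222"],
    "mark_up": [1.2987, 1.5625, 1.3698, 1.3333, 1.4589],
}).set_index("mark_up_id")

df = pd.DataFrame({
    "id": ["K69", "K70", "K71", "K72", "K73", "K74", "K75", "K79", "K86", "K100"],
    "cost": [29.74, 9.42, 9.42, 9.42, 9.48, 9.48, 24.36, 5.16, 9.8, 3.28],
    "mark_up_id": ["123", "456", "789", "111", "222", "333", "444", "555", "666", "777"],
})


def via_map(frame):
    """map + fillna: look the mark-up up per row, keep cost where none exists."""
    return (frame["mark_up_id"].map(pb["mark_up"]) + frame["cost"]).fillna(frame["cost"])


def via_join(frame):
    """join variant: bring mark_up in as a column, then add it the same way."""
    joined = frame.join(pb, on="mark_up_id", how="left")
    return (joined["cost"] + joined["mark_up"]).fillna(joined["cost"])


n = 100  # calls per timing run
# Take the best of three runs and convert to seconds per call.
t_map = min(timeit.repeat(lambda: via_map(df), number=n, repeat=3)) / n
t_join = min(timeit.repeat(lambda: via_join(df), number=n, repeat=3)) / n

print(f"map + fillna : {t_map * 1e6:.0f} us per call")
print(f"join + fillna: {t_join * 1e6:.0f} us per call")
```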

Finally, I recommend you take a look at: Understanding inplace, and When I should use apply

