Pandas - Add values from series to dataframe column based on index of series matching some value in dataframe

Data

import pandas as pd

pb = {"mark_up_id":{"0":"123","1":"456","2":"789","3":"111","4":"222"},"mark_up":{"0":1.2987,"1":1.5625,"2":1.3698,"3":1.3333,"4":1.4589}}
data = {"id":{"0":"K69","1":"K70","2":"K71","3":"K72","4":"K73","5":"K74","6":"K75","7":"K79","8":"K86","9":"K100"},"cost":{"0":29.74,"1":9.42,"2":9.42,"3":9.42,"4":9.48,"5":9.48,"6":24.36,"7":5.16,"8":9.8,"9":3.28},"mark_up_id":{"0":"123","1":"456","2":"789","3":"111","4":"222","5":"333","6":"444","7":"555","8":"666","9":"777"}}
pb = pd.DataFrame(data=pb).set_index('mark_up_id')
df = pd.DataFrame(data=data)

Expected Output

test = df.join(pb, on='mark_up_id', how='left')
test['cost'].update(test['cost'] + test['mark_up'])
test.drop('mark_up',axis=1,inplace=True)

Or:

df['cost'].update(df['mark_up_id'].map(pb['mark_up']) + df['cost'])

Question

Is there a function that does the above, or is this the best way to go about this type of operation?

I would use the second solution you propose, or better, this:

df['cost']=(df['mark_up_id'].map(pb['mark_up']) + df['cost']).fillna(df['cost'])

I think update can be awkward to work with because it modifies the Series in place and returns nothing (None).
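To illustrate that point, here is a minimal sketch on two standalone Series. The names `cost` and `marked_up` and their values are made up for the illustration; only `Series.update` itself comes from the discussion above.

```python
import pandas as pd

# Hypothetical values for illustration only
cost = pd.Series([29.74, 9.48])
marked_up = pd.Series([31.0387, float("nan")])

# update() changes `cost` in place and skips NaN values in `marked_up`
result = cost.update(marked_up)

print(result)  # update() has no return value
print(cost)    # first entry replaced, second entry left untouched
```

Because the call returns None, it cannot be used on the right-hand side of an assignment or inside a method chain.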

Series.fillna is more flexible: it returns a new Series, so it can be used inside a larger expression or assignment.
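The role of fillna here may be easier to see step by step. The sketch below uses a three-row subset of the question's data; the intermediate names `looked_up`, `summed`, `new_cost` and `alt` are introduced only for this illustration, and the `fillna(0)` variant at the end is my own suggestion rather than something from the answer.

```python
import pandas as pd

# Small subset of the question's data; "333" has no entry in pb
pb = pd.DataFrame(
    {"mark_up_id": ["123", "456"], "mark_up": [1.2987, 1.5625]}
).set_index("mark_up_id")
df = pd.DataFrame(
    {
        "id": ["K69", "K70", "K74"],
        "cost": [29.74, 9.42, 9.48],
        "mark_up_id": ["123", "456", "333"],
    }
)

# Step 1: look up each row's mark-up; unmatched ids become NaN
looked_up = df["mark_up_id"].map(pb["mark_up"])

# Step 2: NaN propagates through the addition
summed = looked_up + df["cost"]

# Step 3: fall back to the original cost where no mark-up was found
new_cost = summed.fillna(df["cost"])

# Alternative (an assumption, not from the answer): treat a missing mark-up as 0
alt = df["cost"] + looked_up.fillna(0)

print(new_cost)
```

Both forms leave rows without a matching mark_up_id at their original cost.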

We can also use DataFrame.assign, which returns a new DataFrame, so we can keep chaining operations on the result.

df.assign(cost=(df['mark_up_id'].map(pb['mark_up']) + df['cost']).fillna(df['cost']))
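As a rough sketch of what chaining on the result of assign could look like, the example below sorts the returned frame by the new cost. The three-row data is a subset of the question's data, and the `sort_values` step is only a placeholder for whatever comes next in a real pipeline. Note that assign overwrites a column only when the keyword matches its name exactly, so `cost=` replaces the existing column while a differently cased name would add a new one.

```python
import pandas as pd

# Small subset of the question's data; "333" has no entry in pb
pb = pd.DataFrame(
    {"mark_up_id": ["123", "456"], "mark_up": [1.2987, 1.5625]}
).set_index("mark_up_id")
df = pd.DataFrame(
    {
        "id": ["K69", "K70", "K74"],
        "cost": [29.74, 9.42, 9.48],
        "mark_up_id": ["123", "456", "333"],
    }
)

# assign returns a new DataFrame, so further steps can be chained on it;
# the lambda receives the frame being assigned to
result = (
    df.assign(
        cost=lambda d: (d["mark_up_id"].map(pb["mark_up"]) + d["cost"]).fillna(d["cost"])
    )
    .sort_values("cost", ascending=False)  # placeholder follow-up step
)

print(result)
print(df)  # the original frame is left unchanged
```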

Time comparison with the join method

%%timeit
df['cost']=(df['mark_up_id'].map(pb['mark_up']) + df['cost']).fillna(df['cost'])
#945 µs ± 46 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)

%%timeit
test = df.join(pb, on='mark_up_id', how='left')
test['cost'].update(test['cost'] + test['mark_up'])
test.drop('mark_up',axis=1,inplace=True)
#3.59 ms ± 137 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)

The join approach is noticeably slower. For comparison, the update version:

%%timeit
df['cost'].update(df['mark_up_id'].map(pb['mark_up']) + df['cost'])
#985 µs ± 32.8 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)

Finally, I recommend you see: Understanding inplace and When I should use apply
