
How to subtract columns in a pandas df based on a condition

I have a dataset that looks like the one below. In my new dataset, I want to subtract the principal column(s) and the remainder column(s) from the amount column(s).

For instance, if there are 4 amount columns, 2 principal columns and 3 remainder columns, then the first principal and first remainder columns must be subtracted from the first amount column, the 2nd principal and 2nd remainder columns from the 2nd amount column, and only the 3rd remainder column from the 3rd amount column (since there are no more principal columns). The last column, amount4, must stay as it is, as newamount4.

amount1  amount2  amount3  amount4  principal1  principal2  remainder1  remainder2  remainder3
    100      250      150      100         250         100          80         100         100
    200      200      350       25         450         100         120         100          50
    300      150      450       30         200         100         150         100         100
    250      550      550      100         100         200          50         500         200
    550      200      650      200         250         200         500         100         500
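For anyone who wants to reproduce the problem, the table above can be rebuilt as a DataFrame roughly as follows. This is only a convenience sketch; the variable name df is an assumption chosen to match the answers further down.

```python
import pandas as pd

# Sample data copied from the table in the question.
df = pd.DataFrame({
    "amount1": [100, 200, 300, 250, 550],
    "amount2": [250, 200, 150, 550, 200],
    "amount3": [150, 350, 450, 550, 650],
    "amount4": [100, 25, 30, 100, 200],
    "principal1": [250, 450, 200, 100, 250],
    "principal2": [100, 100, 100, 200, 200],
    "remainder1": [80, 120, 150, 50, 500],
    "remainder2": [100, 100, 100, 500, 100],
    "remainder3": [100, 50, 100, 200, 500],
})

print(df)
```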

My new dataset must look like this. Note that am stands for amount, pr for principal and rem for remainder.

newamount1          newamount2        newamount3    newamount4
-230(am1-pr1-rem1)  50(am2-pr2-rem2)  50(am3-rem3)  amount4
-370                0                 300           amount4
-50                 -50               350           amount4
100                 -150              350           amount4
-200                -100              150           amount4

You can use defaultdict to group the columns that share a suffix, then apply a reducing function (np.subtract.reduce) to get your output:

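from collections import defaultdict

import numpy as np
import pandas as pd

# Group the columns by their trailing digit:
# amount1, principal1 and remainder1 all land under "newamount1".
mapping = defaultdict(list)
for column in df:
    mapping[f"newamount{column[-1]}"].append(df[column])

# Subtract within each group in column order (the amount column comes first);
# the group for suffix 4 is replaced by the label "amount4".
mapping = {
    key: np.subtract.reduce(value) if "4" not in key else "amount4"
    for key, value in mapping.items()
}

pd.DataFrame(mapping)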

   newamount1  newamount2  newamount3 newamount4
0        -230          50          50    amount4
1        -370           0         300    amount4
2         -50         -50         350    amount4
3         100        -150         350    amount4
4        -200        -100         150    amount4
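Note that the output above fills newamount4 with the literal string "amount4". If the intent is instead to carry the values of amount4 over unchanged, a minimal variant of the same idea might look like the sketch below. It assumes single-digit suffixes and that each amount column appears before the matching principal and remainder columns; the names groups and result are illustrative only, and the data is just the first two rows of the question's table.

```python
from collections import defaultdict

import numpy as np
import pandas as pd

# First two rows of the sample data from the question.
df = pd.DataFrame({
    "amount1": [100, 200], "amount2": [250, 200],
    "amount3": [150, 350], "amount4": [100, 25],
    "principal1": [250, 450], "principal2": [100, 100],
    "remainder1": [80, 120], "remainder2": [100, 100],
    "remainder3": [100, 50],
})

# Collect the columns that share a trailing digit, as plain NumPy arrays.
groups = defaultdict(list)
for column in df.columns:
    groups[f"newamount{column[-1]}"].append(df[column].to_numpy())

# np.subtract.reduce computes first - second - third ... within each group.
# A group holding a single column (amount4 here) is returned unchanged.
result = pd.DataFrame(
    {key: np.subtract.reduce(arrays) for key, arrays in groups.items()},
    index=df.index,
)

print(result)
```

Because the amount4 group contains only one column, no special case is needed for it in this variant.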

You could also iterate through a groupby:

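# Group the columns (axis=1) by their trailing digit, then subtract row-wise
# within each group. Note: groupby(..., axis=1) is deprecated in recent
# pandas releases (2.1 and later).
mapping = {
    f"newamount{key}": frame.agg(np.subtract.reduce, axis=1)
    for key, frame in df.groupby(df.columns.str[-1], axis=1)
}

pd.DataFrame(mapping).assign(newamount4="amount4")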
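On pandas versions where grouping along axis=1 is deprecated or unavailable, one possible workaround is to group on the transposed frame instead. The sketch below is not the answer's method but a swapped-in technique: it sums the principal and remainder columns per suffix and subtracts that sum from the matching amount column, which gives the same arithmetic. It keeps the real amount4 values rather than the string label, assumes single-digit suffixes, and uses only the first two rows of the question's data; the names amounts, others and to_subtract are illustrative.

```python
import pandas as pd

# First two rows of the sample data from the question.
df = pd.DataFrame({
    "amount1": [100, 200], "amount2": [250, 200],
    "amount3": [150, 350], "amount4": [100, 25],
    "principal1": [250, 450], "principal2": [100, 100],
    "remainder1": [80, 120], "remainder2": [100, 100],
    "remainder3": [100, 50],
})

is_amount = df.columns.str.startswith("amount")

# Amount columns, relabelled by their trailing digit: '1', '2', '3', '4'.
amounts = df.loc[:, is_amount].rename(columns=lambda name: name[-1])

# Everything to subtract, transposed so the grouping runs over rows (axis=0).
others = df.loc[:, ~is_amount].T
to_subtract = others.groupby(others.index.str[-1].to_numpy()).sum().T

# Suffixes with no principal/remainder column (here '4') are treated as 0.
result = amounts.sub(to_subtract, fill_value=0).add_prefix("newamount")

print(result)
```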

You may use the code below and adapt it if your data goes beyond 4:

You can use the pivot_longer function from pyjanitor to reshape the data before grouping and aggregating; at the moment you have to install the latest development version from GitHub:

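# install latest dev version
# pip install git+https://github.com/ericmjl/pyjanitor.git
import janitor  # importing janitor makes pivot_longer available on DataFrames
import numpy as np
import pandas as pd

(
    # Reshape so that each trailing digit becomes a column
    df.pivot_longer(names_to=".value",
                    names_pattern=r".+(\d)$",
                    ignore_index=False)
    .fillna(0)  # missing principal/remainder entries subtract nothing
    .add_prefix("newamount")
    .groupby(level=0)  # one group per original row
    .agg(np.subtract.reduce)
    .assign(newamount4="amount4")  # edit your preferred column
)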

Sticking to functions within pandas only, we can reshape the data by stacking, before grouping and aggregating:

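# Split each name into (prefix, digit), e.g. "amount1" -> ("amount", "1");
# the split also yields a trailing empty level, which is dropped.
df.columns = df.columns.str.split(r"(\d)", expand=True).droplevel(-1)
(
    df.stack(0)  # move the prefix level into the row index
    .fillna(0)  # missing principal/remainder entries subtract nothing
    .droplevel(-1)
    .groupby(level=0)  # one group per original row
    .agg(np.subtract.reduce)
    .add_prefix("newamount")
    .assign(newamount4="amount4")
)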
