
Disaggregate pandas data frame using ratios from another data frame

I have a pandas data frame 'High' as

    segment     sales
    Milk        10
    Chocolate   30

and another data frame 'Low' as

    segment     sku     sales
    Milk        m2341   2
    Milk        m235    3
    Chocolate   c132    2
    Chocolate   c241    5
    Chocolate   c891    3

I want to use the ratios from Low to disaggregate High. So my resulting data here would be

    segment     sku     sales
    Milk        m2341   4
    Milk        m235    6
    Chocolate   c132    6
    Chocolate   c241    15
    Chocolate   c891    9
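
To make the arithmetic behind that table explicit: each SKU gets its share of the segment total in Low, applied to the segment total in High (for Milk, 2/5 and 3/5 of 10). The following is a minimal sketch that rebuilds the sample frames and computes those values directly; the names `df_high`, `df_low`, `segment_total`, `high_total` and `expected` are assumptions chosen for illustration, not part of the question.

```python
import pandas as pd

# Sample data copied from the question.
df_high = pd.DataFrame({"segment": ["Milk", "Chocolate"], "sales": [10, 30]})
df_low = pd.DataFrame(
    {
        "segment": ["Milk", "Milk", "Chocolate", "Chocolate", "Chocolate"],
        "sku": ["m2341", "m235", "c132", "c241", "c891"],
        "sales": [2, 3, 2, 5, 3],
    }
)

# Total Low sales of the segment each row belongs to (same length as df_low).
segment_total = df_low.groupby("segment")["sales"].transform("sum")

# High sales of the segment each row belongs to, looked up by segment name.
high_total = df_low["segment"].map(df_high.set_index("segment")["sales"])

# Split each High total in proportion to the Low sales.
expected = df_low["sales"] * high_total / segment_total
print(expected.tolist())
```

For the sample data this reproduces the 4, 6, 6, 15, 9 shown in the table.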

First, I would find the scale factor by which to multiply each product's sales.

# After the merge, sales_x is the Low segment total and sales_y is the High total.
df_agg = df_low[["segment", "sales"]].groupby(by=["segment"]).sum().merge(df_high, on="segment")
df_agg["scale"] = df_agg["sales_y"] / df_agg["sales_x"]

Then apply the scale to each row of Low:

df_disagg_high = df_low.merge(df_agg[["segment", "scale"]])
df_disagg_high["adjusted_sale"] = df_disagg_high["sales"] * df_disagg_high["scale"]

If needed, you can exclude extra columns.
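
As a rough end-to-end sketch of those steps, including the column cleanup, something like the following should work. It is a variant, not the exact code above: it assumes `as_index=False` in the groupby and explicit `suffixes` in the merge so the column names are easier to read, and `result` is a name introduced here for illustration.

```python
import pandas as pd

# Sample data copied from the question.
df_high = pd.DataFrame({"segment": ["Milk", "Chocolate"], "sales": [10, 30]})
df_low = pd.DataFrame(
    {
        "segment": ["Milk", "Milk", "Chocolate", "Chocolate", "Chocolate"],
        "sku": ["m2341", "m235", "c132", "c241", "c891"],
        "sales": [2, 3, 2, 5, 3],
    }
)

# Low totals per segment next to the High totals, with readable suffixes.
df_agg = (
    df_low.groupby("segment", as_index=False)["sales"]
    .sum()
    .merge(df_high, on="segment", suffixes=("_low", "_high"))
)
df_agg["scale"] = df_agg["sales_high"] / df_agg["sales_low"]

# Attach the scale to every SKU row, overwrite sales, then drop the helper column.
result = df_low.merge(df_agg[["segment", "scale"]], on="segment")
result["sales"] = result["sales"] * result["scale"]
result = result.drop(columns="scale")
print(result)
```

The scaled `sales` column comes out as floats here; whether to round or cast it depends on whether fractional sales make sense for your data.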

Try:

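# Per-segment scale = High sales / summed Low sales, merged back onto each Low row.
# Note: astype(int) truncates toward zero; it is exact here only because every
# scaled value is a whole number.
df_low["sales"] = df_low.sales.mul(
    df_low.merge(
        df_high.set_index("segment")["sales"].div(
            df_low.groupby("segment")["sales"].sum()
        ),
        on="segment",
    )["sales_y"]
).astype(int)
print(df_low)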

Prints:

     segment    sku  sales
0       Milk  m2341      4
1       Milk   m235      6
2  Chocolate   c132      6
3  Chocolate   c241     15
4  Chocolate   c891      9
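
One way to sanity-check a disaggregation like this is to sum the SKU-level values back up and compare them with High. The sketch below does that for the printed result; the frame `df_disagg` and the names `rebuilt`, `target` and `difference` are assumptions for illustration.

```python
import pandas as pd

# High totals from the question.
df_high = pd.DataFrame({"segment": ["Milk", "Chocolate"], "sales": [10, 30]})

# Disaggregated result, copied from the printed output above.
df_disagg = pd.DataFrame(
    {
        "segment": ["Milk", "Milk", "Chocolate", "Chocolate", "Chocolate"],
        "sku": ["m2341", "m235", "c132", "c241", "c891"],
        "sales": [4, 6, 6, 15, 9],
    }
)

# Sum the SKU-level values per segment and align them with High by segment name.
rebuilt = df_disagg.groupby("segment")["sales"].sum()
target = df_high.set_index("segment")["sales"]
difference = rebuilt.sub(target)  # subtraction aligns on the segment index

# Largest absolute gap between the rebuilt and the original totals.
print(difference.abs().max())
```

A non-zero gap would suggest that rounding or truncation (for example from `astype(int)`) has shifted some of the totals.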


 