How to split a field's value according to its percentage of the total

Question

I have transaction totals grouped by date_month, device and channel, as shown below:

date_month   device            channel  transactions
2017-01-01  desktop         AFFILIATES           413
2017-01-01   mobile         AFFILIATES           501
2017-01-01    other         AFFILIATES            22
2017-01-01   tablet         AFFILIATES           250
2017-01-01  desktop             DIRECT         13979
etc...       etc...             etc...        etc...

date_month ranges from 2017-01-01 to the current date.

What I am trying to do is split the other value of device across mobile, desktop and tablet.

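Example process:

Pivot the device 'other', with its transactions value as an extra column (other_transactions)
Get the total of transactions partitioned/grouped by date_month and channel (total_transactions)
Then divide transactions by total_transactions to get each row's share of the total (percent_total)
Multiply other_transactions by percent_total to get other_split
Add other_split to transactions to get the updated transactions field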
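To make the arithmetic concrete, here is a hand-check of the five steps for a single group (2017-01-01, AFFILIATES), using the numbers from the table above. It is only a sketch, and it assumes that total_transactions is taken over the remaining devices (with 'other' excluded), so that the shares add up to 1 and the whole 'other' count is redistributed.

```python
# Hand-check of the five steps for one group (2017-01-01, AFFILIATES).
transactions = {"desktop": 413, "mobile": 501, "tablet": 250}
other_transactions = 22  # step 1: the 'other' row pulled out as its own value

# Step 2: total over the remaining devices (assumption: 'other' is excluded).
total_transactions = sum(transactions.values())

updated = {}
for device, count in transactions.items():
    percent_total = count / total_transactions         # step 3: share of the total
    other_split = other_transactions * percent_total   # step 4: this device's part of 'other'
    updated[device] = count + other_split              # step 5: updated transactions

print(updated)
```

Because the shares sum to 1, the updated values add up to the original group total including 'other' (1164 + 22 = 1186).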

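Getting the totals and applying the simple arithmetic should not be a problem. I would do something along the lines of df['total_transactions']=df.groupby(['date_month', 'channel'])['transactions'].transform('sum') to get total_transactions. The part I am stuck on is getting the other transactions into a separate column, like this: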

date_month   device            channel  transactions  other_trans
2017-01-01  desktop         AFFILIATES           413           22
2017-01-01   mobile         AFFILIATES           501           22
2017-01-01   tablet         AFFILIATES           250           22
2017-01-01  desktop             DIRECT         13979          etc
etc...       etc...             etc...        etc...

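In the end, I want a dataframe that drops the other device from the device column and uses its transactions to increase the remaining devices' transactions in proportion to their share of the transactions for that date_month and channel.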

Answer 1

IIUC, you can first use groupby to create another dataframe holding the other rows, drop those rows from the original, and then perform a merge:

import pandas as pd

df = pd.DataFrame({'date_month': {0: '2017-01-01', 1: '2017-01-01', 2: '2017-01-01', 3: '2017-01-01', 4: '2017-01-01', 5:"2017-01-01"},
                   'device': {0: 'desktop', 1: 'mobile', 2: 'other', 3: 'tablet', 4: 'desktop', 5:"other"},
                   'channel': {0: 'AFFILIATES', 1: 'AFFILIATES', 2: 'AFFILIATES', 3: 'AFFILIATES', 4: 'DIRECT', 5: 'DIRECT'},
                   'transactions': {0: 413, 1: 501, 2: 22, 3: 250, 4: 13979, 5: 234}})

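# Pull the 'other' rows into their own frame, keeping only the join keys and the count
other = df.groupby("device").get_group("other")[["date_month","channel","transactions"]]

# Drop the 'other' rows from the original frame (exact match rather than substring match)
df = df.drop(df[df["device"] == "other"].index)

# Attach each group's 'other' count as a new column, transactions_other
df = df.merge(other, on=["date_month","channel"], how="left", suffixes=("","_other"))

print(df)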

Result:

   date_month   device     channel  transactions  transactions_other
0  2017-01-01  desktop  AFFILIATES           413                  22
1  2017-01-01   mobile  AFFILIATES           501                  22
2  2017-01-01   tablet  AFFILIATES           250                  22
3  2017-01-01  desktop      DIRECT         13979                 234
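From the merged frame, the remaining steps in the question could be finished roughly as follows. This is a sketch rather than part of the original answer. It assumes at most one other row per date_month and channel, computes the shares over the remaining devices only, and leaves the results as floats. The names keys, is_other, result and total are illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    "date_month": ["2017-01-01"] * 6,
    "device": ["desktop", "mobile", "other", "tablet", "desktop", "other"],
    "channel": ["AFFILIATES"] * 4 + ["DIRECT"] * 2,
    "transactions": [413, 501, 22, 250, 13979, 234],
})

keys = ["date_month", "channel"]
is_other = df["device"] == "other"

# Same idea as the answer: split off the 'other' rows and merge them back as a column.
other = df.loc[is_other, keys + ["transactions"]]
result = df.loc[~is_other].merge(other, on=keys, how="left", suffixes=("", "_other"))

# Groups with no 'other' row get NaN from the left merge; treat that as nothing to split.
result["transactions_other"] = result["transactions_other"].fillna(0)

# Share of each remaining device within its date_month/channel group.
total = result.groupby(keys)["transactions"].transform("sum")
result["percent_total"] = result["transactions"] / total

# Add each device's proportional part of the 'other' count.
result["transactions"] = (
    result["transactions"] + result["transactions_other"] * result["percent_total"]
)

result = result.drop(columns=["transactions_other", "percent_total"])
print(result)
```

Two caveats: a group whose remaining devices sum to zero would produce NaN shares, and a group that contains only an other row disappears entirely, so those cases need a decision of their own. If whole numbers are required, rounding has to be applied afterwards and may shift the group totals slightly.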

Solution 1
1 Accepted 2019-10-02 08:52:04
