
How to split field value based on percentage of total

I have the sum of transactions grouped by date_month, device and channel, like so:

date_month   device            channel  transactions
2017-01-01  desktop         AFFILIATES           413
2017-01-01   mobile         AFFILIATES           501
2017-01-01    other         AFFILIATES            22
2017-01-01   tablet         AFFILIATES           250
2017-01-01  desktop             DIRECT         13979
etc...       etc...             etc...        etc...

The date_month range is from 2017-01-01 to the current date.

What I'm trying to do is split the device column's other value across mobile, desktop and tablet.

Example process:

  • Pivot device 'other' with its transactions value as an extra column (other_transactions)
  • Take a total sum of transactions partitioned/grouped by date_month and channel (total_transactions)
  • Then divide transactions by total_transactions to get the percent of total (percent_total)
  • Multiply other_transactions and percent_total to get other_split
  • Add other_split to transactions to get an updated transactions field
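To make the arithmetic concrete, here is a minimal sketch of those steps for a single group (2017-01-01 / AFFILIATES), using the numbers from the sample table. It assumes that total_transactions excludes the other row, so that the shares sum to 1 and all 22 other transactions are redistributed; if other were included in the total, part of it would be left unallocated. The variable name updated is only illustrative.

```python
# Transactions for one (date_month, channel) group: 2017-01-01 / AFFILIATES
transactions = {"desktop": 413, "mobile": 501, "tablet": 250}
other_transactions = 22

# Assumption: the total excludes 'other', so the shares sum to 1
total_transactions = sum(transactions.values())  # 413 + 501 + 250 = 1164

updated = {}
for device, count in transactions.items():
    percent_total = count / total_transactions        # device's share of the group
    other_split = other_transactions * percent_total  # its portion of 'other'
    updated[device] = count + other_split

for device, value in updated.items():
    print(f"{device}: {value:.2f}")
```

Under that assumption the updated values add back up to the original group total (1164 + 22 = 1186).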

Getting the totals and applying simple math operations shouldn't be a problem. I would do something along the lines of df['total_transactions'] = df.groupby(['date_month', 'channel'])['transactions'].transform('sum') to get total_transactions, but the issue I'm having is getting the other transactions into a separate column, like so:

date_month   device            channel  transactions  other_trans
2017-01-01  desktop         AFFILIATES           413           22
2017-01-01   mobile         AFFILIATES           501           22
2017-01-01   tablet         AFFILIATES           250           22
2017-01-01  desktop             DIRECT         13979          etc
etc...       etc...             etc...        etc...
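One possible way to build a column like other_trans without a merge is to blank out the non-other values and broadcast the per-group sum with transform. This is only a sketch on the five sample rows shown above; the names is_other and result are illustrative, and groups with no other row (DIRECT in this cut-down sample) get 0 here.

```python
import pandas as pd

df = pd.DataFrame({
    "date_month": ["2017-01-01"] * 5,
    "device": ["desktop", "mobile", "other", "tablet", "desktop"],
    "channel": ["AFFILIATES"] * 4 + ["DIRECT"],
    "transactions": [413, 501, 22, 250, 13979],
})

is_other = df["device"].eq("other")

# Keep only the 'other' values (0 elsewhere), then broadcast each
# (date_month, channel) group's sum back onto every row of that group.
df["other_trans"] = (
    df["transactions"]
    .where(is_other, 0)
    .groupby([df["date_month"], df["channel"]])
    .transform("sum")
)

# Drop the 'other' rows now that their value lives in its own column.
result = df[~is_other].reset_index(drop=True)
print(result)
```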

In the end, I would like to have a data frame that removes the other devices from the device column and uses their transactions to increase the remaining devices' transactions, based on each device's share of transactions for that date_month and channel.

IIUC, you can first create another dataframe using groupby, drop the rows with other, and then perform a merge:

import pandas as pd

df = pd.DataFrame({'date_month': {0: '2017-01-01', 1: '2017-01-01', 2: '2017-01-01', 3: '2017-01-01', 4: '2017-01-01', 5:"2017-01-01"},
                   'device': {0: 'desktop', 1: 'mobile', 2: 'other', 3: 'tablet', 4: 'desktop', 5:"other"},
                   'channel': {0: 'AFFILIATES', 1: 'AFFILIATES', 2: 'AFFILIATES', 3: 'AFFILIATES', 4: 'DIRECT', 5: 'DIRECT'},
                   'transactions': {0: 413, 1: 501, 2: 22, 3: 250, 4: 13979, 5: 234}})

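# Pull the 'other' rows into their own frame, keeping the merge keys and the value
other = df.groupby("device").get_group("other")[["date_month","channel","transactions"]]

# Drop the 'other' rows from the main frame
df = df.drop(df[df["device"].str.contains("other")].index)

# Attach each group's 'other' value as transactions_other
df = df.merge(other, on=["date_month","channel"], how="left", suffixes=("","_other"))

print(df)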

Result:

   date_month   device     channel  transactions  transactions_other
0  2017-01-01  desktop  AFFILIATES           413                  22
1  2017-01-01   mobile  AFFILIATES           501                  22
2  2017-01-01   tablet  AFFILIATES           250                  22
3  2017-01-01  desktop      DIRECT         13979                 234
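From a frame shaped like this result, the remaining steps from the question could be sketched as follows. The frame is rebuilt by hand here so the snippet runs on its own; the column names percent_total, other_split and transactions_updated are taken from or modelled on the question, not from the answer above. It assumes the share is computed over the remaining (non-other) devices, and that groups without an other row should contribute 0.

```python
import pandas as pd

# The merged frame from the result above, rebuilt by hand
merged = pd.DataFrame({
    "date_month": ["2017-01-01"] * 4,
    "device": ["desktop", "mobile", "tablet", "desktop"],
    "channel": ["AFFILIATES"] * 3 + ["DIRECT"],
    "transactions": [413, 501, 250, 13979],
    "transactions_other": [22, 22, 22, 234],
})

# Total per (date_month, channel), broadcast back onto each row
total = merged.groupby(["date_month", "channel"])["transactions"].transform("sum")

# Each device's share of its group, and its portion of the 'other' transactions
merged["percent_total"] = merged["transactions"] / total
merged["other_split"] = merged["transactions_other"].fillna(0) * merged["percent_total"]

# Updated transactions with the 'other' share folded in
merged["transactions_updated"] = merged["transactions"] + merged["other_split"]
print(merged)
```

The fillna(0) only matters after a left merge where some date_month/channel pair has no other row; without it those rows would end up as NaN.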
