Arithmetic operation on a groupby pandas dataframe
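I have a pandas DataFrame with 40 columns and 400,000 rows. I created a summary dataset grouped on 3 columns.
Now I need to calculate a percentage metric from two of the columns. Python raises this error: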
unsupported operand type(s) for /: 'SeriesGroupBy' and 'SeriesGroupBy'
Here is the sample code:
print sample_data
   date  part  receipt  bad_dollars  total_dollars  bad_percent
0     1   123       22           40            100          NaN
1     2   456       44           80            120          NaN
2     3   134       33           30            150          NaN
3     1   123       22           80            100          NaN
4     5   456       45           40             90          NaN
5     3   134       33           85            150          NaN
6     7   123       24           70            120          NaN
7     5   456       45           20             85          NaN
8     9   134       35           50            300          NaN
9     7   123       24          300            600          NaN
sample_data_group = sample_data.groupby(['date','part','receipt'])
sample_data_group['bad_percents']=sample_data_group['bad_dollars']/sample_data_group['total_dollars']
TypeError: unsupported operand type(s) for /: 'SeriesGroupBy' and 'SeriesGroupBy'
Please help!
You can do this by calling apply on the groupby object:
import pandas as pd
import numpy as np

cols = ['index', 'date', 'part', 'receipt', 'bad_dollars', 'total_dollars',
        'bad_percent']
sample_data = pd.DataFrame([
    [0, 1, 123, 22, 40, 100, np.nan],
    [1, 2, 456, 44, 80, 120, np.nan],
    [2, 3, 134, 33, 30, 150, np.nan],
    [3, 1, 123, 22, 80, 100, np.nan],
    [4, 5, 456, 45, 40, 90, np.nan],
    [5, 3, 134, 33, 85, 150, np.nan],
    [6, 7, 123, 24, 70, 120, np.nan],
    [7, 5, 456, 45, 20, 85, np.nan],
    [8, 9, 134, 35, 50, 300, np.nan],
    [9, 7, 123, 24, 300, 600, np.nan]],
    columns=cols).set_index('index', drop=True)

sample_data_group = sample_data.groupby(['date', 'part', 'receipt'])

# Compute the ratio within each group, then drop the date/part/receipt
# index levels from the result
xx = sample_data_group.apply(
    lambda x: x.assign(bad_percent=x.bad_dollars / x.total_dollars))\
    .reset_index(['date', 'part', 'receipt'], drop=True)
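As a possible alternative, here is a minimal sketch that avoids apply. The error in the question comes from dividing two SeriesGroupBy objects, which do not support arithmetic. The sketch shows two readings of the goal. If the ratio is wanted per row, plain column division is enough. If it is wanted per (date, part, receipt) group, the columns can be aggregated first and divided afterwards. Using sum() as the aggregation is an assumption, since the question does not say how the summary was built, and the names row_level and summary are only illustrative.

```python
import pandas as pd

# Same sample rows as in the question (bad_percent is left out and recomputed).
sample_data = pd.DataFrame({
    "date":          [1, 2, 3, 1, 5, 3, 7, 5, 9, 7],
    "part":          [123, 456, 134, 123, 456, 134, 123, 456, 134, 123],
    "receipt":       [22, 44, 33, 22, 45, 33, 24, 45, 35, 24],
    "bad_dollars":   [40, 80, 30, 80, 40, 85, 70, 20, 50, 300],
    "total_dollars": [100, 120, 150, 100, 90, 150, 120, 85, 300, 600],
})

keys = ["date", "part", "receipt"]

# Reading 1: ratio per original row. No groupby is needed for this.
row_level = sample_data.assign(
    bad_percent=sample_data["bad_dollars"] / sample_data["total_dollars"]
)

# Reading 2: ratio per group. Aggregate first (sum is an assumption),
# then divide the aggregated columns, which are ordinary Series.
summary = (
    sample_data.groupby(keys)[["bad_dollars", "total_dollars"]]
    .sum()
    .reset_index()
)
summary["bad_percent"] = summary["bad_dollars"] / summary["total_dollars"]

print(row_level)
print(summary)
```

On a frame with 400,000 rows, both versions stay vectorized, whereas apply with a lambda runs Python code once per group.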