pandas - How to calculate, for each group, the percentage of values in a column that are equal to or less than each value

Question

Let's say I have a data frame like this:

x = pd.DataFrame({'person':['a','b']*5 , 'rating':[1,3,4,2,4,2,3,4,5,3]})

For each person, I want to compute a "preference score" for each rating. I define the preference score of a rating r as

freq of rating where rating <=r  -  freq of rating where rating ==r

For example, person a has the ratings 1, 4, 4, 3, 5.

So for person a and rating = 4:

freq of rating where rating <=4  :  4/5 
freq of rating where rating ==4   : 2/5

So the preference score is 4/5 - 2/5 = 2/5.
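The arithmetic for person a and rating 4 can be checked with a small plain-Python sketch. The list `ratings_a` is copied by hand from the data frame in the question, and the variable names are only illustrative.

```python
from fractions import Fraction

ratings_a = [1, 4, 4, 3, 5]  # person a's ratings, copied from the data frame
r = 4

# share of ratings that are <= r, and share that are exactly r
freq_le = Fraction(sum(v <= r for v in ratings_a), len(ratings_a))  # 4/5
freq_eq = Fraction(sum(v == r for v in ratings_a), len(ratings_a))  # 2/5

pref_score = freq_le - freq_eq
print(pref_score)  # 2/5
```

Note that the difference is simply the share of ratings strictly below r.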

How can I get the preference score for every record in this data frame? Edit: maybe this makes it clearer:

   person rating    pref_score
        a   1       0.0
        a   4       0.4
        a   4       0.4
        a   3       0.2
        a   5       0.8

Answer 1

So you need something like this?

x.groupby('person').rating.apply(lambda x : (sum(x<=4)-sum(x==4))/len(x))
Out[7]: 
person
a    0.4
b    0.8
Name: rating, dtype: float64

Or with transform?

x.groupby('person').rating.transform(lambda x : (sum(x<=4)-sum(x==4))/len(x))
Out[8]: 
0    0.4
1    0.8
2    0.4
3    0.8
4    0.4
5    0.8
6    0.4
7    0.8
8    0.4
9    0.8
Name: rating, dtype: float64

Edit:

x=x.sort_values('person')
x['ref']=x.groupby('person').rating.apply(lambda y : [(sum(y<=x)-sum(y==x))/len(y) for x in y]).apply(pd.Series).stack().values
x
Out[25]: 
  person  rating  ref
0      a       1  0.0
2      a       4  0.4
4      a       4  0.4
6      a       3  0.2
8      a       5  0.8
1      b       3  0.4
3      b       2  0.0
5      b       2  0.0
7      b       4  0.8
9      b       3  0.4
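As a possible alternative to the list comprehension above, the same scores can be obtained without sorting by using group-wise ranks. This is only a sketch, not the answerer's method: it relies on `rank(method='min')` returning 1 plus the number of strictly smaller values in the group, and the column name `pref_score` is taken from the question's expected output.

```python
import pandas as pd

x = pd.DataFrame({'person': ['a', 'b'] * 5,
                  'rating': [1, 3, 4, 2, 4, 2, 3, 4, 5, 3]})

grouped = x.groupby('person')['rating']

# rank(method='min') gives ties the lowest rank, i.e. 1 + the number of
# ratings in the group that are strictly smaller than this one
strictly_below = grouped.rank(method='min') - 1

# number of ratings per person, aligned to the original rows
group_size = grouped.transform('count')

x['pref_score'] = strictly_below / group_size
print(x.sort_values('person'))
```

Because both intermediate results keep the original index, the data frame does not need to be reordered before the assignment.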

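Since you are using Python 2.7:

import numpy as np
x['map']=x.person.map(x.groupby('person').rating.apply(list))
# count the group's ratings strictly below this row's rating; float() avoids integer division on Python 2
x.apply(lambda row : sum(np.array(row['map'])<row['rating'])/float(len(row['map'])),1 )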

Answer 2

You can do the following:

>> x.groupby("person").rating.apply(lambda x: x[x <= 4].count())
person
a    4
b    5

and

>> x.groupby("person").rating.apply(lambda x: x[x == 4].count())
person
a    2
b    1
